Examples of Alignment Formats



The amino acid sequence of MJ0577 was used as a query against the E. coli entries of the nr database, with the E value threshold set to 0.01. Four sequences were returned, as listed below. The same results are then displayed using three of the available formatting options: pairwise, flat query-anchored with identities, and flat query-anchored without identities.
                                                                  Score     E
Sequences producing significant alignments:                        (bits)  Value

gi|2507517|sp|P39177|UP12_ECOLI  UNKNOWN PROTEIN FROM 2D-PAG...    55  6e-09
gi|2507516|sp|P37903|UP03_ECOLI  UNKNOWN PROTEIN 2D_000B3L F...    41  1e-04
gi|7429293|pir||C64888  conserved hypothetical protein b1376...    41  1e-04
gi|2507514|sp|P03807|YDAA_ECOLI  35.6 KDA PROTEIN IN TPX-FNR...    38  0.002

Pairwise Alignments

           

 gi|4062229|dbj|BAA35246.1| (D90702) Unknown protein from 2D-page (spots pr25/lm16/2d_000lr3) .
           [Escherichia coli]
          Length = 142
                    
 Score = 55.4 bits (131), Expect = 6e-09
 Identities = 44/157 (28%), Positives = 79/157 (50%), Gaps = 17/157 (10%)

Query: 4   MYKKILYPTDFSETAEIALKHVKAFKTLKAEEVI--LLHVIDEREIKKRDIFSLLLGVAG 61
           MYK I+ P D  E  E++ K V+  + L  ++ +  LLHV+           S  L +  
Sbjct: 1   MYKTIIMPVDVFEM-ELSDKAVRHAEFLAQDDGVIHLLHVLPG---------SASLSLHR 50

Query: 62  LNKSVEEFENELKNKLTEEAKNKMENIKKELEDVGFKVKDIIVVGIPHEEIVKIAEDEGV 121
               V  FE  L++    EA+ +++ +         ++K  +  G   +E+ ++AE+ G 
Sbjct: 51  FAADVRRFEEHLQH----EAQERLQTMVSHFTIDPSRIKQHVRFGSVRDEVNELAEELGA 106

Query: 122 DIIIMGSHGKTNLKEILLGSVTENVIKKSNKPVLVVK 158
           D++++GS    ++   LLGS   +VI+ +N PVLVV+
Sbjct: 107 DVVVIGSR-NPSISTHLLGSNASSVIRHANLPVLVVR 142
sp|P37903|UP03_ECOLI UNKNOWN PROTEIN 2D_000B3L FROM 2D-PAGE
 dbj|BAA14980.1| (D90775) Unknown protein from 2D-PAGE (SPOT 2D_000B3L) (fragment).
           [Escherichia coli]
          Length = 144

 Score = 41.0 bits (94), Expect = 1e-04
 Identities = 43/157 (27%), Positives = 72/157 (45%), Gaps = 15/157 (9%)

Query: 4   MYKKILYPTDFS--ETAEIALKHVKAFKTLKAEEVILLHVIDEREIKKRDIFSLLLGVAG 61
           M + IL P D S  E  +  + HV+    +   EV  L VI          +   LG+A 
Sbjct: 1   MNRTILVPIDISDSELTQRVISHVEEEAKIDDAEVHFLTVIPSLP------YYASLGLA- 53

Query: 62  LNKSVEEFENELKNKLTEEAKNKMENIKKELEDVGFKVKDIIVVGIPHEEIVKIAEDEGV 121
                   E    + L  EAK+++E I K+ +    +V   +  G P + I+++A+    
Sbjct: 54  -----YSAELPAMDDLKAEAKSQLEEIIKKFKLPTDRVHVHVEEGSPKDRILELAKKIPA 108

Query: 122 DIIIMGSHGKTNLKEILLGSVTENVIKKSNKPVLVVK 158
            +II+ SH + ++   LLGS    V++ +   VLVV+
Sbjct: 109 HMIIIASH-RPDITTYLLGSNAAAVVRHAECSVLVVR 144
pir||C64888 conserved hypothetical protein b1376 - Escherichia coli
 gb|AAC74458.1| (AE000234) putative filament protein [Escherichia coli]
          Length = 168

 Score = 41.0 bits (94), Expect = 1e-04
 Identities = 43/157 (27%), Positives = 72/157 (45%), Gaps = 15/157 (9%)

Query: 4   MYKKILYPTDFS--ETAEIALKHVKAFKTLKAEEVILLHVIDEREIKKRDIFSLLLGVAG 61
           M + IL P D S  E  +  + HV+    +   EV  L VI          +   LG+A 
Sbjct: 25  MNRTILVPIDISDSELTQRVISHVEEEAKIDDAEVHFLTVIPSLP------YYASLGLA- 77

Query: 62  LNKSVEEFENELKNKLTEEAKNKMENIKKELEDVGFKVKDIIVVGIPHEEIVKIAEDEGV 121
                   E    + L  EAK+++E I K+ +    +V   +  G P + I+++A+    
Sbjct: 78  -----YSAELPAMDDLKAEAKSQLEEIIKKFKLPTDRVHVHVEEGSPKDRILELAKKIPA 132

Query: 122 DIIIMGSHGKTNLKEILLGSVTENVIKKSNKPVLVVK 158
            +II+ SH + ++   LLGS    V++ +   VLVV+
Sbjct: 133 HMIIIASH-RPDITTYLLGSNAAAVVRHAECSVLVVR 168
gi|2507514|sp|P03807|YDAA_ECOLI 35.6 KDA PROTEIN IN TPX-FNR INTERGENIC REGION
 gi|7429180|pir||QQECX membrane protein ydaA - Escherichia coli
 gi|1742190|dbj|BAA14926.1| (D90771) ORF_ID:o261#5~similar to [SwissProt Accession Number
           P44195] [Escherichia coli]
 gi|1742201|dbj|BAA14936.1| (D90772) ORF_ID:o261#5~similar to [SwissProt Accession Number
           P44195] [Escherichia coli]
 gi|1787594|gb|AAC74415.1| (AE000231) orf, hypothetical protein [Escherichia coli]
          Length = 316

 Score = 37.5 bits (85), Expect = 0.002
 Identities = 18/53 (33%), Positives = 29/53 (53%)

Query: 106 GIPHEEIVKIAEDEGVDIIIMGSHGKTNLKEILLGSVTENVIKKSNKPVLVVK 158
           G+P E I  +AE     I+++G+ G+T +    LG+  E VI      +LV+K
Sbjct: 248 GLPEEVIPDLAEHLQAGIVVLGTVGRTGISAAFLGNTAEQVIDHLRCDLLVIK 300
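
The Identities and Gaps figures reported with each pairwise alignment can be recomputed from the aligned Query and Sbjct strings. The following is a minimal Python sketch, not the BLAST implementation; the function name `alignment_stats` is hypothetical. It assumes both strings have equal length, that `-` marks a gap, and that percentages are truncated to whole numbers, which is consistent with the figures shown above (for example 18/53 reported as 33%). Positives are omitted because they depend on the scoring matrix.

```python
def alignment_stats(query: str, sbjct: str) -> dict:
    """Count identities and gaps over two aligned, equal-length strings.

    Hypothetical helper for illustration; '-' is assumed to mark a gap.
    """
    if len(query) != len(sbjct):
        raise ValueError("aligned strings must have the same length")

    length = len(query)
    # A column is an identity when both residues match and neither is a gap.
    identities = sum(1 for q, s in zip(query, sbjct) if q == s and q != "-")
    # A column is a gap when either sequence has a dash at that position.
    gaps = sum(1 for q, s in zip(query, sbjct) if q == "-" or s == "-")

    return {
        "length": length,
        "identities": identities,
        "gaps": gaps,
        # Integer division truncates, matching the percentages in the report.
        "identity_pct": identities * 100 // length,
        "gap_pct": gaps * 100 // length,
    }


# Aligned segment of the fourth hit (query residues 106-158).
query = "GIPHEEIVKIAEDEGVDIIIMGSHGKTNLKEILLGSVTENVIKKSNKPVLVVK"
sbjct = "GLPEEVIPDLAEHLQAGIVVLGTVGRTGISAAFLGNTAEQVIDHLRCDLLVIK"
stats = alignment_stats(query, sbjct)
print(f"Identities = {stats['identities']}/{stats['length']} ({stats['identity_pct']}%)")
```

For alignments that span several blocks, such as the first three hits, the Query and Sbjct lines of all blocks would need to be concatenated before calling the function.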

Flat Query-Anchored with Identities


In the flat query-anchored with identities format, dots are used in place of residues in the database hits that are identical to the aligned residues in the query sequence. This format is useful for rapid inspection of aligned sequences for conserved residues. Dashes in the sequence represent gaps.


blast_tmp 4   MYKKILYPTDFS--ETAEIALKHVKAFKTLKAEEVI--LLHVIDEREIKKRDIFSLLLGV 59
1778525   1   ...T.IM.V.VF--.M-.LSD.A.RHAEF.AQDDGVIH....LPG---------.AS.SL 48
1742248   1   .NRT..V.I.I.DS.LTQRVIS..EEEAKIDDA..H--F.T..PSLP------YYAS..L 52
1787640   25  .NRT..V.I.I.DS.LTQRVIS..EEEAKIDDA..H--F.T..PSLP------YYAS..L 76

blast_tmp 60  AGLNKSVEEFENELKNKLTEEAKNKMENIKKELEDVGFKVKDIIVVGIPHEEIVKIAEDE 119
1778525   49  HRFAAD.RR..EH.QH----..QERLQTMVSHFTIDPSRI.QHVRF.SVRD.VNEL..EL 104
1742248   53  .------YSA.LPAMDD.KA...SQL.E.I.KFKLPTDR.HVHVEE.S.KDR.LEL.KKI 106
1787640   77  .------YSA.LPAMDD.KA...SQL.E.I.KFKLPTDR.HVHVEE.S.KDR.LEL.KKI 130
7429180   248                                               .L.E.V.PDL..HL 261

blast_tmp 120 GVDIIIMGSHGKTNLKEILLGSVTENVIKKSNKPVLVVK 158
1778525   105 .A.VVVI..R-NPSISTH....NASS..RHA.L.....R 142
1742248   107 PAHM..IA..-RPDITTY....NAAA.VRHAECS....R 144
1787640   131 PAHM..IA..-RPDITTY....NAAA.VRHAECS....R 168
7429180   262 QAG.VVL.TV.R.GISAAF..NTA.Q..DHLRCDL..I. 300
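
The dot substitution described above can be sketched in a few lines of Python. This is an illustration only, not BLAST's own formatter; the function name `mask_identities` is hypothetical. It assumes the query row and the hit row are already aligned to the same length and that `-` marks a gap. Columns where both rows contain a gap stay as dashes, as in the output above.

```python
def mask_identities(query: str, hit: str) -> str:
    """Replace hit residues identical to the aligned query residue with dots.

    Hypothetical helper for illustration. Gap characters ('-') and padding
    spaces are copied through unchanged, even where both rows have a gap.
    """
    if len(query) != len(hit):
        raise ValueError("aligned rows must have the same length")

    masked = []
    for q, h in zip(query, hit):
        if h == q and h not in "- ":
            masked.append(".")   # identical residue: show a dot
        else:
            masked.append(h)     # mismatch, gap, or padding: keep as is
    return "".join(masked)


# Last block of the alignment: query residues 120-158 against hit 1778525.
query_row = "GVDIIIMGSHGKTNLKEILLGSVTENVIKKSNKPVLVVK"
hit_row = "GADVVVIGSR-NPSISTHLLGSNASSVIRHANLPVLVVR"
print(mask_identities(query_row, hit_row))
```

Applied to each hit row of the format without identities, this produces the corresponding row of the format with identities.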

Flat Query-Anchored without Identities

In the flat query-anchored without identities format, all residues in the database hits (both identical and non-identical) are shown below the corresponding residues of the query sequence.

blast_tmp 4   MYKKILYPTDFS--ETAEIALKHVKAFKTLKAEEVI--LLHVIDEREIKKRDIFSLLLGV 59
1778525   1   MYKTIIMPVDVF--EM-ELSDKAVRHAEFLAQDDGVIHLLHVLPG---------SASLSL 48
1742248   1   MNRTILVPIDISDSELTQRVISHVEEEAKIDDAEVH--FLTVIPSLP------YYASLGL 52
7429293   25  MNRTILVPIDISDSELTQRVISHVEEEAKIDDAEVH--FLTVIPSLP------YYASLGL 76

blast_tmp 60  AGLNKSVEEFENELKNKLTEEAKNKMENIKKELEDVGFKVKDIIVVGIPHEEIVKIAEDE 119
1778525   49  HRFAADVRRFEEHLQH----EAQERLQTMVSHFTIDPSRIKQHVRFGSVRDEVNELAEEL 104
1742248   53  A------YSAELPAMDDLKAEAKSQLEEIIKKFKLPTDRVHVHVEEGSPKDRILELAKKI 106
7429293   77  A------YSAELPAMDDLKAEAKSQLEEIIKKFKLPTDRVHVHVEEGSPKDRILELAKKI 130
1742201   248                                               GLPEEVIPDLAEHL 261

blast_tmp 120 GVDIIIMGSHGKTNLKEILLGSVTENVIKKSNKPVLVVK 158
1778525   105 GADVVVIGSR-NPSISTHLLGSNASSVIRHANLPVLVVR 142
1742248   107 PAHMIIIASH-RPDITTYLLGSNAAAVVRHAECSVLVVR 144
7429293   131 PAHMIIIASH-RPDITTYLLGSNAAAVVRHAECSVLVVR 168
1742201   262 QAGIVVLGTVGRTGISAAFLGNTAEQVIDHLRCDLLVIK 300