BLASTP 2.2.21+

Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A.
Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J.
Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of
protein database search programs", Nucleic Acids Res. 25:3389-3402.

Reference for compositional score matrix adjustment: Stephen F.
Altschul, John C. Wootton, E. Michael Gertz, Richa Agarwala,
Aleksandr Morgulis, Alejandro A. Schaffer, and Yi-Kuo Yu (2005)
"Protein database searches using compositionally adjusted
substitution matrices", FEBS J. 272:5101-5109.

RID: 3H8FBTXF013

Database: All non-redundant GenBank CDS
translations+PDB+SwissProt+PIR+PRF excluding environmental samples
from WGS projects
           9,090,450 sequences; 3,112,979,771 total letters

Query= gi|71044445|gb|AAZ20765.1| glycosyltransferase [Escherichia coli]
Length=331

                                                                  Score     E
Sequences producing significant alignments:                       (Bits)  Value

gb|AAZ20765.1| glycosyltransferase [Escherichia coli]             666     0.0
ref|YP_002391830.1| hypothetical protein ECS88_2135 [Escheric...  186     3e-47
gb|AAX07749.1| unknown [Escherichia coli]                         159     6e-39
ref|YP_001463381.1| hypothetical protein EcE24377A_2321 [Esch...  43.9    3e-04
ref|NP_418086.1| lipopolysaccharide core biosynthesis protein...  40.8    0.002
ref|YP_002331339.1| lipopolysaccharide core biosynthesis prot...  40.8    0.002

ALIGNMENTS

>gb|AAZ20765.1| glycosyltransferase [Escherichia coli]
Length=331

 Score = 666 bits (1718), Expect = 0.0, Method: Compositional matrix adjust.
 Identities = 331/331 (100%), Positives = 331/331 (100%), Gaps = 0/331 (0%)

Query  1    MIKTRENGFVSDVKNVLIFCPKFFNYDINIRNAVQKNGYTVSLFNERPFDNIIGRALIRL  60
            MIKTRENGFVSDVKNVLIFCPKFFNYDINIRNAVQKNGYTVSLFNERPFDNIIGRALIRL
Sbjct  1    MIKTRENGFVSDVKNVLIFCPKFFNYDINIRNAVQKNGYTVSLFNERPFDNIIGRALIRL  60

Query  61   GFDFFLKRQIFKYYKNIYDNIFADIDYLIVINPECITPEILEIYRNKCNNIIVYMWDSFE  120
            GFDFFLKRQIFKYYKNIYDNIFADIDYLIVINPECITPEILEIYRNKCNNIIVYMWDSFE
Sbjct  61   GFDFFLKRQIFKYYKNIYDNIFADIDYLIVINPECITPEILEIYRNKCNNIIVYMWDSFE  120

Query  121  NKPQAKKLIRHADLFYTFDPNDAKNNNIIFKPLFYTKAYREIPSKPQLEYDISFIGTIHS  180
            NKPQAKKLIRHADLFYTFDPNDAKNNNIIFKPLFYTKAYREIPSKPQLEYDISFIGTIHS
Sbjct  121  NKPQAKKLIRHADLFYTFDPNDAKNNNIIFKPLFYTKAYREIPSKPQLEYDISFIGTIHS  180

Query  181  SRYQYVKSIATVKNKFIFFYCPSLFVFLFKKYIAREIECIKLKDVSFKSLSESEVLNVIK  240
            SRYQYVKSIATVKNKFIFFYCPSLFVFLFKKYIAREIECIKLKDVSFKSLSESEVLNVIK
Sbjct  181  SRYQYVKSIATVKNKFIFFYCPSLFVFLFKKYIAREIECIKLKDVSFKSLSESEVLNVIK  240

Query  241  RSRCILDVVHPKQNGLTIRTIEALGANKKIITTNRNVVKYDFYNPSNILVLDDTVNDVAI  300
            RSRCILDVVHPKQNGLTIRTIEALGANKKIITTNRNVVKYDFYNPSNILVLDDTVNDVAI
Sbjct  241  RSRCILDVVHPKQNGLTIRTIEALGANKKIITTNRNVVKYDFYNPSNILVLDDTVNDVAI  300

Query  301  AKFINQEYVHPDDEIYQSYYIENWVKDLLRE  331
            AKFINQEYVHPDDEIYQSYYIENWVKDLLRE
Sbjct  301  AKFINQEYVHPDDEIYQSYYIENWVKDLLRE  331

>ref|YP_002391830.1| hypothetical protein ECS88_2135 [Escherichia coli S88]
 emb|CAN87669.1| conserved hypothetical protein [Escherichia coli]
 emb|CAR03422.1| conserved hypothetical protein [Escherichia coli S88]
Length=332

 Score = 186 bits (472), Expect = 3e-47, Method: Compositional matrix adjust.
 Identities = 122/328 (37%), Positives = 187/328 (57%), Gaps = 23/328 (7%)

Query  16   VLIFCPKFFNYDINIRNAVQKNGYTVSLFNERPFDNIIGRALIRLGFDFFLKRQIFKYYK  75
            VL   PKFFNY+  I   ++     V  ++ERP +N I ++L+RLG    +K+    YYK
Sbjct  7    VLFISPKFFNYEKEIVKELELTN-DVIYWDERPSNNAIYKSLLRLGCSILIKKYNSSYYK  65

Query  76   NIYDNIF-ADIDYLIVINPECITPEILEI------YRNKCNNIIVYMWDSFENKPQAKKL  128
             +   I    ID + ++NPE I  EIL+        +N+    I+Y+WDS  NKP+ K++
Sbjct  66   KLLRQIVNKKIDNVFILNPEAIDHEILQAIKKTVKVKNEHCKFIMYLWDSVNNKPKVKRI  125

Query  129  IRHADLFYTFDPNDAKNNNIIFKPLFYTKAYREIPSKPQLEYDISFIGTIHSSRYQYVKS  188
            I   D  +TF+ +DA+   + F PLFY+  + +   +  L+YD+ FIGT HS R   V  
Sbjct  126  INQFDSVFTFERDDAEKWGVTFLPLFYSPRF-DNKHEHSLKYDLCFIGTAHSDRVDLVNC  184

Query  189  I-ATVKN-----KFIFFYCPSLFVFLFKKYIA-REIECIKLKDVSFKSLSESEVLNVIKR  241
            I A V+N      + FFY  +  +F  K  ++ ++I      DV F+ LS+SEV+ ++K 
Sbjct  185  ITAEVRNIKDINAYTFFYFQNEIIFKLKNLLSGKKINT----DVEFEPLSQSEVVEMMKC  240

Query  242  SRCILDVVHPKQNGLTIRTIEALGANKKIITTNRNVVKYDFYNPSNILVLDDTVNDVAIA  301
            S  I+D+ HP+Q GLT+RTIE +G NKKIITTN ++ KYDFY+P+ I V+      +   
Sbjct  241  SEIIIDIHHPRQRGLTMRTIECIGLNKKIITTNEDIKKYDFYDPNMICVVKRH-QVIVPE  299

Query  302  KFINQEYVHPDDEIYQSYYIENWVKDLL  329
             FIN   V+  +   + Y ++ WVK ++
Sbjct  300  NFINAPNVNYKNR--EQYSLQQWVKKII  325

>gb|AAX07749.1| unknown [Escherichia coli]
Length=336

 Score = 159 bits (401), Expect = 6e-39, Method: Compositional matrix adjust.
 Identities = 109/327 (33%), Positives = 172/327 (52%), Gaps = 17/327 (5%)

Query  16   VLIFCPKFFNYDINIRNAVQKNGYTVSLFNERPFDNIIGRALIRLG-FDFFLKRQIFKYY  74
            VL  CP+FFNY+  I + +++ G TV  F+E+PF+N+  + L+RL   + F+KR    Y+
Sbjct  3    VLFLCPRFFNYENEISDGLRRTGATVDYFDEKPFNNVFFKILLRLWKGNNFIKRISDAYF  62

Query  75   KNIYDNIFADIDYLIVINPECITPEILEIYRNKCNN--IIVYMWDSFENKPQAKKLIRHA  132
            + I      D DY+IV+  E +  + L  ++NK  N   I Y WDS +N P  ++ +   
Sbjct  63   EKILLQTNDDYDYVIVLKGESLDRKNLLKFKNKYKNAKFIYYAWDSIKNYPHIQECLNLF  122

Query  133  DLFYTFDPNDAKNNNIIFK-PLFYTKAYREIPSKPQ---LEYDISFIGTIHSSRYQYVKS  188
            D  +TFD NDA+  + +   PLFY+  +     K      +  I+F+GT+HS RY+ +  
Sbjct  123  DRVFTFDDNDAREYDFMTHLPLFYSPDFVSTAKKEASKNFKPSIAFLGTVHSDRYRVLGE  182

Query  189  I-ATVKNKF---IFFYCPSLFV---FLFKKYIAREIECIKLKDVSFKSLSESEVLNVIKR  241
            +    KN++      Y PS+ V   FL   +  + I   KL   + +S S+ ++ +    
Sbjct  183  VYEKYKNEYDLRFVLYFPSIVVLVGFLLTNF--KSIIRFKLFSFTLRSRSKKQIASFFSS  240

Query  242  SRCILDVVHPKQNGLTIRTIEALGANKKIITTNRNVVKYDFYNPSNILVLDDTVNDVAIA  301
            +  +LD+ HP+Q GLT+RTIE L   +K ITTN  V  YDFY+  N   +D   N +  +
Sbjct  241  ADAVLDIQHPRQTGLTMRTIECLPLKRKFITTNSRVKNYDFYSAENFYFIDRD-NILIDS  299

Query  302  KFINQEYVHPDDEIYQSYYIENWVKDL  328
             F    Y     +    Y I++WVK L
Sbjct  300  DFFEIPYNDAHLDAISRYSIDSWVKTL  326

>ref|YP_001463381.1| hypothetical protein EcE24377A_2321 [Escherichia coli E24377A]
 gb|ABV20316.1| conserved hypothetical protein [Escherichia coli E24377A]
Length=295

 Score = 43.9 bits (102), Expect = 3e-04, Method: Compositional matrix adjust.
 Identities = 30/103 (29%), Positives = 60/103 (58%), Gaps = 4/103 (3%)

Query  229  SLSESEVLNVIKRSRCILDVVHPKQNGLTIRTIEALGANKKIITTNRNVVKYDFYNPSNI  288
            ++S  E +N ++ ++ IL+V    Q GLT+R++E++  ++K+I+ N +++  DFY+ S I
Sbjct  193  AISYKENINKVQENKYILEVNTVGQVGLTLRSLESIFYSRKLISNNIDLMNCDFYDKSRI  252

Query  289  LVLDDTVNDVAIA---KFINQEYVHPDDEIYQSYYIENWVKDL  328
             + ++  ND+  +   +FI   Y   ++ I + Y   N  K L
Sbjct  253  FIFNNP-NDLFSSDFDEFIRLPYKPVENVILEKYKASNIFKKL  294

>ref|NP_418086.1| lipopolysaccharide core biosynthesis protein [Escherichia coli str. K-12 substr.
MG1655]
 ref|AP_004162.1| lipopolysaccharide core biosynthesis protein [Escherichia coli str. K-12 substr. W3110]
 ref|YP_001732457.1| lipopolysaccharide core biosynthesis protein [Escherichia coli str. K-12 substr. DH10B]
 ref|YP_002928516.1| lipopolysaccharide core biosynthesis protein [Escherichia coli BW2952]
 sp|P27126.1|RFAS_ECOLI RecName: Full=Lipopolysaccharide core biosynthesis protein rfaS
 gb|AAA24084.1| lipopolysaccharide core biosynthesis protein [Escherichia coli]
 gb|AAB18606.1| rfaS [Escherichia coli str. K-12 substr. MG1655]
 gb|AAC76653.1| lipopolysaccharide core biosynthesis protein [Escherichia coli str. K-12 substr. MG1655]
 dbj|BAE77663.1| lipopolysaccharide core biosynthesis protein [Escherichia coli str. K12 substr. W3110]
 gb|ACB04679.1| lipopolysaccharide core biosynthesis protein [Escherichia coli str. K12 substr. DH10B]
 gb|ACR63662.1| lipopolysaccharide core biosynthesis protein [Escherichia coli BW2952]
Length=311

 Score = 40.8 bits (94), Expect = 0.002, Method: Compositional matrix adjust.
 Identities = 32/93 (34%), Positives = 46/93 (49%), Gaps = 4/93 (4%)

Query  228  KSLSESEVLNVIKRSRCILDVVHPKQNGLTIRTIEALGANKKIITTNRNVVKYDFYNPSN  287
            K +S  E +     +  I+D+    Q+G T+R +EAL  NKK+IT N NV   + Y+ S 
Sbjct  205  KQISYEENIRRTLNANIIVDITKENQSGWTLRILEALFFNKKLITNNINVFGSEIYSESR  264

Query  288  ILVLDDTVNDVAIAKFINQEYVHPDDEIYQSYY  320
              ++     D  +  FIN   V P D  Y S Y
Sbjct  265  FFIIGHDDWD-KLEYFINSS-VKPMD--YDSLY  293

>ref|YP_002331339.1| lipopolysaccharide core biosynthesis protein [Escherichia coli O127:H6 str. E2348/69]
 emb|CAS11425.1| lipopolysaccharide core biosynthesis protein [Escherichia coli O127:H6 str. E2348/69]
Length=311

 Score = 40.8 bits (94), Expect = 0.002, Method: Compositional matrix adjust.
 Identities = 29/79 (36%), Positives = 42/79 (53%), Gaps = 4/79 (5%)

Query  242  SRCILDVVHPKQNGLTIRTIEALGANKKIITTNRNVVKYDFYNPSNILVLDDTVNDVAIA  301
            +  I+D+    Q+G T+R +EAL  NKK+IT N NV+  + Y+ S   ++     D  + 
Sbjct  219  ANIIVDITKENQSGWTLRILEALFFNKKLITNNINVLGSEIYSESRFFIIGHDDWD-KLE  277

Query  302  KFINQEYVHPDDEIYQSYY  320
             FIN   V P D  Y S Y
Sbjct  278  YFINSS-VKPMD--YDSLY  293

  Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF
  excluding environmental samples from WGS projects
    Posted date: Jun 16, 2009 5:41 PM
  Number of letters in database: 26,573,871
  Number of sequences in database: 84,272

Lambda      K        H
   0.325    0.143    0.427
Gapped
Lambda      K        H
   0.267   0.0410    0.140
Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 84272
Number of Hits to DB: 2337059
Number of extensions: 94338
Number of successful extensions: 242
Number of sequences better than 0.1: 0
Number of HSP's better than 0.1 without gapping: 0
Number of HSP's gapped: 242
Number of HSP's successfully gapped: 0
Length of query: 331
Length of database: 26573871
Length adjustment: 104
Effective length of query: 227
Effective length of database: 17809583
Effective search space: 4042775341
Effective search space used: 4042775341
T: 11
A: 40
X1: 15 (7.0 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 40 (20.0 bits)
S2: 80 (35.4 bits)