Gene ECH74115_4975 details

Gene Information

Locus tag: ECH74115_4975
Symbol:
ID: 6970910
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Escherichia coli O157:H7 str. EC4115
Kingdom: Bacteria
Replicon accession: NC_011353
Strand:
Start bp: 4622213
End bp: 4626979
Gene Length: 4767 bp
Protein Length: 1588 aa
Translation table: 11
GC content: 50%
IMG OID: 643388657
Product: Hep_Hag family protein
Protein accession: YP_002273084
Protein GI: 209397689
COG category: [U] Intracellular trafficking, secretion, and vesicular transport; [W] Extracellular structures
COG ID: [COG5295] Autotransporter adhesin
TIGRFAM ID:
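
The coordinate and length fields above can be cross-checked with simple arithmetic. The sketch below assumes the start and end positions are 1-based and inclusive, and that the protein length excludes a single terminal stop codon; neither convention is stated in the record itself, and the variable names are illustrative only.

```python
# Values copied from the Gene Information fields above.
start_bp = 4622213
end_bp = 4626979

# Assumption: coordinates are 1-based and inclusive, so both endpoints count.
gene_length = end_bp - start_bp + 1

# Assumption: the CDS ends with exactly one stop codon, which is not
# counted in the reported protein length.
codon_count = gene_length // 3
protein_length = codon_count - 1

print(gene_length)      # 4767
print(protein_length)   # 1588
```

Under these assumptions the computed values agree with the reported Gene Length (4767 bp) and Protein Length (1588 aa).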


Plasmid Coverage information

Num covering plasmid clones: 13
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 59
Fosmid unclonability p-value:
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGAACAAAA TATTTAAAGT TATCTGGAAC CCTGCGACAG GGAATTATAC TGTTACCAGC 
GAAACGGCAA AAAGCCGTGG CAAGAAATCT GGGCGCAGTA AGCTGTTAAT TTCTGCGCTG
GTTGCGGGTG GAATGTTGTC GTCGTTTGGG GCATTGGCGA ATGCCGGGAA TGACAACGGT
CAGGGTGTTG ATTACGGTAG TGGATCAGCT GGCGACGGCT GGGTTGCTAT AGGCAAAGGG
GCGAAAGCAA ATACTTTTAT GAACACCAGT GGTTCCAGTA CTGCTGTGGG TTATGACGCT
ATAGCTGAAG GCCAATATAG CTCTGCCATC GGGTCAAAAA CCCATGCGAT TGGTGGTGCA
TCAATGGCCT TTGGGGTTAG TGCAATATCA GAAGGCGATA GAAGTATAGC ACTGGGTGCC
TCTTCGTATT CATTGGGCCA ATACTCAATG GCCCTCGGCC GTTATTCAAA AGCATTGGGT
AAATTGTCTA TTGCTATGGG GGACTCTTCC AAAGCGGAAG GAGCAAACGC CATTGCCCTG
GGAAATGCCA CTAAAGCTAC TGAGATTATG AGTATTGCTC TTGGCGACAC CGCCAATGCG
TCAAAAGCGT ATTCAATGGC GCTGGGAGCA AGTAGCGTCG CATCTGAAGA AAACGCTATT
GCGATAGGTG CTGAGACCGA AGCCGCTGAA AATGCAACTG CTATTGGCAA TAATGCGAAG
GCAAAAGGGA CTAATAGCAT GGCAATGGGG TTCGGAAGCC TTGCCGATAA AGTCAATACT
ATCGCATTAG GAAATGGCAG CCAGGCTCTG GCAGATAATG CAATCGCCAT AGGCCAGGGC
AACAAAGCTG ATGGCGTGGA TGCCATCGCT CTGGGTAATG GTAGCCAGTC GAGAGGCTTA
AACACCATTG CCTTAGGCAC AGCCAGTAAT GCAACTGGTG ATAAGAGTCT TGCGCTTGGT
AGTAATAGCA GTGCCAACGG TATTAACTCT GTCGCGCTGG GCGCAGATTC CATTGCGGAT
TTAGACAATA CCGTCTCTGT CGGCAATAGT TCATTAAAAC GCAAGATCGT TAATGTGAAA
AATGGCGCGA TCAAGTCTGA CAGTTACGAT GCCATTAATG GTTCACAGCT TTATGCCATT
AGCGACTCGG TAGCAAAAAG GCTTGGAGGA GGGGCTGCAG TAGATGTTGA TGACGGTACT
GTTACAGCAC CAACCTACAA TTTAAAAAAT GGTAGCAAAA ATAACGTAGG GGCTGCGCTC
GCTGTACTTG ATGAAAACAC CCTGCAATGG GACCAAACCA AAGGCAAATA CAGCGCTGCT
CATGGTACTA GTAGCCCAAC TGCCAGCGTA ATCACCGATG TTGCGGATGG CACGATTTCA
GCCTCCAGTA AGGATGCGGT TAACGGTTCC CAACTGAAAG CTACCAATGA CGATGTCGAA
GCCAACACCG CCAATATCGC TACTAATACC AGCAACATTG CCACGAATAC GGCAAATATT
GCCACCAATA CCACCAATAT CACCAACCTG ACGGATTCCG TTGGTGACCT TCAGGCTGAT
GCCCTGCTCT GGAACGAAAC TAAAAAGGCA TTCAGTGCAG CTCACGGCCA GGATACCACC
AGCAAAATCA CCAACGTTAA AGATGCCGAC CTGACGGCTG ACAGCACTGA TGCTGTTAAC
GGCTCTCAGC TGAAAACCAC CAACGATGCT GTGGCGACGA ATACCACCAA TATCGCCAAT
AACACTTCCA ATATTGCCAC TAACACCACC AACATCTCTA ACCTGACTGA GACGGTGACT
AATCTTGGTG AGGATGCGCT GAAATGGGAT AAGGACAATG GTGTATTCAC GGCAGCTCAT
GGCACCGAGA CCACCAGCAA AATCACCAAC GTTAAAGATG GCGACCTGAC GACTGGCAGC
ACCGATGCCG TTAACGGCTC TCAGCTGAAA ACCACCAACG ATGCCGTGGC GACGAATACC
ACCAATATCG CCACTAACAC CACCAACATC TCTAATCTGA CTGAGACGGT GACTAATCTT
GGTGAGGATG CGCTGAAATG GGATAAGGAC AATGGTGTCT TCACTGCAGC TCATGGCAAC
AATACCGCCA GCAAAATCAC CAATATCCTG GACGGCACAG TCACTGCAAC CAGTTCCGAT
GCCATTAACG GTAGCCAGCT TTATGACTTA AGCAGCAATA TCGCCACCTA CTTCGGCGGC
AATGCTTCTG TGAATACTGA CGGTGTGTTT ACCGGTCCAA CCTACAAAAT CGGTGAAACA
AATTATTATA ACGTCGGCGA TGCACTGGCT GCGATTAACT CCTCATTTAG CACGTCTCTC
GGCGATGCTC TGCTTTGGGA TGCCACCGCA GGTAAATTCA GTGCCAAACA CGGTACTAAT
GGTGACGCAA GCGTGATCAC TGATGTCGCA GATGGTGAAA TTTCAGACTC CAGTTCTGAC
GCAGTAAACG GCTCACAACT CCACGGCGTG AGCAGTTATG TTGTTGATGC GCTGGGGGGT
GGTGCCGAAG TCAATGCAGA CGGCACCATC ACTGCGCCGA CGTACACCAT TGCTAATGCT
GATTACGATA ATGTCGGTGA TGCCCTGAAT GCTATCGATA CCACTCTTGA CGACGCTCTG
CTCTGGGATG CGGACGCCGG TGAAAATGGT GCATTTAGCG CCGCTCACGG AAAAGATAAA
ACTGCCAGTG TAATCACTAA CGTCGCTAAC GGTGCAATCT CTGCTGCCAG CAGCGACGCG
ATTAACGGCT CACAACTCTA TACCACCAAT AAGTACATCG CTGATGCGCT GGGTGGTGAC
GCAGAAGTCA ACGCTGACGG CACCATCACC GCACCGACTT ACACCATTGC GAACGCCGAG
TACAACAACG TCGGTGACGC CCTGGATGCG CTTGATGATA ACGCCCTGCT GTGGGATGAG
ACTGCCAATG GCGGTGCTGG AGCCTACAAT GCCAGCCATG ACGGTAAAGC CAGCATCATC
ACTAATGTCG CTAATGGCAG TATTAGTGAG GACAGTACCG ATGCAGTGAA CGGTTCTCAG
TTGAATGCGA CGAATATGAT GATTGAGCAG AACACCCAAA TTATCAATCA GCTCGCTGGT
AACACCGACG CAACCTATAT CCAAGAAAAC GGTGCGGGTA TTAACTATGT GCGTACTAAC
GACGACGGCT TAGCGTTCAA CGACGCCAGC GCACAGGGTG TTGGCGCTAC AGCTATAGGT
TATAACTCTG TCGCCAAAGG CGATAGCAGC GTAGCTATTG GTCAGGGCAG CTACAGCGAC
GTTGATACGG GTATCGCCCT GGGTAGCAGC TCTGTTTCCA GCCGAGTGAT TGCCAAAGGC
TCCCGTGACA CCAGCATAAC GGAAAATGGC GTTGTTATTG GTTACGACAC CACGGATGGC
GAACTGCTCG GTGCATTGTC TATCGGTGAT GACGGTAAAT ATCGTCAAAT CATCAACGTA
GCCGATGGTT CCGAAGCCCA TGACGCCGTT ACGGTTCGTC AATTGCAGAA TGCGATTGGT
GCGGTCGCAA CCACGCCGAC TAAATACTTC CACGCTAATT CAACGGAAGA AGATTCACTG
GCAGTGGGAA CTGACTCGCT GGCAATGGGT GCGAAAACCA TCGTGAATGG CGATAAAGGT
ATTGGTATCG GTTATGGTGC CTACGTGGAC GCGAATGCAC TTAACGGCAT TGCCATTGGT
AGCAATGCGC AAGTCATTCA TGTCAACAGT ATTGCGATAG GTAATGGTTC TACGACCACT
CGTGGCGCTC AAACCAATTA TACCGCCTAC AACATGGACG CACCGCAGAA CTCTGTCGGT
GAATTCTCAG TCGGTAGTGC GGATGGTCAA CGTCAGATCA CTAACGTCGC AGCAGGTTCG
GCTGATACCG ATGCGGTCAA CGTGGGTCAG TTGAAAGTAA CGGATGCGCA GGTTTCCCAG
AATACCCAGA GCATTACTAA CCTGGATAAT CGGGTAACGA ATCTTGATTC ACGCGTCACC
AATATCGAAA ACGGTATTGG CGATATCGTC ACCACCGGTA GCACCAAGTA CTTCAAGACC
AATACCGATG GTGTAGATGC CAGCGCGCAG GGTAAAGATA GCGTCGCGAT TGGTTCCGGC
TCCATTGCTG CCGCTGACAA CAGCGTCGCT CTGGGTACAG GGTCTGTGGC AACCGAAGAA
AATACGATCT CTGTAGGTTC CTCTACTAAC CAACGTCGTA TCACCAACGT AGCTGCAGGT
AAAAATGCTA CCGATGCTGT TAACGTGGCA CAGTTGAAGT CTTCCGAAGC TGGCGGTGTA
CGTTACGACA CCAAAGCTGA TGGTTCTATC GACTATAGCA ATATCACCCT CGGTGGCGGC
AACGGCGGTA CGACTCGTAT CAGCAACGTC TCCGCTGGCG TCAACAACAA CGACGTGGTG
AATTACGCGC AGTTGAAGCA AAGCGTGCAG GAAACGAAGC AATACACCGA TCAGCGAATG
GTTGAGATGG ATAACAAACT GTCTAAAACT GAAAGCAAGT TGAGCGGTGG TATCGCTTCT
GCAATGGCAA TGACCGGTCT GCCGCAGGCT TACACTCCAG GTGCCAGCAT GGCCTCTATT
GGTGGCGGTA CTTACAACGG TGAATCGGCA GTTGCTTTAG GTGTATCGAT GGTGAGCGCC
AATGGTCGTT GGGTCTACAA ATTACAAGGT AGTACCAATA GCCAGGGTGA ATACTCCGCC
GCACTCGGTG CCGGTATTCA GTGGTAA
 
Protein sequence
MNKIFKVIWN PATGNYTVTS ETAKSRGKKS GRSKLLISAL VAGGMLSSFG ALANAGNDNG 
QGVDYGSGSA GDGWVAIGKG AKANTFMNTS GSSTAVGYDA IAEGQYSSAI GSKTHAIGGA
SMAFGVSAIS EGDRSIALGA SSYSLGQYSM ALGRYSKALG KLSIAMGDSS KAEGANAIAL
GNATKATEIM SIALGDTANA SKAYSMALGA SSVASEENAI AIGAETEAAE NATAIGNNAK
AKGTNSMAMG FGSLADKVNT IALGNGSQAL ADNAIAIGQG NKADGVDAIA LGNGSQSRGL
NTIALGTASN ATGDKSLALG SNSSANGINS VALGADSIAD LDNTVSVGNS SLKRKIVNVK
NGAIKSDSYD AINGSQLYAI SDSVAKRLGG GAAVDVDDGT VTAPTYNLKN GSKNNVGAAL
AVLDENTLQW DQTKGKYSAA HGTSSPTASV ITDVADGTIS ASSKDAVNGS QLKATNDDVE
ANTANIATNT SNIATNTANI ATNTTNITNL TDSVGDLQAD ALLWNETKKA FSAAHGQDTT
SKITNVKDAD LTADSTDAVN GSQLKTTNDA VATNTTNIAN NTSNIATNTT NISNLTETVT
NLGEDALKWD KDNGVFTAAH GTETTSKITN VKDGDLTTGS TDAVNGSQLK TTNDAVATNT
TNIATNTTNI SNLTETVTNL GEDALKWDKD NGVFTAAHGN NTASKITNIL DGTVTATSSD
AINGSQLYDL SSNIATYFGG NASVNTDGVF TGPTYKIGET NYYNVGDALA AINSSFSTSL
GDALLWDATA GKFSAKHGTN GDASVITDVA DGEISDSSSD AVNGSQLHGV SSYVVDALGG
GAEVNADGTI TAPTYTIANA DYDNVGDALN AIDTTLDDAL LWDADAGENG AFSAAHGKDK
TASVITNVAN GAISAASSDA INGSQLYTTN KYIADALGGD AEVNADGTIT APTYTIANAE
YNNVGDALDA LDDNALLWDE TANGGAGAYN ASHDGKASII TNVANGSISE DSTDAVNGSQ
LNATNMMIEQ NTQIINQLAG NTDATYIQEN GAGINYVRTN DDGLAFNDAS AQGVGATAIG
YNSVAKGDSS VAIGQGSYSD VDTGIALGSS SVSSRVIAKG SRDTSITENG VVIGYDTTDG
ELLGALSIGD DGKYRQIINV ADGSEAHDAV TVRQLQNAIG AVATTPTKYF HANSTEEDSL
AVGTDSLAMG AKTIVNGDKG IGIGYGAYVD ANALNGIAIG SNAQVIHVNS IAIGNGSTTT
RGAQTNYTAY NMDAPQNSVG EFSVGSADGQ RQITNVAAGS ADTDAVNVGQ LKVTDAQVSQ
NTQSITNLDN RVTNLDSRVT NIENGIGDIV TTGSTKYFKT NTDGVDASAQ GKDSVAIGSG
SIAAADNSVA LGTGSVATEE NTISVGSSTN QRRITNVAAG KNATDAVNVA QLKSSEAGGV
RYDTKADGSI DYSNITLGGG NGGTTRISNV SAGVNNNDVV NYAQLKQSVQ ETKQYTDQRM
VEMDNKLSKT ESKLSGGIAS AMAMTGLPQA YTPGASMASI GGGTYNGESA VALGVSMVSA
NGRWVYKLQG STNSQGEYSA ALGAGIQW
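
As a rough illustration of how the gene sequence relates to the protein sequence, the sketch below translates the first 30 bases and the final 27 bases of the gene sequence shown above. It assumes the standard codon-to-amino-acid assignments, which translation table 11 shares with the standard code (table 11 differs in which start codons it permits). The helper name `translate` and the choice to stop at the first stop codon are illustrative assumptions, not part of the record.

```python
from itertools import product

# Codon table in the conventional TCAG ordering (first base varies slowest).
# Assumption: table 11 uses the same codon-to-amino-acid assignments
# as the standard genetic code.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1, stopping at the first stop codon.

    Whitespace (such as the 10-base grouping used above) is ignored.
    """
    dna = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First and last fragments of the gene sequence, copied from the record.
first_block = "ATGAACAAAA TATTTAAAGT TATCTGGAAC"
last_block = "GCACTCGGTG CCGGTATTCA GTGGTAA"

print(translate(first_block))  # MNKIFKVIWN
print(translate(last_block))   # ALGAGIQW
```

The two outputs match the first ten and last eight residues of the protein sequence listed above. The last fragment is in frame because the preceding 4740 bases are a multiple of three.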