Gene Information |
Locus tag | EcSMS35_3939 |
Symbol | |
ID | 6146732 |
Type | CDS |
Is gene spliced | No |
Is pseudo gene | No |
Organism name | Escherichia coli SMS-3-5 |
Kingdom | Bacteria |
Replicon accession | NC_010498 |
Strand | + |
Start bp | 4013819 |
End bp | 4018468 |
Gene Length | 4650 bp |
Protein Length | 1549 aa |
Translation table | 11 |
GC content | 53% |
IMG OID | 641618765 |
Product | haemagglutinin family protein |
Protein accession | YP_001745904 |
Protein GI | 170684255 |
COG category | [U] Intracellular trafficking, secretion, and vesicular transport [W] Extracellular structures |
COG ID | [COG5295] Autotransporter adhesin |
TIGRFAM ID | |
|
|
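The coordinate and length fields above can be cross-checked against each other. The following is a minimal sketch, assuming that Start bp and End bp are 1-based inclusive coordinates and that the protein length excludes the terminal stop codon; the function names `span_length` and `expected_protein_length` are hypothetical helpers, not part of any database API.

```python
# Values copied from the Gene Information record above.
START_BP = 4013819
END_BP = 4018468
GENE_LENGTH_BP = 4650
PROTEIN_LENGTH_AA = 1549


def span_length(start: int, end: int) -> int:
    """Feature length, assuming 1-based inclusive coordinates."""
    return end - start + 1


def expected_protein_length(gene_length_bp: int) -> int:
    """Codon count minus one for the stop codon (assumes an unspliced CDS)."""
    if gene_length_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_length_bp // 3 - 1


computed_length = span_length(START_BP, END_BP)
computed_protein = expected_protein_length(computed_length)
print(computed_length, computed_protein)  # 4650 1549
```

Under these assumptions both computed values agree with the Gene Length and Protein Length fields of the record.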
Plasmid Coverage information |
Num covering plasmid clones | 11 |
Plasmid unclonability p-value | 1 |
Plasmid hitchhiking | No |
Plasmid clonability | normal |
| |
Fosmid Coverage information |
Num covering fosmid clones | 42 |
Fosmid unclonability p-value | 0.78822 |
Fosmid hitchhiking | No |
Fosmid clonability | normal |
| |
Sequence |
Gene sequence | ATGAACAAAA TTTTTAAAGT TATCTGGAAT CCAGCGACGG GCAGTTACAG CGTTGCCAGC GAAACGGCGA AAAGTCGTGG GAAGAAGAGC GGGCGCAGTA AGCTGTTAAT TTCTGCACTG GTTGCGGGTG GATTATTGTC GTCGTCTGGG GCATTAGCTC AGGCAGGGTT AGATACAGGT ACTGGTGTTA CCCCTGCTGG TCATAATAAT GGAACAGGTT GGATAGCTAT TGGTACCGAT GCTGAAGCAA GCACTCATAC CACAACGAAT GGCGCAGCAA CTGCCGTGGG CTACTACTCC AAAGCGTTAG GTATGTGGAG TACTGCGTTA GGTGCATATA GCGAATCGAA TGGCAATGCT TCACTGGCTC TTGGGGTTAA AGCACAATCT AACGGTGACC GCTCTATTTC GATGGGCGCT TCGTCGAGTG CAAGCATAAA TGCGGGTTAC TCCATCGCGA TGGGGGTATT TGCTTTCACC GATGCAGAGT ACGCGGTAGC CCTTGGCAAT GAGAGCAAGG CACTTGGTAA ATACAGCCTT GCATTAGGAA ACGCAAGTCA GGCATCTGGC GAATCCAGTA TTGCATTAGG TAACACAAGT GAAGCCAGCG AACAAAACGC GATTGCGCTG GGGCAAGGCA GCATTGCGAG CAAAGTGAAC TCAATCGCGT TGGGCAGTAA CAGTTCGTCT GCAGGAGAGA ATGCCATCGC GCTAGGAGAG GGCAGTGCCG CGGGAGGCAG CAACAGTCTA GCTTTCGGTA GTCAATCCAG GGCAAGCGGC AATGATTCTG TCGCCCTCGG TGTAGGGGCT ACAGCAGCGA CCGACAATTC TGTCGCTATC GGCGCAGGAT CGACCACTGA TGCAAGCAAT ACAGTTTCAG TTGGCAACAG CACAACAAAA CGCAAAATTG TTAATATGGC TGCCGGTGCC ATAAGCAACA CCAGTACCGA TGCCATCAAC GGCTCACAGC TTTATACGAT CAGTGATTCA GTCGCCAAAC GGCTCGGGGG AGGCGCTACT GTAGGCAGCG ACGGCACCGT AACCGCACTA AGCTACGCGT TGAAAAGCGG CACCTATAAT AACGTGGGTG ATGCTCTGTC AGGAATCGAC AATAATACTC TGCAATGGAA TAGAACCGCG GGGGCATTCA CCGCTGCACA CGGCTCAAAT ACCACCAGTA AAATCACTAA TGTTGCTAAA GGTACGGTTT CTGCAACCAG CACCGATGTA GTAAACGGCT CTCAATTGTA CGACCTGCAG CAGGATGCTC TGTTGTGGAA CGGCACCGCG TTCAGCGCTG CACACGGCAC CGACGTCACC AGCAAAATCA CCAATGTCAC CGCTGGTGAG CTCTCTGACA CCAGCACCGA CGCCGTCAAC GGTTCTCAGC TGAAAGCGAC CAAGGACGAT GTGGCGGCAA ACACCACCAA CATCACTAAC CTGACGAGCG AAGTGGCTGG CAACACCACC AGTATCACTA ACCTGACTGA TACGGTGACT AACCTCGGTG AAGACGCCCT GAAATGGGAC GACGCCGCAG GCGCATTCAC CGCTGCACAC GGCACTAACG CCACCAACAA AATCACCAAT GTCACCGCTG GCGAACTCTC TGATACCAGC ACCGACGCCG TCAACGGTTC TCAGCTGAAA ACCACCAACG ATAACGTGGC GACCAACACC ACCAATATCG CCACTAACAC CACCAATATC ACCAACCTGA CTAACGCTGT TGACAGTCTC GGTGATGATT CCCTGCTGTG GAACAAAGCG GCTGGCGCAT TCAGCGCCGC GCACGGCACC 
GATGCCACCA GCAAAATCAC CAACGTCACC GCTGGCGACC TGACTGCTGG CAGCACCGAC GCCGTCAACG GTTCTCAGCT GAAAACCACC AACGATAACG TGGCGACCAA CACCACCAAT ATCGCCACTA ACACCACCAA TATCACCAAC CTGACTAACG CTGTTGACAG TCTTGGTGAT GATTCCCTGC TGTGGAACAA AGCGGCTGGC GCATTCAGCG CCGCGCACGG CACCGATGCC ACCAGCAAAA TCACCAACGT CACCGCTGGC GACCTGACTG CTGGCAGCAC TGACGCGGTT AACGGCTCCC AGCTGAAAAC CACCAACGAT AACGTGGCGA CCAACACCAC CAATATCGCC ACTAACACCA CCAATATCAC TAACCTGACT GATACGGTGA ATAATCTCGG TGAAGACGCC CTGAAATGGG ACGACGCCGC AGGCGCATTC ACCGCTGCAC ACGGTACTAA CGCCACCAAC AAAATCAGCA ACGTACAAGC TGGCATAGTC TCCTCTGACA GCACTGACGC CATAAATGGC TCACAACTAT ATGGTTTGGC TGATTCATTC ACGTCCTATC TGGGTGGTGG TGCTGATATT AGCGATACAG GTGTATTAAC CGGGCCAACC TATAGTATTG GCGGCACTGA CTACACTAAC GTCGGGGATG CTCTGGCCGC AATTAACACT TCATTTAGTG ATTCTCTCGG TGATGCCCTG CTCTGGGATG CGACAGCCAA TGACGGTGCT GGTGCATTCA GCGCCGGTCG CGGGGTAGAT AACACCGCCA GTAAGATTAC TAACGTCGCA AATGGTGCAA TCTCTGCCAC CAGCAGCGAC GCGATTAACG GCTCACAACT CTATACCACC AATAAGTACA TCGCTGATGC GCTGGGCGGT AACGCAGAAG TCAACGCTGA CGGCACTATC ACTGCGCCGA CTTACACCAT TGCAAATACC GATTACAACA ACGTCGGTGA AGCTCTGGAT GCGCTTGATG AGAACGCGTT GCTGTGGGAT GCGACAGCCA ATAACGGCGA AGGGGCTTAC AACGCCAGTC ATGATGGCAA AGCCAGCATC ATCACTAATG TCGCTGATGG TAATATCGGG GAAGGCAGCA CCGATGCTAT CAACGGTTCT CAGCTGTTTA ACACCAATAT GCTGATCCAG CAGAACAGCG AAGTCATTAA TCAGCTTGCT GGTAACACCA GTGAAACCTA CATCGAAGAA AATGGTGCAG GTATTAACTA TGTGCGTACC AATGACACCG GTTTAACCTT CACCGATGCC AGCGCACAGG GTGTTGGCGC TACAGCAGTG GGTTATAACT CTGTTGCTTC CAAAGCCAGC AGCGTAGCCA TTGGTCAGGA CAGCCGCAGC GAAGTTGAGA CGGGTATCGC CCTGGGTAGC AGTTCCGTTT CCAGCCGTTT AATAGTTAAA GGTTCTCGTG ACACCAGCGT GTCGGAAGAA GGTGTTGTGA TTGGTTATGA CACAACTGAT GGTGAACTGC TTGGCGCATT GTCGATCGGT GACGATGGTA AATATCGTCA AATCATCAAC GTAGCCGATG GTTCCGAAGC CCATGACGCC GTTACGGTTC GCCAGTTGCA AAACGCTATT GGCGCGGTCG CCACTACGCC AACCAAGTAC TATCACGCCA ACTCAACGGC AGAAGACTCA CTGGCAGTCG GTGAAGACTC GCTGGCAATG GGCGCAAAAA CCATCGTTAA TGGTAATGCG GGTATTGGTA TCGGCCTGAA CACTTTAGTT CTGGCTGATG CGATCAATGG TATTGCTATC GGTAGCAACG 
CAAGTGCAAA CCATGCAAAC AGCATTGCAA TGGGGAATGG TTCTCAGACT ACCCGTGGTG CGCAGACCAA CTACAGCGCC TACAACATGG ACGCACCACA GAACTCTGTG GGTGAGTTCT CTGTCGGCAG TGAAGACGGT CAACGTCAGA TCACCAACGT CGCGGCTGGT TCAGCGGATA CCGATGCGGT TAACGTGGGT CAGTTGAAAG TAACGGACGC GCAGGTTTCC CAGAATACCC AGAGCATTAC TAACCTGAAC AATCAGGTGA CGAATCTGGA TACTCGCGTG ACCAATATCG AAAACGGCAT TGGCGATATC GTAACCACCG GTAGCACCAA ATACTTCAAG ACCAACACCG ATGGCGTAGA TGCCAACGCG CAGGGTAAAG ACAGTGTTGC AATTGGTTCT GGTTCCATTG CTGCCGCTGA CAACAGCGTC GCGCTGGGTA CCGGTTCCGT GGCCAACGAA GAAAACACCA TCTCTGTGGG TTCTTCTACC AACCAGCGCC GTATCACCAA CGTTGCTGCC GGTGTTAATG CCACCGATGC GGTTAACGTT TCACAACTGA AGTCTTCTGA AGCAGGCGGC GTTCGCTACG ACACCAAAGC TGATGGCTCT ATCGACTACA GCAACATCAC TCTCGGTGGC GGAAATGGCG GTACGACTCG CATCAGCAAC GTTTCTGCTG GCGTGAACAA CAACGACGCG GTGAACTATG CGCAGCTGAA GCAAAGTGTG CAGGAAACGA AGCAATACAC CGATCAGCGC ATGGTTGAGA TGGATAACAA ACTGTCCAAA ACCGAAAGCA AGTTGAGCGG TGGTATCGCT TCCGCAATGG CAATGACCGG TCTGCCGCAG GCTTACACGC CGGGAGCCAG CATGGCCTCT ATTGGTGGTG GTACTTACAA CGGTGAATCG GCTGTTGCTT TAGGTGTGTC GATGGTGAGC GCCAATGGTC GTTGGGTCTA CAAATTACAA GGTAGTACCA ATAGCCAGGG TGAATACTCC GCCGCACTCG GTGCCGGTAT TCAGTGGTAA
|
Protein sequence | MNKIFKVIWN PATGSYSVAS ETAKSRGKKS GRSKLLISAL VAGGLLSSSG ALAQAGLDTG TGVTPAGHNN GTGWIAIGTD AEASTHTTTN GAATAVGYYS KALGMWSTAL GAYSESNGNA SLALGVKAQS NGDRSISMGA SSSASINAGY SIAMGVFAFT DAEYAVALGN ESKALGKYSL ALGNASQASG ESSIALGNTS EASEQNAIAL GQGSIASKVN SIALGSNSSS AGENAIALGE GSAAGGSNSL AFGSQSRASG NDSVALGVGA TAATDNSVAI GAGSTTDASN TVSVGNSTTK RKIVNMAAGA ISNTSTDAIN GSQLYTISDS VAKRLGGGAT VGSDGTVTAL SYALKSGTYN NVGDALSGID NNTLQWNRTA GAFTAAHGSN TTSKITNVAK GTVSATSTDV VNGSQLYDLQ QDALLWNGTA FSAAHGTDVT SKITNVTAGE LSDTSTDAVN GSQLKATKDD VAANTTNITN LTSEVAGNTT SITNLTDTVT NLGEDALKWD DAAGAFTAAH GTNATNKITN VTAGELSDTS TDAVNGSQLK TTNDNVATNT TNIATNTTNI TNLTNAVDSL GDDSLLWNKA AGAFSAAHGT DATSKITNVT AGDLTAGSTD AVNGSQLKTT NDNVATNTTN IATNTTNITN LTNAVDSLGD DSLLWNKAAG AFSAAHGTDA TSKITNVTAG DLTAGSTDAV NGSQLKTTND NVATNTTNIA TNTTNITNLT DTVNNLGEDA LKWDDAAGAF TAAHGTNATN KISNVQAGIV SSDSTDAING SQLYGLADSF TSYLGGGADI SDTGVLTGPT YSIGGTDYTN VGDALAAINT SFSDSLGDAL LWDATANDGA GAFSAGRGVD NTASKITNVA NGAISATSSD AINGSQLYTT NKYIADALGG NAEVNADGTI TAPTYTIANT DYNNVGEALD ALDENALLWD ATANNGEGAY NASHDGKASI ITNVADGNIG EGSTDAINGS QLFNTNMLIQ QNSEVINQLA GNTSETYIEE NGAGINYVRT NDTGLTFTDA SAQGVGATAV GYNSVASKAS SVAIGQDSRS EVETGIALGS SSVSSRLIVK GSRDTSVSEE GVVIGYDTTD GELLGALSIG DDGKYRQIIN VADGSEAHDA VTVRQLQNAI GAVATTPTKY YHANSTAEDS LAVGEDSLAM GAKTIVNGNA GIGIGLNTLV LADAINGIAI GSNASANHAN SIAMGNGSQT TRGAQTNYSA YNMDAPQNSV GEFSVGSEDG QRQITNVAAG SADTDAVNVG QLKVTDAQVS QNTQSITNLN NQVTNLDTRV TNIENGIGDI VTTGSTKYFK TNTDGVDANA QGKDSVAIGS GSIAAADNSV ALGTGSVANE ENTISVGSST NQRRITNVAA GVNATDAVNV SQLKSSEAGG VRYDTKADGS IDYSNITLGG GNGGTTRISN VSAGVNNNDA VNYAQLKQSV QETKQYTDQR MVEMDNKLSK TESKLSGGIA SAMAMTGLPQ AYTPGASMAS IGGGTYNGES AVALGVSMVS ANGRWVYKLQ GSTNSQGEYS AALGAGIQW
|
| |
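The relationship between the gene sequence and the protein sequence can be spot-checked by translating a few codons. This is a minimal sketch, not a full translation tool: it assumes that translation table 11 assigns the same amino acid to each codon as the standard code (the two differ in permitted start codons), and the names `CODON_TABLE` and `translate` are hypothetical. Only the first 30 and the last 12 nucleotides of the gene sequence above are used.

```python
from itertools import product

# Codon table in the conventional T, C, A, G ordering (third base varies fastest).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"  # first base T
    "LLLLPPPPHHQQRRRR"  # first base C
    "IIIMTTTTNNKKSSRR"  # first base A
    "VVVVAAAADDEEGGGG"  # first base G
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string; whitespace between blocks is ignored."""
    seq = "".join(dna.split()).upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


# First three 10-nt blocks and the final 12 nt of the gene sequence.
first_codons = "ATGAACAAAA TTTTTAAAGT TATCTGGAAT"
last_codons = "ATTCAGTGGTAA"

print(translate(first_codons))  # MNKIFKVIWN
print(translate(last_codons))   # IQW*
```

The first result matches the opening block of the protein sequence, and the second ends in a stop codon (`*`) after the final residues `IQW`.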