Gene Information |
Locus tag | EcHS_A3814 |
Symbol | |
ID | 5593200 |
Type | CDS |
Is gene spliced | No |
Is pseudo gene | No |
Organism name | Escherichia coli HS |
Kingdom | Bacteria |
Replicon accession | NC_009800 |
Strand | + |
Start bp | 3804828 |
End bp | 3809852 |
Gene Length | 5025 bp |
Protein Length | 1674 aa |
Translation table | 11 |
GC content | 50% |
IMG OID | 640922926 |
Product | putative haemagglutinin |
Protein accession | YP_001460404 |
Protein GI | 157163086 |
COG category | [U] Intracellular trafficking, secretion, and vesicular transport; [W] Extracellular structures |
COG ID | [COG5295] Autotransporter adhesin |
TIGRFAM ID | |
|
|
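The coordinate and length fields in the table above are mutually consistent, which can be checked with a little arithmetic. The sketch below assumes that Start bp and End bp are 1-based, inclusive coordinates and that the final codon of the CDS is a stop codon that is not counted in the protein length; the record does not state either convention explicitly. The function names are illustrative only.

```python
# Values copied from the Gene Information table.
START_BP = 3804828
END_BP = 3809852
GENE_LENGTH_BP = 5025
PROTEIN_LENGTH_AA = 1674


def span_length(start: int, end: int) -> int:
    """Length of a feature, assuming 1-based inclusive coordinates."""
    return end - start + 1


def residues_from_cds(gene_length_bp: int) -> int:
    """Amino-acid count of a CDS, assuming the last codon is a stop codon."""
    if gene_length_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_length_bp // 3 - 1


if __name__ == "__main__":
    # Both checks should agree with the values reported in the table.
    print(span_length(START_BP, END_BP) == GENE_LENGTH_BP)
    print(residues_from_cds(GENE_LENGTH_BP) == PROTEIN_LENGTH_AA)
```

Under these assumptions, 3809852 - 3804828 + 1 gives the reported 5025 bp, and 5025 / 3 - 1 gives the reported 1674 aa.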
Plasmid Coverage information |
Num covering plasmid clones | 14 |
Plasmid unclonability p-value | 0.0557041 |
Plasmid hitchhiking | No |
Plasmid clonability | normal |
| |
Fosmid Coverage information |
Num covering fosmid clones | n/a |
Fosmid unclonability p-value | n/a |
Fosmid Hitchhiker | n/a |
Fosmid clonability | n/a |
| |
Sequence |
Gene sequence | ATGAACAAAA TATTTAAAGT TATCTGGAAC CCTGCCACAG GGAATTATAC TGTTACCAGC GAAACGGCAA AAAGCCGTGG CAAGAAATCT GGGCGCAGTA AGCTGTTAAT TTCTGCGCTG GTTGCGGGTG GGTTGTTGTC ATCGTTTGGG GCGCTGGCTG ATAATTACGA CGGTCAGGGC GTTGATTACG GCGATGGCTC AGCTAGTGAC GGCTGGGTTG CTATCGGTAA AGGTGCAAAA GCAAATATTT TTTTAAACAA CGCTGGTGCC AGTACCGCTT TAGGTTATGA CGCTATAGCT GAAGGCCAAT ATAGCTCTGC CATCGGCTCA AAAACCCATG CTATTGGTGG TGCATCAATG GCCTTTGGGG TTAGTGCAAT CTCTGAAGGT GACAGGAGTA TCGCGCTGGG TGCATCGTCG TATTCATTCG GCCAATACTC AATGGCCCTT GGTCGTTATT CCAAAGCGCT GGGTAGATTG TCTATAGCTA TGGGGGATAG CTCCAAAGCG GATGGAGCAA ACGCCATTGC GCTGGGAAAT GCCGCTAAGG CGGCTGGTAT TATGAGCATC GGTCTCGGTG ATAATGCCAA TGCGTCACAA GATTATGCTA TGGCGCTGGG AGCAGAAAGC GAAGCCGCTG AAAATGCGAC CGCTATCGGC AATAAAGCGC ATGCAAAAGG AGTGAATAGC ATCGCGTTGG GTAATGGCAG CCAGGCTCTG GCAGATAGTG CAATTGCTAT TGGCCAGGGC AACAAGGCTA ACGGCGCTGA TGCTATCGCT CTGGGTAATG GTAGCCAGTC GAGTGGCTTA AACGCTATTG CCTTAGGTAA GGCCAGTGTT GTAACTGGCG ATAATAGTCT TGCGCTTGGG AGCAATACCA ATGCCAACGG TATTAACTCT GTCGCGCTGG GCGCAGGTTC TATTGCGGAT CAAGACGATA GCGTTTCTGT CGGCAGTGAC TCATTACAAC GCAAGATCGT TAATGTAAAA AATGGCACGA TCAAGGCTGA TAGCCACGAC GCCATTAATG GCTCACAGCT TTATGCCATC AGCGACTCGG TGGCAAAACG GCTTGGCGGG GGTTCATCGG TAAACGTTGA CGACGGTACT GTTAAAGCAC CAACCTACAA TTTAAAAAAT GGTAACAAAA ATAACGTTGG CGATGCGCTC ACTGTACTCG ATCAATTCAC CCTGCAATGG GATCAAAACA GAGACAAATA CAGCGCCGCT CATGGTAGCT CAACTGCCAG CGTAATCACC GATGTGGCGG ATGGCGCAGT TTCAGATTCC AGTAAAGACG CGGTTAACGG TTCACAACTG AAAGCCACCA ATGATGATGT TGAGACCAAT ACCACCAATA TCGCTACCAA TACCGGAAAT ATTGCAACTA ACACTGCAAA TATTGCTACC AACACCACCA ATATCACTAA TTTGACAGAT ACTGTTGGCG ACCTTAAAGA TGATGCCCTG CTCTGGAATG GCACTGCATT CAACGCCGCT CATGGTACGG AAACCACCAG CACCATCACC AACGTGAAAG CTGGTACCCT TTCGGATGAC AGTACTGACG CGGTTAATGG CTCGCAACTG AAAGACACTA ACGATAACGT GGCGACCAAC ACCACCAATA TCGCTAGCAA CACGGCTAAC ATTGCCACTA ACACCAGTAA TATTGCCGAT AACACTGCCA ACATTGCTAC CAACACCAGT AATATTGCCG ATAACACTGC CAACATTGCG ACCAACACCA GCAATATTGC AGGTAACACT GCCAACATTG CTACCAATAC CACTAATATC 
GCGGCTAACA CCACAAGCAT AAATAGCCTG AACACGTCTG TGGATGCTCT TGAACAGGAT GCTATGCTCT GGAACGGCAC TGCATTCAAC GCCGCTCATG GTACGGAAAC CACCAGCACC ATCACTAATG TGAAAGCTGG CACCCTTTCG GATGACAGTA CTGACGCGGT TAATGGCTCG CAACTGAAAG CTACTAACGA TAACGTGGCA ACCAACACCA CCAATATCGC CAGCAATACG GCTAACATTG CGACCAACAC CGCCAATATT AATACTCTCA ATACGTCGAT CGATACCCTT GAGCAGGATG CAATACTCTG GAACGGAACA GCGTATAGCG CCGCGCACGG TACAGAAACA GCCAGCACTA TTACCAACGT TAAAGCAGGT ACTTTATCTG AAAATAGTAC GGACGCAGTT AACGGTGCGC AGCTGAACGC GACTAACGCG AATGTGGCGA CCAACACCAC CAATATTGCT ACTAACACCG CCAGCATTAA TACTCTGAAT ACATCCATTG ACGCTCTGGA ACAGGATGCG CTGCTTTGGG ATGGCACTGC ATTCAGTGCA GCGCATGGTG CAAACAAAGA TGCCAGCAAA ATCACCAATG TCCTGGCAGG GACTGTATCA TCCGCCAGTA CTGATGCCAT TAATGGCAGC CAGCTTCATG GGTTGAGCAG TTCGATCGCC ACCTATCTTG GTGGTGGTGC CACTGTGAGC GATTCTGGCG TATTTAGCGG CCCTACCTAT AACATTGATG GTAATGATTA CACCAATGTT GGTGCTGCAC TTGACGCCAT TAACACCTCA CTTAGCGACT CACTCGGCGA TGCCCTCCTT TGGGATAGCA CGACTGGCGC ATTTAGTGCC AAACACGGTT CTACCGCCAG CGTAATCACT AACGTCGCAG ACGGTGCAGT CTCTGACTCT AGCTCTGATG CTGTGAATGG TTCACAACTG TACGATGTAA GCAACTCTGT TGTCGATGTT CTGGGTGGTG GTGCTGGCGT GAATACGGAT GGCAGCATCA GTGCACCAAC GTACACCATT GCTAACACTG ATTACGATAA TGTCGGCGAT GCCCTAAACG CGCTTGATAC CACTCTTGAC GATGCGATGC TATGGGATGC TACCGCAGGT GAAAATGGTG CCTTCAGTGC CAGCCACGAT GGCAGTGCCA GCAAAATAAC TAACGTCGCG GCAGGAACAA TTTCTGACAC CAGCACCGAT GCGGTTAATG GGGCTCAACT CCACGGCGTT AGCAGTTCCG TCGCTGAGGC TCTTGGTGGT GGTGCGGCGG TGAATTCTGA CGGCAGCATC AGCGCACCGA CTTACACCAT TGCTGATACC GACTATACCA ACGTCGGCGA TGCGATGAAT GCAATTGACT CAACCCTGGA TAACGCCTTG CTTTGGGACG CTACCGCAGG TGAAAACGGT GCGTTTAACG CCAGCCATGA TGGCAAAGCC AGTGTCATCA CCAACGTAGC CAATGGTCAG ATTAACGAAA CCAGCACCGA CGCGGTTAAC GGTTCCCAGT TAAATGCCAC GAATATGTTG ATCCAGAATA TTGCCGGTGA TACCAGCGAA AGCTATATCA CAGAGAACGG TGAAGGTATC AACTATGTGC GTACTAACGA CAGCGGCTTA GTCTTTGAGG ATGCCAGTGC TACGGGTGTG GGCGCTACAG CTGTAGGTTA TAACTCCGTC GCATCAGGCG ATAGCAGTGT GGCAATTGGA CAGAACAGCA GCAGTACTAT TGAATCTGGA ATTGCTCTGG GCAGCAGTTC TGTGTCTAAC CGTGTCATCC 
TTCAGGGCTC CCGTGATACC AGCGTAACCG AAGATGGCGT AGTCATTGGT TATAACACCA GTGATGGCGA ACTGCTTGGC GCGTTGTCAA TTGGTGATGA CGGTAAATAT CGTCAAATCA TCAACGTCGC CGATGGTTCC GAAGCCCATG ACGCCGTTAC GGTTCGCCAG TTGCAGAATG CGATTGGTGC TGTAGCAACG ACACCGTCCA AATACTTCCA TGCTAATTCA ACGGAAGAAG ATTCACTGGC TGTCGGTGAA GACTCACTGG CAATGGGTGC GAAAACCATT GTTAATGGTG ATGCCGGGAT CGGTATTGGC CTGAACACTC TGGTGTTAAC TGATGCAATC AACGGTATTG CTATCGGTAG CAACGCGAGT GCAAATCATG CAAACAGTAT TGCGATGGGT AGTGGTTCCC AGACCACCCG TGGTGCGCAG ACTGACTACA CCGCCTACAA CATGGACGCG CCGCAGAATT CTGTCGGTGA ATTCTCTGTC GGCAGCGAAG ACGGTCAACG TCAGATCACC AACGTCGCGG CTGGTTCAGC GGATACCGAT GCGGTTAACG TAGGTCAGTT GAAAGTCACT GATGAGCGCG TAGCGCAAAA TACCCAGAGC ATTACTAACC TGAACAATCA GGTCACTAAT CTGGATACTC GCGTTACTAA TATCGAAAAC GGTATTGGCG ACATTGTCAC CACCGGTAGC ACCAAGTACT TCAAGATCAA CACCGATGGC GTAGATGCCA ACGCCCAGGG TAAAGATAGC GTTGCTATTG GTTCTGGTTC CATTGCTGCC GCTGACAACA GCGTCGCACT GGGTACCGGT TCCGTTGCAG ATGAAGAAAA TACAATCTCT GTAGGTTCTT CCACTAACCA ACGCCGTATT ACTAACGTTG CCGCAGGTAA AAATGCTACC GATGCTGTTA ACGTTGCGCA GTTGAAGTCT TCTGAGGCGG GCGGCGTGCG TTACGACACC AAAGCTGATG GTTCTATCGA CTATAGCAAT ATCACCCTCG GTGGCGGCAA CGGTGGTACG ACTCGTATCA GCAACGTCTC CGCTGGCGTC AACAACAACG ACGCGGTGAA CTACGCGCAG TTGAAGCAAA GCGTGCAGGA AACGAAGCAA TACACCGATC AGCGGATGGT TGAGATGGAT AACAAACTGT CTAAAACCGA AAGCAAGTTG AGTGGTGGTA TCGCTTCTGC AATGGCAATG ACCGGTCTGC CGCAGGCTTA TACACCGGGT GCCAGCATGG CTTCTATTGG TGGCGGTACT TACAACGGTG AATCGGCAGT TGCTTTAGGT GTATCGATGG TGAGCGCCAA TGGTCGTTGG GTCTACAAAT TACAAGGTAG TACCAATAGC CAGGGTGAAT ACTCCGCCGC ACTCGGTGCC GGTATTCAGT GGTAA
|
Protein sequence | MNKIFKVIWN PATGNYTVTS ETAKSRGKKS GRSKLLISAL VAGGLLSSFG ALADNYDGQG VDYGDGSASD GWVAIGKGAK ANIFLNNAGA STALGYDAIA EGQYSSAIGS KTHAIGGASM AFGVSAISEG DRSIALGASS YSFGQYSMAL GRYSKALGRL SIAMGDSSKA DGANAIALGN AAKAAGIMSI GLGDNANASQ DYAMALGAES EAAENATAIG NKAHAKGVNS IALGNGSQAL ADSAIAIGQG NKANGADAIA LGNGSQSSGL NAIALGKASV VTGDNSLALG SNTNANGINS VALGAGSIAD QDDSVSVGSD SLQRKIVNVK NGTIKADSHD AINGSQLYAI SDSVAKRLGG GSSVNVDDGT VKAPTYNLKN GNKNNVGDAL TVLDQFTLQW DQNRDKYSAA HGSSTASVIT DVADGAVSDS SKDAVNGSQL KATNDDVETN TTNIATNTGN IATNTANIAT NTTNITNLTD TVGDLKDDAL LWNGTAFNAA HGTETTSTIT NVKAGTLSDD STDAVNGSQL KDTNDNVATN TTNIASNTAN IATNTSNIAD NTANIATNTS NIADNTANIA TNTSNIAGNT ANIATNTTNI AANTTSINSL NTSVDALEQD AMLWNGTAFN AAHGTETTST ITNVKAGTLS DDSTDAVNGS QLKATNDNVA TNTTNIASNT ANIATNTANI NTLNTSIDTL EQDAILWNGT AYSAAHGTET ASTITNVKAG TLSENSTDAV NGAQLNATNA NVATNTTNIA TNTASINTLN TSIDALEQDA LLWDGTAFSA AHGANKDASK ITNVLAGTVS SASTDAINGS QLHGLSSSIA TYLGGGATVS DSGVFSGPTY NIDGNDYTNV GAALDAINTS LSDSLGDALL WDSTTGAFSA KHGSTASVIT NVADGAVSDS SSDAVNGSQL YDVSNSVVDV LGGGAGVNTD GSISAPTYTI ANTDYDNVGD ALNALDTTLD DAMLWDATAG ENGAFSASHD GSASKITNVA AGTISDTSTD AVNGAQLHGV SSSVAEALGG GAAVNSDGSI SAPTYTIADT DYTNVGDAMN AIDSTLDNAL LWDATAGENG AFNASHDGKA SVITNVANGQ INETSTDAVN GSQLNATNML IQNIAGDTSE SYITENGEGI NYVRTNDSGL VFEDASATGV GATAVGYNSV ASGDSSVAIG QNSSSTIESG IALGSSSVSN RVILQGSRDT SVTEDGVVIG YNTSDGELLG ALSIGDDGKY RQIINVADGS EAHDAVTVRQ LQNAIGAVAT TPSKYFHANS TEEDSLAVGE DSLAMGAKTI VNGDAGIGIG LNTLVLTDAI NGIAIGSNAS ANHANSIAMG SGSQTTRGAQ TDYTAYNMDA PQNSVGEFSV GSEDGQRQIT NVAAGSADTD AVNVGQLKVT DERVAQNTQS ITNLNNQVTN LDTRVTNIEN GIGDIVTTGS TKYFKINTDG VDANAQGKDS VAIGSGSIAA ADNSVALGTG SVADEENTIS VGSSTNQRRI TNVAAGKNAT DAVNVAQLKS SEAGGVRYDT KADGSIDYSN ITLGGGNGGT TRISNVSAGV NNNDAVNYAQ LKQSVQETKQ YTDQRMVEMD NKLSKTESKL SGGIASAMAM TGLPQAYTPG ASMASIGGGT YNGESAVALG VSMVSANGRW VYKLQGSTNS QGEYSAALGA GIQW
|
| |
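The sequences above are displayed in space-separated blocks of ten characters, so they need to be normalised before any analysis. The following is a minimal sketch, not the database's own tooling, of how the gene sequence could be cleaned and translated. It relies on the fact that translation table 11 assigns the same amino acids to codons as the standard code (the tables differ in permitted start codons). The helper names `clean`, `gc_fraction`, and `translate` are hypothetical, and only the first 30 nucleotides of the gene are used as sample input.

```python
# Codon table in NCBI order (bases T, C, A, G); amino-acid assignments
# are those shared by translation tables 1 and 11.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
)
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))


def clean(seq: str) -> str:
    """Remove the display spacing and line breaks, and upper-case the sequence."""
    return "".join(seq.split()).upper()


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in an already cleaned nucleotide sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def translate(seq: str) -> str:
    """Translate a cleaned CDS codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        residue = CODON_TABLE[seq[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


if __name__ == "__main__":
    # First three display blocks of the gene sequence in this record.
    prefix = clean("ATGAACAAAA TATTTAAAGT TATCTGGAAC")
    print(translate(prefix))  # MNKIFKVIWN
```

The translated prefix matches the first ten residues of the protein sequence listed above. Running `gc_fraction` on the full cleaned gene sequence gives a value that can be compared against the GC content reported in the Gene Information table.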