Gene ECH74115_5457 details

Gene Information

Locus tag: ECH74115_5457
Symbol: thiH
ID: 6972325
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Escherichia coli O157:H7 str. EC4115
Kingdom: Bacteria
Replicon accession: NC_011353
Strand:
Start bp: 5102673
End bp: 5103806
Gene Length: 1134 bp
Protein Length: 377 aa
Translation table: 11
GC content: 55%
IMG OID: 643389104
Product: thiamine biosynthesis protein ThiH
Protein accession: YP_002273505
Protein GI: 209396336
COG category: [H] Coenzyme transport and metabolism; [R] General function prediction only
COG ID: [COG1060] Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes
TIGRFAM ID: [TIGR02351] thiazole biosynthesis protein ThiH
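
The coordinate and length fields above are mutually consistent, which can be checked with a little arithmetic. The sketch below is not part of the original record. It assumes the start and end coordinates are 1-based and inclusive, and that the final codon is a stop codon that is not counted in the protein length. The variable names are illustrative only.

```python
# Values copied from the record above.
start_bp = 5102673
end_bp = 5103806

# Assumption: coordinates are 1-based and inclusive, so both ends count.
gene_length = end_bp - start_bp + 1

# Assumption: the last codon is a stop codon and is not part of the protein.
protein_length = gene_length // 3 - 1

print(gene_length, protein_length)
```

Under these assumptions the computed values agree with the listed gene length of 1134 bp and protein length of 377 aa.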


Plasmid Coverage information

Num covering plasmid clones: 22
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal

Fosmid Coverage information

Num covering fosmid clones: 43
Fosmid unclonability p-value: 0.0848103
Fosmid Hitchhiker: No
Fosmid clonability: normal

Sequence

Gene sequence
ATGAAAACCT TCAGCGATCG CTGGCGACAA CTGGACTGGG ACGACATCCG CCTGCGTATC 
AACGGCAAAA CGGCTGCTGA CGTAGAGCGG GCGCTAAATG CCTCGCAACT CACCCGCGAC
GATATGATGG CGCTGTTATC GCCAGCCGCC AGTGGCTATC TGGAACAACT GGCCCAACGG
GCGCAGCGTC TGACCCGTCA GCGATTTGGC AACACAGTTA GTTTCTACGT CCCGCTTTAT
CTTTCCAATC TTTGCGCTAA CGACTGCACG TACTGCGGAT TCTCTATGAG CAACCGCATC
AAGCGCAAAA CGCTGGATGA AGCGGATATT GCCAGGGAAA GCGCCGCTAT ACGGGAGATG
GGCTTTGAAC ATCTGCTTTT AGTCACCGGT GAACATCAGG CGAAAGTGGG GATGGATTAC
TTTCGTCGTC ATCTCCCTGC CCTGCGTGAA CAGTTCTCTT CACTACAGAT GGAAGTACAG
CCGCTGGCGG AGACGGAATA CGCCGAGTTA AAGCAACTTG GTCTGGATGG CGTGATGGTT
TATCAGGAGA CATATCACGA GGCGACTTAT GCCCGCCATC ATCTGAAAGG CAAAAAACAG
GACTTCTTCT GGCGGCTGGA AACGCCGGAT CGGCTGGGCC GTGCGGGGAT TGATAAGATA
GGCCTCGGCG CGCTAATTGG CCTTTCCGAC AGCTGGCGCG TTGACTGCTA TATGGTTGCA
GAACATTTGC TATGGCTGCA ACAGCATTAC TGGCAAAGCC GCTACTCTGT CTCCTTCCCA
CGCCTGCGTC CATGTACTGG CGGCATTGAG CCTGCGTCGA TTATGGATGA ACGCCAGTTA
GTGCAAACCA TCTGCGCCTT CCGGCTGCTT GCACCGGAGA TTGAACTGTC ACTCTCCACG
CGGGAATCAC CGTGGTTTCG CGATCGCGTT ATTCCGCTGG CAATTAATAA CGTCAGCGCT
TTTTCGAAAA CGCAGCCAGG TGGCTATGCC GACAATCACC CCGAGCTGGA ACAGTTCTCA
CCGCACGACG AGCGCAGACC GGAAGCGGTT GCTGCCGCGT TAACCGCTCA GGGTTTGCAG
CCGGTATGGA AAGACTGGGA CAGCTATCTG GGACGCGCCT CGCAAAGACT ATGA
 
Protein sequence
MKTFSDRWRQ LDWDDIRLRI NGKTAADVER ALNASQLTRD DMMALLSPAA SGYLEQLAQR 
AQRLTRQRFG NTVSFYVPLY LSNLCANDCT YCGFSMSNRI KRKTLDEADI ARESAAIREM
GFEHLLLVTG EHQAKVGMDY FRRHLPALRE QFSSLQMEVQ PLAETEYAEL KQLGLDGVMV
YQETYHEATY ARHHLKGKKQ DFFWRLETPD RLGRAGIDKI GLGALIGLSD SWRVDCYMVA
EHLLWLQQHY WQSRYSVSFP RLRPCTGGIE PASIMDERQL VQTICAFRLL APEIELSLST
RESPWFRDRV IPLAINNVSA FSKTQPGGYA DNHPELEQFS PHDERRPEAV AAALTAQGLQ
PVWKDWDSYL GRASQRL
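
The gene and protein sequences above are printed in blocks of ten characters. As a minimal sketch (not part of the original record), the Python below strips that display formatting, provides a GC content helper, and translates the first and last rows of the gene sequence. It uses the amino acid assignments of translation table 11, which are the same as those of the standard genetic code. The function names `clean`, `gc_percent`, and `translate` are hypothetical helpers introduced for illustration.

```python
from itertools import product

BASES = "TCAG"
# Amino acid assignments for translation table 11, with codons ordered
# TTT, TTC, TTA, TTG, TCT, ... (third base varies fastest).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def clean(seq: str) -> str:
    """Remove the spaces and line breaks used for display."""
    return "".join(seq.split()).upper()


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a non-empty sequence."""
    seq = clean(seq)
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def translate(seq: str) -> str:
    """Translate in reading frame 1, stopping at the first stop codon."""
    seq = clean(seq)
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First and last rows of the gene sequence, copied from the record above.
# The last row starts in frame because the 18 preceding rows hold
# 18 * 60 = 1080 bases, a multiple of 3.
first_row = "ATGAAAACCT TCAGCGATCG CTGGCGACAA CTGGACTGGG ACGACATCCG CCTGCGTATC"
last_row = "CCGGTATGGA AAGACTGGGA CAGCTATCTG GGACGCGCCT CGCAAAGACT ATGA"

print(translate(first_row))  # MKTFSDRWRQLDWDDIRLRI
print(translate(last_row))   # PVWKDWDSYLGRASQRL
```

The two translations match the start and end of the protein sequence listed above. Passing the full gene sequence to `gc_percent` gives a value that can be compared with the GC content field in the record.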