Gene Cthe_0654 details


Gene Information

Locus tag: Cthe_0654
Symbol: thiH
ID: 4808184
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Clostridium thermocellum ATCC 27405
Kingdom: Bacteria
Replicon accession: NC_009012
Strand:
Start bp: 806431
End bp: 807807
Gene Length: 1377 bp
Protein Length: 458 aa
Translation table: 11
GC content: 40%
IMG OID: 640106069
Product: thiamine biosynthesis protein ThiH
Protein accession: YP_001037082
Protein GI: 125973172
COG category: [H] Coenzyme transport and metabolism; [R] General function prediction only
COG ID: [COG1060] Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes
TIGRFAM ID: [TIGR02351] thiazole biosynthesis protein ThiH
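
The coordinate and length fields above can be cross-checked with a little arithmetic. The following is a minimal Python sketch, assuming that Start bp and End bp are 1-based inclusive positions and that the CDS ends in a single stop codon that is not counted in the protein length. The variable names are illustrative and are not part of the record.

```python
# Values copied from the Gene Information fields above.
start_bp = 806431
end_bp = 807807
gene_length_bp = 1377
protein_length_aa = 458

# Assumption: coordinates are 1-based and inclusive, so the span is end - start + 1.
span_bp = end_bp - start_bp + 1

# Assumption: the final codon is a stop codon and does not contribute a residue.
codon_count = gene_length_bp // 3
expected_protein_aa = codon_count - 1

print("span matches gene length:", span_bp == gene_length_bp)
print("gene length is a whole number of codons:", gene_length_bp % 3 == 0)
print("protein length matches:", expected_protein_aa == protein_length_aa)
```

Under these assumptions all three checks hold for this record: 807807 - 806431 + 1 = 1377 bp, and 1377 / 3 - 1 = 458 aa.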


Plasmid Coverage information

Num covering plasmid clones: 24
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: n/a
Fosmid unclonability p-value: n/a
Fosmid Hitchhiker: n/a
Fosmid clonability: n/a
 

Sequence

Gene sequence
ATGGTTGAAA AAGTTGATTT TATAAAAGAA GATTTGATTT TTTCTCTTCT TGAAAAAGGT 
AAAATTACTG ACAGGAATGA AATAAGAGAA ATATTGGCAA AAGCCAGGGA GTGCAAGGGC
ATAAGCCTGG GAGAAGTTGC AAAGCTGCTT TACCTGGAAG ACGAAGAGCT GTTGGAAGAA
CTCTATGATG TGGCCAAATA TATAAAAAAC AAAATATACG GAAAAAGAGT GGTTTTGTTT
GCTCCTTTAT ATACAAGCAA TGAGTGTACA AACAACTGCC TTTACTGCGG TTTCAGGCAT
GACAACAAAG AGCTTCACAG AAAGACTTTA AGTCTTGAAG AAATAGTGGA AGAAGCAAAA
GCTATTGAAA GACAGGGACA TAAAAGATTG CTCCTTATTT GCGGAGAAGA TCCCAGAAAG
ACTAATGTAA AGCATTTTAC CGATGCAATG GAGGCAATAT ATAAATCCAC CGATATCAGA
AGAATCAATG TCGAGGCCGC GCCGATGACG GTGGAAGATT ACAGGGAGCT GAAGAAAGCC
GGTATAGGAA CTTATGTGAT ATTCCAGGAG ACATATCACA GGGAAACCTA TAGAATAATG
CATCCTGTGG GCAAAAAAGC GAATTATGAC TGGCGTATAA CGGCAATAGA CAGAGCCTTT
GAAGGCGGTA TTGACGATGT GGGTGTCGGA GCGCTCTTCG GACTTTATGA TTATAGATTC
GAAGTTTTAG GCCTTTTGAT GCATTGTATG CACTTTGAAG AAAAATATGG TGTAGGGCCG
CACACCATAT CTGTTCCGAG ACTTCGTCCT GCCTTGGGAG CTCCTTTGAA AGAGATTCCG
TACAAAGTTA CCGACAAGGA TTTCAAGAAA ATTGTGGCCA TATTCCGAAT TGCTGTTCCA
TATACGGGAA TTATTCTTTC AACAAGAGAA AGAGCTGAAT TCAGGGATGA ACTTTTAAGT
GTTGGAGTAT CTCAGATAAG TGCAGGTTCC AAAACCAATC CCGGAGGATA CCAGGAGGAT
GACGACCATG CGGATCAGTT TGAAATAAGT GACAACAGGA GTTTGCCGAA AGTAATGGAA
ACAATATGTC AACAAGGTTA TATTCCAAGC TTTTGCACCG CTTGCTACAG AAGATGTCGT
ACCGGGGAGC ATTTCATGGA ATATGCAAAG GCCGGGGATA TACATGAATT CTGTCAGCCG
AATGCCATTC TAACTTTCAA GGAAAACTTA ATGGATTATG CCGATGAGCC TTTGAGAAAA
ATGGGTGAGG AAGTAATACT TAAGGCTCTG GAAGAAATTG AAGATGAAAA AATGAAGACT
CTTACTATTG CAAAACTTGA GGAAATTGAA AAAGGAAAGA GAGATATTTA TTTTTAA
 
Protein sequence
MVEKVDFIKE DLIFSLLEKG KITDRNEIRE ILAKARECKG ISLGEVAKLL YLEDEELLEE 
LYDVAKYIKN KIYGKRVVLF APLYTSNECT NNCLYCGFRH DNKELHRKTL SLEEIVEEAK
AIERQGHKRL LLICGEDPRK TNVKHFTDAM EAIYKSTDIR RINVEAAPMT VEDYRELKKA
GIGTYVIFQE TYHRETYRIM HPVGKKANYD WRITAIDRAF EGGIDDVGVG ALFGLYDYRF
EVLGLLMHCM HFEEKYGVGP HTISVPRLRP ALGAPLKEIP YKVTDKDFKK IVAIFRIAVP
YTGIILSTRE RAEFRDELLS VGVSQISAGS KTNPGGYQED DDHADQFEIS DNRSLPKVME
TICQQGYIPS FCTACYRRCR TGEHFMEYAK AGDIHEFCQP NAILTFKENL MDYADEPLRK
MGEEVILKAL EEIEDEKMKT LTIAKLEEIE KGKRDIYF
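
The relationship between the gene sequence and the protein sequence can be illustrated with a short translation sketch in Python. It assumes the amino-acid assignments of translation table 11, which are the same as those of the standard genetic code (table 11 differs in its permitted start codons, which does not matter here because this gene starts with ATG). The `translate` helper and the `CODON_TABLE` name are hypothetical, written for this example only.

```python
from itertools import product

# Codons enumerated in T, C, A, G order, paired with the amino-acid string
# of the standard code (the same assignments are used by translation table 11).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA string codon by codon, stopping at the first stop codon.

    Whitespace is removed first, so the 10-base blocks used in the
    listing above can be pasted in directly.
    """
    seq = "".join(dna.split()).upper()
    residues = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":  # stop codon: end of the protein
            break
        residues.append(aa)
    return "".join(residues)


# First line of the gene sequence above (60 bases, 20 codons).
first_line = "ATGGTTGAAA AAGTTGATTT TATAAAAGAA GATTTGATTT TTTCTCTTCT TGAAAAAGGT"
print(translate(first_line))  # MVEKVDFIKEDLIFSLLEKG
```

The printed residues match the first 20 residues of the protein sequence listed above (MVEKVDFIKE DLIFSLLEKG).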