Gene Cthe_0599 details

Gene Information

Locus tag: Cthe_0599
Symbol:
ID: 4808201
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Clostridium thermocellum ATCC 27405
Kingdom: Bacteria
Replicon accession: NC_009012
Strand:
Start bp: 735074
End bp: 736183
Gene Length: 1110 bp
Protein Length: 369 aa
Translation table: 11
GC content: 42%
IMG OID: 640106013
Product: thiazole biosynthesis protein ThiH
Protein accession: YP_001037027
Protein GI: 125973117
COG category: [H] Coenzyme transport and metabolism; [R] General function prediction only
COG ID: [COG1060] Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes
TIGRFAM ID: [TIGR02351] thiazole biosynthesis protein ThiH
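
The start, end, gene length, and protein length fields in this record agree with one another, which a little arithmetic can confirm. The following is a minimal sketch, assuming the coordinates are 1-based and inclusive and that the gene length counts the stop codon while the protein length does not; neither convention is stated in the record, and the function names are illustrative only.

```python
def gene_length(start_bp: int, end_bp: int) -> int:
    """Length of a feature, assuming 1-based inclusive coordinates."""
    return end_bp - start_bp + 1


def protein_length(gene_len_bp: int) -> int:
    """Residue count, assuming the final codon is a stop codon."""
    if gene_len_bp % 3 != 0:
        raise ValueError("gene length is not a whole number of codons")
    return gene_len_bp // 3 - 1


length = gene_length(735074, 736183)
print(length)                  # 1110
print(protein_length(length))  # 369
```

Both values match the Gene Length (1110 bp) and Protein Length (369 aa) fields above.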


Plasmid Coverage information

Num covering plasmid clones: 32
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: n/a
Fosmid unclonability p-value: n/a
Fosmid Hitchhiker: n/a
Fosmid clonability: n/a
 

Sequence

Gene sequence
ATGAGCTTTT ATGAAAGATA TCTGGAGTAC AAAAATTTTG ATTTTGAAAA TTTCTTCGAC 
CAAGTAACGG ACAGGGATAT TATTAATATA ATAAACAAAG ACCGGCGGCT TTCGGAACTG
GAATTCCTCA TGCTTCTTTC AAAAAAAGCT GTAAAATACC TTGAACCCTT GGCTCAAAAG
GCAAACAGGA TTACGGTGCA GAATTTCGGA AAGGTCATAT TCCTGTACAC GCCGATGTAC
CTTGCAAATT ACTGCGTAAA TCAATGTATT TACTGTGGTT TTAACATAAC CAACAATATA
AAGCGAAGAA AACTTACTTT GGATGAGGTT GAAAAAGAAG CTTATGCAAT TTCATCCACG
GGTCTTAGGC ATATTCTGAT TTTAACGGGA GAGTCCCGAA AGGAAAGTCC TGTTCAATAC
ATAAAGGACT GCGTTAAAAT TCTTCAAAAG TATTTCAGGT CCATATCAAT AGAGGTTTAC
CCTCTTGAGG AGAACGAGTA CGCCGAACTG ATAGAGGCGG GGGTTGACGG TCTTACCATC
TATCAGGAGG TATATGATGA AGAAAAATAC AAGGCTCTTC ACCTGAAAGG TCCCAAAAGA
AACTACTTAT ACAGGCTTGA TGCTCCTGAA AGGGCATGCA AGGCATCAAT GAGGAATGTA
AACATAGGTG CCCTGCTGGG ACTTCATGAC TGGCGGACGG AGGCTTTTTA CACGGGACTT
CACGCTGATT ACCTGCAAAA CAAGTATCCG GATGTGGAAA TTGGTTTGTC CCTTCCAAGA
ATAAGGCCCC ATCCCTGTGG AAGTTTTGTA CCTGATTGCA AAGTGGAAGA CAGGGATCTG
GTACAGATAA TGATAGCCTA CAGATTGTTT ATGCCAAGAG CCGGGATAGC AATTTCTACA
AGAGAAAGAG AGAGCCTTAG AAATAATCTT ATTGGTCTGG GAGTTACCAA AATGTCTGCC
GCGTCAAGTA CAGAGGTCGG AGGTCACACC CTTGGCGATA AAAGTGACGG ACAGTTTGAT
GTAAATGACA GGCGCGGTGT TGAAGAGATG AGACAAATGA TATACAGCAA AGGTTATCAG
CCGGTGTTTA AAGACTGGCA GGCAATATAA
 
Protein sequence
MSFYERYLEY KNFDFENFFD QVTDRDIINI INKDRRLSEL EFLMLLSKKA VKYLEPLAQK 
ANRITVQNFG KVIFLYTPMY LANYCVNQCI YCGFNITNNI KRRKLTLDEV EKEAYAISST
GLRHILILTG ESRKESPVQY IKDCVKILQK YFRSISIEVY PLEENEYAEL IEAGVDGLTI
YQEVYDEEKY KALHLKGPKR NYLYRLDAPE RACKASMRNV NIGALLGLHD WRTEAFYTGL
HADYLQNKYP DVEIGLSLPR IRPHPCGSFV PDCKVEDRDL VQIMIAYRLF MPRAGIAIST
RERESLRNNL IGLGVTKMSA ASSTEVGGHT LGDKSDGQFD VNDRRGVEEM RQMIYSKGYQ
PVFKDWQAI
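
The gene and protein sequences above are printed in blocks of ten characters, so the whitespace has to be stripped before any analysis. The following is a minimal sketch of how the record could be cross-checked: it removes the block spacing, computes the GC fraction, and translates the reading frame. It assumes the standard codon-to-amino-acid assignments (translation table 11 differs from the standard table only in its permitted start codons, and this gene starts with ATG) and that translation stops at the first stop codon. The helper names are hypothetical and not part of any database tooling.

```python
BASES = "TCAG"
# Amino acids for the 64 codons, ordered by first, second, third base over TCAG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def clean(seq: str) -> str:
    """Remove the block spacing and line breaks, and upper-case the sequence."""
    return "".join(seq.split()).upper()


def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C."""
    seq = clean(seq)
    return (seq.count("G") + seq.count("C")) / len(seq)


def translate(seq: str) -> str:
    """Translate in frame 1, stopping at the first stop codon."""
    seq = clean(seq)
    protein = []
    for i in range(0, len(seq) - 2, 3):
        residue = CODON_TABLE[seq[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# First line of the gene sequence as printed in this record.
first_line = "ATGAGCTTTT ATGAAAGATA TCTGGAGTAC AAAAATTTTG ATTTTGAAAA TTTCTTCGAC"
print(translate(first_line))  # MSFYERYLEYKNFDFENFFD
```

The translated fragment matches the first twenty residues of the protein sequence. Passing the full gene sequence to `gc_fraction` and `translate` would allow the same comparison against the GC content and Protein Length fields.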