Gene Ccel_3104 details


Gene Information

Locus tag: Ccel_3104
Symbol: thiH
ID: 7311701
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Clostridium cellulolyticum H10
Kingdom: Bacteria
Replicon accession: NC_011898
Strand:
Start bp: 3640231
End bp: 3641643
Gene Length: 1413 bp
Protein Length: 470 aa
Translation table: 11
GC content: 41%
IMG OID: 643610008
Product: thiamine biosynthesis protein ThiH
Protein accession: YP_002507376
Protein GI: 220930467
COG category: [H] Coenzyme transport and metabolism; [R] General function prediction only
COG ID: [COG1060] Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes
TIGRFAM ID: [TIGR02351] thiazole biosynthesis protein ThiH
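
The coordinate and length fields in this record can be cross-checked with simple arithmetic. The sketch below assumes the start and end positions are 1-based and inclusive, and that the gene length includes a single terminal stop codon; neither is stated in the record, but both are consistent with the listed numbers. The function names are illustrative only.

```python
# Values copied from the Gene Information fields of this record.
START_BP = 3640231
END_BP = 3641643
GENE_LENGTH_BP = 1413
PROTEIN_LENGTH_AA = 470


def span_length(start: int, end: int) -> int:
    """Length of a closed interval (assumed 1-based, inclusive coordinates)."""
    return end - start + 1


def expected_protein_length(gene_length_bp: int) -> int:
    """Amino-acid count for a CDS, assuming one trailing stop codon."""
    if gene_length_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_length_bp // 3 - 1


if __name__ == "__main__":
    print(span_length(START_BP, END_BP) == GENE_LENGTH_BP)
    print(expected_protein_length(GENE_LENGTH_BP) == PROTEIN_LENGTH_AA)
```

Under these assumptions both checks agree with the record: 3641643 - 3640231 + 1 = 1413 bp, and 1413 / 3 - 1 = 470 aa.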


Plasmid Coverage information

Num covering plasmid clones: 19
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: n/a
Fosmid unclonability p-value: n/a
Fosmid Hitchhiker: n/a
Fosmid clonability: n/a
 

Sequence

Gene sequence
ATGTATAACA GTAAATCAAA AAAGGCTGAA GACTTTATTA ACGACGAGGA GATATTAGAA 
ACGTTGGAGT ATGCCCGTAG GAACAAGGAG AATATGTCAC TGATAGAAGA TATTCTTAAA
AAGGCAGCTG AGTACAAAGG ATTAAGCTAT AGGGAAGCAG CAGTATTATT GGAATGTGAG
CTTGACGAGG TTAAAGAAAA AGTGTTCGGT CTTGCAGAGC ATATTAAAAA GAAATTCTAT
GGAAACAGGA TAGTAATGTT TGCACCTCTT TATCTTTCGA ACTACTGTGT AAATGAGTGC
AGATACTGCC CTTACCATGG TTCCAACAAG CATATTTCAA GAAAACAGCT GTCACAGGAG
GATATAGTCA GGGAGGTTAT GGCCTTGCAG GATATGGGTC ACAAACGACT TGCCCTTGAA
ACGGGAGAGG ACCCGGAGAA CTGTCCTATA GAATATGTAT TGGAAAGTAT AAAAACAATT
TACGGAATAA AGCATAAGAA CGGTGCAATC CGCCGTGTCA ATGTGAATAT TGCCGCTACA
ACCATTGAAA ATTACAAAAA ACTCAAAGAT GCGGGAATAG GAACATATAT ACTGTTTCAG
GAAACTTACC ATAAGCCCAC ATACGAGTAC CTGCACCCCA AGGGTCCTAA GCACAATTAT
GCTTATCATA CTGAAGCCAT GGACAGAGCA ATGGAGGGTG GAATTGATGA TGTAGGACTT
GGTGTCCTCT TTGGTCTGAA CCTTTACAAG TATGATTTTG TAGGTCTGCT CATGCACGCA
AAGCATTTGG AGGATGCAAT GGGAGTTGGG CCTCATACAA TCAGTGTACC ACGTATAAGG
CCGGCTGATG ATGTGGATTT GAAAGAATAT TCAAATGCAA TACCTGACTC TATATTTGAA
AAAATTGTAG CTATACTTCG TATAGCGGTA CCATACACAG GTATAATCAT GTCCACAAGA
GAATCAGAAA AGACCCGTGG GGAATGTCTC AAATTGGGTG TTTCTCAAAT TAGCGGAGGA
TCATCAACAA GTGTGGGCGG TTATGTAGAA AAAGAAGCAG AGAATTCTGC ACAGTTCGAA
GTTAACGATA CAAGAACCAT GGACGAAGTA GTTAACTGGC TCCTGACATT GGGGTATATT
CCAAGCTTCT GTACAGCATG CTACCGGGAA GGTCGAACAG GGGACAGATT TATGAGACTT
GTTAAAAGCG GTGCAATTGC ACAGGTTTGT CATCCCAATG CAATTATGAC ATTAAAGGAA
TATCTGGAAG ACTATGCATC GGAAGATACA AGAGCAAAAG GTGAGAAAAT GATAGAAAAA
GAAGTGGAGC TACTGCAAAA CAGCGATGTT AAAAGAATCG TTAAAGAACA TTTAAGTGAC
CTCCATGAGG GTAAGAGGGA TTTCAGGTTC TAA
 
Protein sequence
MYNSKSKKAE DFINDEEILE TLEYARRNKE NMSLIEDILK KAAEYKGLSY REAAVLLECE 
LDEVKEKVFG LAEHIKKKFY GNRIVMFAPL YLSNYCVNEC RYCPYHGSNK HISRKQLSQE
DIVREVMALQ DMGHKRLALE TGEDPENCPI EYVLESIKTI YGIKHKNGAI RRVNVNIAAT
TIENYKKLKD AGIGTYILFQ ETYHKPTYEY LHPKGPKHNY AYHTEAMDRA MEGGIDDVGL
GVLFGLNLYK YDFVGLLMHA KHLEDAMGVG PHTISVPRIR PADDVDLKEY SNAIPDSIFE
KIVAILRIAV PYTGIIMSTR ESEKTRGECL KLGVSQISGG SSTSVGGYVE KEAENSAQFE
VNDTRTMDEV VNWLLTLGYI PSFCTACYRE GRTGDRFMRL VKSGAIAQVC HPNAIMTLKE
YLEDYASEDT RAKGEKMIEK EVELLQNSDV KRIVKEHLSD LHEGKRDFRF
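
The relationship between the two sequences above can be illustrated by translating part of the gene sequence and comparing it with the protein sequence. This is a minimal sketch, not the method used by the database: it translates only the first and last lines of the gene sequence as printed, and it builds the codon table from the compact amino-acid string that NCBI lists for translation table 11 (whose sense-codon assignments match the standard code). The names `CODON_TABLE` and `translate` are illustrative.

```python
from itertools import product

# Codon table built from the NCBI compact notation (base order T, C, A, G).
# Translation table 11 uses the same codon-to-amino-acid assignments as the
# standard code; it differs in which codons may act as start codons.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string, stopping at the first stop codon.

    Whitespace (such as the 10-base grouping used in this record) is ignored.
    """
    seq = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":  # stop codon: end of the coding sequence
            break
        protein.append(aa)
    return "".join(protein)


# First and last lines of the gene sequence, copied from the record.
FIRST_LINE = "ATGTATAACA GTAAATCAAA AAAGGCTGAA GACTTTATTA ACGACGAGGA GATATTAGAA"
LAST_LINE = "CTCCATGAGG GTAAGAGGGA TTTCAGGTTC TAA"

if __name__ == "__main__":
    print(translate(FIRST_LINE))  # MYNSKSKKAEDFINDEEILE
    print(translate(LAST_LINE))   # LHEGKRDFRF
```

Each full line of the gene sequence holds 60 bases, a multiple of three, so every line starts in frame; the two outputs match the first 20 and last 10 residues of the protein sequence.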