Gene Moth_1734 details

Gene Information

Locus tag: Moth_1734
Symbol: thiH
ID: 3833034
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Moorella thermoacetica ATCC 39073
Kingdom: Bacteria
Replicon accession: NC_007644
Strand:
Start bp: 1784737
End bp: 1786140
Gene Length: 1404 bp
Protein Length: 467 aa
Translation table: 11
GC content: 61%
IMG OID: 637829658
Product: thiamine biosynthesis protein ThiH
Protein accession: YP_430578
Protein GI: 83590569
COG category: [H] Coenzyme transport and metabolism; [R] General function prediction only
COG ID: [COG1060] Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes
TIGRFAM ID: [TIGR02351] thiazole biosynthesis protein ThiH
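
The coordinate and length fields above are mutually consistent, which can be checked with a short sketch. It assumes the start and end positions are 1-based and inclusive, and that the protein length excludes the terminal stop codon; the variable names are illustrative and not part of the record.

```python
# Values copied from the record; variable names are illustrative.
start_bp = 1784737
end_bp = 1786140

# Assumption: coordinates are 1-based and inclusive, so both endpoints count.
gene_length = end_bp - start_bp + 1

# One codon (3 bp) per residue; the final stop codon is not translated.
protein_length = gene_length // 3 - 1

print(gene_length, protein_length)  # 1404 467
```

Both results match the Gene Length (1404 bp) and Protein Length (467 aa) fields listed in the record.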


Plasmid Coverage information

Num covering plasmid clones: 23
Plasmid unclonability p-value: 0.908145
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 17
Fosmid unclonability p-value: 0.683505
Fosmid hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGCAACACG GTTATCGCGC CGATTTTATC AATCATGAAG AGATAGAAGG CTACCTGGAA 
GAAGCTAAAC GGGCAACGAG GGATGTGGCT GTCAGGATTA TCGAAAAGGC GCGGGAAGCG
AAGGGGCTGG AGCCCTACGA GGTGGCGGTT TTGCTCCAGA ACGACGATGC GGACGTACGC
CGGCGGATGT TTACCGCCGC CCGGGAGATA AAGGAAAAGA TCTACGGCCA GCGGATAGTT
CTTTTTGCAC CTCTGTACTT CAGCGACTAC TGCATTAACA ACTGCCGCTA CTGCGGCTAC
CGGCGGGAAA ATAAGTTCGA ACGCCGCCGC CTGGAGCCGG AGGAACTGGA ACGGGAGGTG
CGCATCCTGG AATCCCTGGG GCATAAGCGC CTGGCCCTGG AGGCCGGGGA GGATCCCGTC
CATTGTCCCC TTGAATATAC CCTGGATGTT ATTAACCGCA TTTACCGCAT CACCGAAGCC
AACGGCAGCA TCCGGCGGGT AAACGTCAAC ATCGCGGCGA CGACGGTGGA TGCCTACAGG
CAGTTAAAGG CCGCCGGCAT CGGCACCTAC GTCCTCTTCC AGGAGACCTA CCACCGGCCT
ACTTATGCCT ACATGCACCC CGGCGGCCCC AAGGCGGACT ACGACTGGCA CACCACGGCC
ATGGACCGGG CCATGGAGGG CGGCATCGAC GACGTCGGCC TGGGGGTCCT CTTCGGCCTC
TACGATTATA AATTCGAAGT CATGGGCCTG CTCTACCATG CCCGGCACTT GGAGGAGACC
TTCGGCGTCG GCCCCCATAC CATCTCCGTA CCGCGCCTGC GGCCGGCCTA CAACATTACC
CTGGAAAAAT TCCCTTACCT GGTTGACGAC GAAGATTTTA AGAAACTGGT GGCCATCATC
CGCCTGGCCG TGCCCTATAC CGGCATGATC ATCTCCACCC GGGAGACGGC GGAGCTCAGG
GCGGAACTCC TGGAGTTGGG CGTTTCCCAG ATCAGCGCCG GCTCCTGTAC GGGGGTAGGG
GGCTATGGCC GTCACTATGC CGATCAGGAA GACGATATCC CCCAGTTTGA AATCGGCGAC
CACCGCCACC CCGATGAGGT TATCGGCGAC CTCTGCCGGC GGGGGTATCT CCCCAGCTAC
TGCACAGCCT GCTACCGCCG CGGCCGCACC GGCGACCGCT TCATGTCCCT GGCCAAAACC
GGGGAGATCC AGCACTGCTG CCAGCCTAAC GCCATCCTCA CCTTTAAGGA ATACTTGCTG
GATTATGCCC GCCCGGCTAC CAGGGAAGTA GGAGAGACAA CCATCAGGGA GCACCTGGCC
CGGATCCCCA GCCCGGCCAT CCGGGCCGAA ACGGAACGCC GCCTGGAGCG CATCGCCGCC
GGCGAGCGGG ATTTGTATTT CTAG
 
Protein sequence
MQHGYRADFI NHEEIEGYLE EAKRATRDVA VRIIEKAREA KGLEPYEVAV LLQNDDADVR 
RRMFTAAREI KEKIYGQRIV LFAPLYFSDY CINNCRYCGY RRENKFERRR LEPEELEREV
RILESLGHKR LALEAGEDPV HCPLEYTLDV INRIYRITEA NGSIRRVNVN IAATTVDAYR
QLKAAGIGTY VLFQETYHRP TYAYMHPGGP KADYDWHTTA MDRAMEGGID DVGLGVLFGL
YDYKFEVMGL LYHARHLEET FGVGPHTISV PRLRPAYNIT LEKFPYLVDD EDFKKLVAII
RLAVPYTGMI ISTRETAELR AELLELGVSQ ISAGSCTGVG GYGRHYADQE DDIPQFEIGD
HRHPDEVIGD LCRRGYLPSY CTACYRRGRT GDRFMSLAKT GEIQHCCQPN AILTFKEYLL
DYARPATREV GETTIREHLA RIPSPAIRAE TERRLERIAA GERDLYF