Gene Moth_1401 details


Gene Information

Locus tag: Moth_1401
Symbol:
ID: 3831688
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Moorella thermoacetica ATCC 39073
Kingdom: Bacteria
Replicon accession: NC_007644
Strand:
Start bp: 1448822
End bp: 1450120
Gene Length: 1299 bp
Protein Length: 432 aa
Translation table: 11
GC content: 63%
IMG OID: 637829337
Product: thiamine biosynthesis protein ThiC
Protein accession: YP_430257
Protein GI: 83590248
COG category: [H] Coenzyme transport and metabolism
COG ID: [COG0422] Thiamine biosynthesis protein ThiC
TIGRFAM ID: [TIGR00190] thiamine biosynthesis protein ThiC
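
The start, end, and length values above can be cross-checked with simple arithmetic. The sketch below assumes the coordinates are 1-based and inclusive, and that the CDS length includes the terminal stop codon; neither convention is stated in the record, but both are consistent with the listed values. The variable names are illustrative only.

```python
# Values copied from the Gene Information table.
start_bp = 1448822
end_bp = 1450120

# Assumption: coordinates are 1-based and inclusive, so the span is end - start + 1.
gene_length = end_bp - start_bp + 1

# Assumption: the CDS includes the stop codon, which does not code for a residue.
codon_count = gene_length // 3
protein_length = codon_count - 1

print(gene_length)     # 1299, matching "Gene Length"
print(protein_length)  # 432, matching "Protein Length"
```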


Plasmid Coverage information

Num covering plasmid clones: 39
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 27
Fosmid unclonability p-value:
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGACGCAAC TGGAAGCAGC CCAGGTGGGC CAGGTGACGC GGGCGATGGA ACAGGTGGCG 
GCCCGGGAGA AGGTGCGTGT AGAGGACCTA ATGGCAGAGG TAGCCGCCGG CCGGGTGGTG
ATACCGGTCA ATAAGAATCA CCATAAACTC CAGCCATGCG GTATTGGCAG GGGGTTAAGG
ACCAAGGTTA ACGCCAACCT GGGTACCTCC ACGGACTACC CGGATATCGC CGCCGAGTTG
GAGAAGCTCC AGGTGGCCCT GGACGCCGGG GCTGATGCCG TCATGGATCT AAGCACCGGC
GGTGACATTA ACGAATGCCG GCGCCAGGTC ATTGCCCGCT CGCCGGCGAC CGTCGGGACT
GTGCCCATTT ACCAGGCTAC GGTGGAGGCC CAGGAGAAAT ACGGCGCTCT GGTAAAAATG
ACCGTTGACG ACCTCTTCCG GGTTATCGAA ATGCAGGCTG AAGACGGTGT TGATTTTATT
ACCGTTCACT GCGGTGTCAC CATGGAGGTA GTTGAGCGCC TGCGCCGCGA GGGCCGCCTG
GCGGATATCG TCAGTCGGGG CGGATCTTTC CTGACAGGCT GGATGCTCCA TAATGAACAG
GAGAATCCCC TCTACGCCCA TTACGACCGC CTGCTGGAGA TCGCCCGGCG CTATGATGTC
ACCTTAAGCC TGGGCGACGG CCTGCGGCCG GGTTGCCTGG CTGACGCCAC CGACCGGGCC
CAGATCCAGG AGTTGATTAT CCTGGGAGAG CTGGTGGATC GCGCCCGGGA AGCCGGTGTC
CAGGCCATGG TGGAAGGACC CGGACACGTA CCCTTAAACC AGATCCAGGC TAATATCCTC
CTGGAGAAAC GCCTTTGCCA CGAAGCGCCC TTCTACGTCC TGGGACCCCT GGTCACTGAC
GTCGCGCCGG GATACGATCA CCTTACTGCC GCCATCGGCG GCGCCCTGGC GGCTGCTGCC
GGGGCCGATT TTATCTGCTA CGTTACCCCG GCCGAACATC TGGGCCTGCC CACCCTGGCC
GATGTGCGGG AAGGAGTGAT CGCCGCCCGC ATTGCCGGCC ATGCCGCCGA CCTGGCCAAA
GGCCTTCCCG GGGCCTGGGA ATGGGACCGG GAGATGGCCC GCGCCCGCAA GGCCCTGGAC
TGGCAGCGCC AAATAGAGCT GGCCCTGGAC CCGGAAAAGG CCAGGCAGTA CCGCCGGGCC
CGCAACGACG AGGGGGCCGT TGCCTGCTCT ATGTGCGGTG ACTTCTGCGC CATGCGCCTC
GTCGGAGAGT ACCTGGGGAA ACCGTCAGAA ACGTGTTAA
 
Protein sequence
MTQLEAAQVG QVTRAMEQVA AREKVRVEDL MAEVAAGRVV IPVNKNHHKL QPCGIGRGLR 
TKVNANLGTS TDYPDIAAEL EKLQVALDAG ADAVMDLSTG GDINECRRQV IARSPATVGT
VPIYQATVEA QEKYGALVKM TVDDLFRVIE MQAEDGVDFI TVHCGVTMEV VERLRREGRL
ADIVSRGGSF LTGWMLHNEQ ENPLYAHYDR LLEIARRYDV TLSLGDGLRP GCLADATDRA
QIQELIILGE LVDRAREAGV QAMVEGPGHV PLNQIQANIL LEKRLCHEAP FYVLGPLVTD
VAPGYDHLTA AIGGALAAAA GADFICYVTP AEHLGLPTLA DVREGVIAAR IAGHAADLAK
GLPGAWEWDR EMARARKALD WQRQIELALD PEKARQYRRA RNDEGAVACS MCGDFCAMRL
VGEYLGKPSE TC
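
The sequences above are printed in space-separated blocks of 10, so they need light cleanup before use. The following is a minimal sketch, not part of the original record, of stripping the whitespace, translating with the amino-acid assignments of translation table 11, and computing a GC fraction. The helper names (`clean`, `translate`, `gc_fraction`) are hypothetical, and only the first line of the gene sequence is used as input.

```python
from itertools import product

# Codon-to-residue assignments for NCBI translation table 11, listed in
# TCAG order with the third base varying fastest. The residue assignments
# are the same as in the standard code; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def clean(block: str) -> str:
    """Remove the spaces and line breaks used to group the sequence."""
    return "".join(block.split()).upper()


def translate(dna: str) -> str:
    """Translate codon by codon, stopping at the first stop codon."""
    residues = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        residues.append(aa)
    return "".join(residues)


def gc_fraction(dna: str) -> float:
    """Fraction of bases that are G or C."""
    return (dna.count("G") + dna.count("C")) / len(dna)


# First line of the gene sequence, copied as displayed.
first_line = "ATGACGCAAC TGGAAGCAGC CCAGGTGGGC CAGGTGACGC GGGCGATGGA ACAGGTGGCG"
dna = clean(first_line)
peptide = translate(dna)
print(peptide)  # MTQLEAAQVGQVTRAMEQVA, the first 20 residues of the protein
```

Passing the full gene sequence through `clean` and `gc_fraction` would give a value to compare against the GC content listed in the Gene Information table.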