Gene Moth_1722 details

Gene Information

Locus tag: Moth_1722
Symbol:
ID: 3833022
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Moorella thermoacetica ATCC 39073
Kingdom: Bacteria
Replicon accession: NC_007644
Strand:
Start bp: 1765892
End bp: 1767172
Gene Length: 1281 bp
Protein Length: 426 aa
Translation table: 11
GC content: 59%
IMG OID: 637829647
Product: thiamine biosynthesis protein ThiC
Protein accession: YP_430567
Protein GI: 83590558
COG category: [H] Coenzyme transport and metabolism
COG ID: [COG0422] Thiamine biosynthesis protein ThiC
TIGRFAM ID: [TIGR00190] thiamine biosynthesis protein ThiC
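
The coordinate and length fields above can be cross-checked with simple arithmetic. The sketch below is illustrative only: it assumes the start and end positions are 1-based and inclusive, and that the CDS ends in a single stop codon that is not counted in the protein length. The function names are hypothetical and are not part of any database API.

```python
# Values copied from the Gene Information fields above.
START_BP = 1765892
END_BP = 1767172
GENE_LENGTH_BP = 1281
PROTEIN_LENGTH_AA = 426


def span_length(start: int, end: int) -> int:
    """Length of a feature, assuming 1-based inclusive coordinates."""
    return end - start + 1


def coding_codons(gene_length_bp: int) -> int:
    """Number of amino-acid codons, assuming one trailing stop codon."""
    if gene_length_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_length_bp // 3 - 1


if __name__ == "__main__":
    # Both comparisons hold for this record under the stated assumptions.
    print(span_length(START_BP, END_BP) == GENE_LENGTH_BP)
    print(coding_codons(GENE_LENGTH_BP) == PROTEIN_LENGTH_AA)
```

Under these assumptions the span 1765892 to 1767172 gives 1281 bp, and 1281 bp corresponds to 427 codons, that is, 426 residues plus the stop codon.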


Plasmid Coverage information

Num covering plasmid clones: 13
Plasmid unclonability p-value: 0.00870726
Plasmid hitchhiking: Yes
Plasmid clonability: hitchhiker
 

Fosmid Coverage information

Num covering fosmid clones: 16
Fosmid unclonability p-value: 0.564232
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGAGCCAGG TTTTAGATGC CCGTGCTGGA AAAATTACTC CGGAAATGGA AAAAGTGGCG 
GCAGATGAAA AGGTGGATGT GGAATTTGTC CGGGCAGGGG TGGCGGAGGG GACTATTGTA
ATCCCCCGGA ATACCAACCG GAAGGTCCTT AAACCCTGCG GTATCGGGAG GGGTTTACGT
ATCAAGGTGA ATGCCCTGAT CGGGACCTCC AGCGACCGGG ATGACCGGCA AATGGAGATG
CGGAAGATCG CTGCGGCCGA GGCGGCAGGG TGTGATTCCT TTATGGATTT AAGCACCGGC
GGGGATATCG ATGAGATGCG GCGGCTCACC CTTGCCCACG CCAGGGTTCC GGTAGGCAGC
GTGCCCATTT ATCAGGCAGC CATCGAAGCC ATTGAAAAGC GGGGCAGTAT TGTAGCGATG
ACGCCGGACG ACATGTTTGC GGCCGTGGAA AAACAGGCAA GGGACGGGAT AGATTTCATG
GCCATTCACA GCGCCCTGAA TTTCGAGATC CTCGAAAGGC TTCAGGCTAG CGGCAGGGTG
ACCGACATTG TCAGCCGCGG TGGGGCCTTC CTCACCGGCT GGATGCTGCA CAACCAGAAA
GAGAATCCCC TTTATGAGCA GTTCGACAGG TTGCTCGAAA TCTTGCTGAA GTACGATGTC
ACCCTCAGCC TCGGCGACGC CATTCGTCCG GGTTCTACAG CCGACTCCCT GGACGGGGCC
CAACTGCAGG GAATGATCGT GGCCGGGGAA CTGGTCAGGC GCGCCAGGGA AGCCGGCGTG
CAGGTTATGG TCGAGGGTCC GGGACATGTT CCCCTCAACC ATGTGGAAAC GACAATGAAA
CTACAGAAAA GCCTGTGCGG GGGCGCGCCT TACTTTATTC TGGGTACCCT GGCTACTGAT
GTGGCGCCGG GATATGACCA TATCACTGCC GCAATAGGGG GTGCCCTTGC CGGGACGGTT
GGGGCGGATT TTATCTGCTA TGTGACACCG GCGGAGCATC TGGGGTTACC AACAGAGCAG
GACGTTAAAG AAGGGGTGAT TGCCGCCCGC ATTGCCGCCC ATGCCGCCGA TCTGGCCAGG
GGAAACAGGC AGGCCTGGGA GCGGGATCTG CAAATGGCGC GGGCGCGGGT CGCCCTCGAT
GTGGAAAAGC AGATAAGCCT TGCCATTGAT CAGGAAAAGG CACGCTCGTT GCTCGACGGT
ACCGGGGAAG ACGGGGTTTG TGCTGCCTGT GGGACGAACT GCGCAGCCCT GGTGGCCGCC
CGTTATTTCG GGATGAACTG A
 
Protein sequence
MSQVLDARAG KITPEMEKVA ADEKVDVEFV RAGVAEGTIV IPRNTNRKVL KPCGIGRGLR 
IKVNALIGTS SDRDDRQMEM RKIAAAEAAG CDSFMDLSTG GDIDEMRRLT LAHARVPVGS
VPIYQAAIEA IEKRGSIVAM TPDDMFAAVE KQARDGIDFM AIHSALNFEI LERLQASGRV
TDIVSRGGAF LTGWMLHNQK ENPLYEQFDR LLEILLKYDV TLSLGDAIRP GSTADSLDGA
QLQGMIVAGE LVRRAREAGV QVMVEGPGHV PLNHVETTMK LQKSLCGGAP YFILGTLATD
VAPGYDHITA AIGGALAGTV GADFICYVTP AEHLGLPTEQ DVKEGVIAAR IAAHAADLAR
GNRQAWERDL QMARARVALD VEKQISLAID QEKARSLLDG TGEDGVCAAC GTNCAALVAA
RYFGMN
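
The relationship between the two sequences above can be sketched in a few lines of Python. This is a minimal example, not the database's own method. It assumes the gene sequence is shown in coding orientation and uses the standard codon-to-amino-acid assignments, which translation table 11 shares for non-start codons. The helper names (`clean`, `translate`, `gc_fraction`) are hypothetical, and only the first and last lines of the gene sequence are embedded; the full sequence can be pasted in the same way.

```python
# Codon table built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def clean(seq: str) -> str:
    """Remove the spaces and line breaks used in the 10-letter block layout."""
    return "".join(seq.split()).upper()


def translate(dna: str) -> str:
    """Translate from the first base, stopping at the first stop codon."""
    dna = clean(dna)
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]  # KeyError on ambiguous bases
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def gc_fraction(dna: str) -> float:
    """Fraction of G and C bases in the sequence."""
    dna = clean(dna)
    return (dna.count("G") + dna.count("C")) / len(dna)


# First three blocks and final blocks of the gene sequence listed above.
FIRST_BLOCKS = "ATGAGCCAGG TTTTAGATGC CCGTGCTGGA"
LAST_BLOCKS = "CGTTATTTCG GGATGAACTG A"

if __name__ == "__main__":
    print(translate(FIRST_BLOCKS))  # MSQVLDARAG
    print(translate(LAST_BLOCKS))   # RYFGMN
```

The two printed fragments match the first block and the last line of the protein sequence. Calling `gc_fraction` on the full gene sequence gives a value that can be compared with the GC content field.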