Gene Information |
Locus tag | Moth_1664 |
Symbol | thiH |
ID | 3831935 |
Type | CDS |
Is gene spliced | No |
Is pseudo gene | No |
Organism name | Moorella thermoacetica ATCC 39073 |
Kingdom | Bacteria |
Replicon accession | NC_007644 |
Strand | + |
Start bp | 1697298 |
End bp | 1698428 |
Gene Length | 1131 bp |
Protein Length | 376 aa |
Translation table | 11 |
GC content | 54% |
IMG OID | 637829589 |
Product | thiamine biosynthesis protein ThiH |
Protein accession | YP_430509 |
Protein GI | 83590500 |
COG category | [H] Coenzyme transport and metabolism; [R] General function prediction only |
COG ID | [COG1060] Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes |
TIGRFAM ID | [TIGR02351] thiazole biosynthesis protein ThiH |
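The length fields in this record follow from the coordinates. A minimal sketch of that arithmetic, assuming the start and end positions are 1-based and inclusive, and that the final codon is a stop codon that is not counted in the protein length (the variable names are illustrative and not part of the record):

```python
# Values copied from the Gene Information rows above.
start_bp = 1697298
end_bp = 1698428

# Assumption: coordinates are 1-based and inclusive, so both endpoints count.
gene_length = end_bp - start_bp + 1

# Assumption: the CDS ends in a stop codon, which is not part of the protein.
protein_length = gene_length // 3 - 1

print(gene_length, protein_length)  # 1131 376
```

Both results agree with the Gene Length and Protein Length rows.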
|
|
Plasmid Coverage information |
Num covering plasmid clones | 20 |
Plasmid unclonability p-value | 0.198557 |
Plasmid hitchhiking | No |
Plasmid clonability | normal |
| |
Fosmid Coverage information |
Num covering fosmid clones | 6 |
Fosmid unclonability p-value | 0.0013298 |
Fosmid hitchhiking | Yes |
Fosmid clonability | hitchhiker |
| |
Sequence |
Gene sequence | ATGGGTTTTT ACGACGTCTA CAAGCAGTAT GAGGGGTTTG ATTTCGAAGG CTTTTTCCAG AGTAGGACCC CTGACGACGT CAGGAAGGCC CTGGCAAAGG AGCACCTCGA GGTAACCGAT TACCTGACCC TTCTATCGCC CGCGGCAGGA AATTTCCTGG AGGAAATGGC CCAAAAAGCC CACCGTATAA CCCTGAGGAA TTTCGGCCGG GTCATATTTC TCTTTACACC GTTATACCTG TCCGACTACT GCGTGAACCA GTGCGCCTAC TGCAGTTTCA ATGCCCGGAA TAAATTTGCC CGGACCAAGC TCACCTTAGA GCAGGTCGAA GAAGAAGCCA GGGCCATAGC CCAAACAGGA ATGAAAGATA TCCTCATCCT GACGGGAGAA TCGCGCCAGC ACAATCCGGT GTCGTATATA AAGGACTGCG TCGGTGTTTT AAAGAAGTAT TTCTGCAGTA TTTGCATAGA AGTCTATCCC CTGGAAGAAG AGGAGTACCG GGAGCTGGTA GCAGCCGGGG TGGATGGCCT CACCATGTTT CAGGAAGTCT ATGACCCCGG AGTCTACGCC AGGTACCATA ACGGTCCCAA GAAAAATTAC CATTACCGGC TGGACGCCCC GGAAAGGAGC TGCCGGGCGG GTATGCGGAC CGTGGGTGTC GGGGCCCTGC TGGGCCTGGC CGACTGGCGG AAGGAGGCCT TCTTCACCGG ACTGCACGCC GATTATTTGC AGCAAAAGTT CTGGGATGTG CAGGTCAGTA TCTCTTTGCC CAGATTTCGC CCTAGTATCG GCGGCTTTCA ACCCGACTAC CCGGTGGACG ACAAGAGCTT CGTCCAGATC CTCCTGGCCC ACAGGCTGTT TTTACCCCGG GTCGGCATAA CCATTTCCAC CAGGGAAAGC CCCGAGTTCC GGGACAACAT CCTACCCCTG GGTGTCACGA AAATATCGGC CGGTTCTTCC GTTACGGTGG GAGGCTATGC CCGTCCTGAC GGCATGGCAC CCCAGTTTGA AATATCCGAC CCGCGTAGTG TAGCGGAAAT AAAACAAATG CTAATCCAGA AGGGCTACCA GCCGGTTTTC GAAGACTGGC AGCAGTGGGA TAGCCTGGAG AAACAGCTAT ATAATTTCTA G
|
Protein sequence | MGFYDVYKQY EGFDFEGFFQ SRTPDDVRKA LAKEHLEVTD YLTLLSPAAG NFLEEMAQKA HRITLRNFGR VIFLFTPLYL SDYCVNQCAY CSFNARNKFA RTKLTLEQVE EEARAIAQTG MKDILILTGE SRQHNPVSYI KDCVGVLKKY FCSICIEVYP LEEEEYRELV AAGVDGLTMF QEVYDPGVYA RYHNGPKKNY HYRLDAPERS CRAGMRTVGV GALLGLADWR KEAFFTGLHA DYLQQKFWDV QVSISLPRFR PSIGGFQPDY PVDDKSFVQI LLAHRLFLPR VGITISTRES PEFRDNILPL GVTKISAGSS VTVGGYARPD GMAPQFEISD PRSVAEIKQM LIQKGYQPVF EDWQQWDSLE KQLYNF
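The protein sequence corresponds to the gene sequence read codon by codon under translation table 11, which assigns the same amino acids as the standard genetic code. The following is a minimal sketch of that check, not the database's own pipeline: `clean`, `translate`, and `gc_percent` are hypothetical helper names, and only the first 30 and last 21 bases of the gene are used as examples.

```python
from itertools import product

# Codon table in T, C, A, G order. Translation table 11 uses the same
# amino acid assignments as the standard code ('*' marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def clean(seq):
    """Remove the spaces between the 10-base groups and upper-case the result."""
    return "".join(seq.split()).upper()


def translate(dna):
    """Translate in frame from the first base, stopping at the first stop codon."""
    dna = clean(dna)
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def gc_percent(dna):
    """Percentage of G and C bases in the sequence."""
    dna = clean(dna)
    return 100.0 * sum(base in "GC" for base in dna) / len(dna)


# First 30 bases and last 21 bases of the gene sequence above.
gene_start = "ATGGGTTTTT ACGACGTCTA CAAGCAGTAT"
gene_end = "AAACAGCTAT ATAATTTCTA G"

print(translate(gene_start))  # MGFYDVYKQY
print(translate(gene_end))    # KQLYNF
```

The two outputs match the first ten and last six residues of the protein sequence. Passing the full gene sequence to `translate` and `gc_percent` applies the same check to the whole record.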