Gene Moth_2044 details


Gene Information

Locus tag: Moth_2044
Symbol:
ID: 3831190
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Moorella thermoacetica ATCC 39073
Kingdom: Bacteria
Replicon accession: NC_007644
Strand:
Start bp: 2134455
End bp: 2135996
Gene Length: 1542 bp
Protein Length: 513 aa
Translation table: 11
GC content: 65%
IMG OID: 637829973
Product: phosphoribosylaminoimidazolecarboxamide formyltransferase / IMP cyclohydrolase
Protein accession: YP_430883
Protein GI: 83590874
COG category: [F] Nucleotide transport and metabolism
COG ID: [COG0138] AICAR transformylase/IMP cyclohydrolase PurH (only IMP cyclohydrolase domain in Aful)
TIGRFAM ID: [TIGR00355] phosphoribosylaminoimidazolecarboxamide formyltransferase/IMP cyclohydrolase
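
The gene and protein lengths listed above can be cross-checked from the start and end coordinates. The sketch below is illustrative only: it assumes the coordinates are 1-based and inclusive and that the CDS ends in a stop codon, both of which are consistent with the listed 1542 bp and 513 aa but are not stated in the record. The function names are hypothetical.

```python
# Values copied from the Gene Information fields above.
START_BP = 2134455
END_BP = 2135996


def gene_length(start: int, end: int) -> int:
    """Feature length for 1-based, inclusive coordinates (assumed convention)."""
    return end - start + 1


def protein_length(cds_length: int) -> int:
    """Residue count for an unspliced CDS that includes its stop codon (assumed)."""
    if cds_length % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return cds_length // 3 - 1  # drop the stop codon


length_bp = gene_length(START_BP, END_BP)
length_aa = protein_length(length_bp)
print(length_bp, length_aa)  # 1542 513
```

The length does not depend on the strand, which this record leaves blank.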


Plasmid Coverage information

Num covering plasmid clones: 42
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 15
Fosmid unclonability p-value: 0.441464
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGTCTAAAA GGGCTCTAAT CAGCGTCTCC GACAAAACCG GTCTGGTGGA ACTGGCCCGC 
GGCCTGGTGG AACTGGGCTG GGAACTCCTT TCCACCGGCG GCACGGCCCG CACCTTAATC
GCAGCCGGCC TGCCTGTGAC CGAAGTGGCA GCCGTCACCG GCTTTCCCGA GATCCTTGAC
GGCCGGGTCA AGACCCTGCA CCCGAAGATC CACGGCGGCA TCCTGGCCCG GCCGACCCCC
GAACACCTGG CCCAGCTCCA GGAGCAGGGC ATCCAGCCCA TCGACCTGGT GGTGGTCAAC
CTCTACCCCT TCCGGGAAAC TATCGCCCGG CCCGGAGTGA CGCCGGCGGA GGCCATTGAA
AACATCGACA TCGGCGGCCC GGCCATGGTC AGGGCGGCGG CCAAGAACCA CGAGAGGGTG
GGTATTGTCG TCGACCCGGC CAGTTACAAT GAGGTGTTAA CGGAACTGAG GGAGAAAGGA
AGCTTGAGCC CGGAGACCAG GCGCCGCCTG GCGGCGGCGG CCTTCGCCCA TACCGCCGCC
TACGACGCGG CCATTGCGGC CTATTTCCAG CGCCTCATGC GCAATGAAGA ACCCTTCCCG
GCCAGCTTCG TCTTAAGCGG TGAAAAGGTC CAGGACCTGC GCTACGGCGA GAACCCCCAC
CAGGGGGCCG CTTTCTACCG CTTGCCCGCC CCACCCCCCG GCACCCTGGC CGGGGCCCGG
CAGCTCCAGG GCAAGGAGCT TTCATATAAT AACCTCATGG ACCTGGACGC CGCCTGGAAC
CTGGCCTGTG ACTTCAAAGA ACCGGTGGTG GCGATCATCA AACATACCAA CCCCTGCGGG
GTCGCCCGGG CAAGTACCCC GGCTGCGGCC TACCGCCTGG CTTACGCTGC CGACCCCGTT
TCCGCCTTTG GCGGCATCGT CGCCTGCAAC CGGCCGGTGG ACGGCGAAAT GGCCGGGGCC
ATGACGGAGA TATTCCTGGA AGCAGTCATC GCCCCGTCTT TTACCCCGGA AGCCATGGCG
ATCCTGAAAA GCAAATCCAA CCTGCGCCTT CTGGCCGCGG GTGAGAGGGC GGGTTGCCGT
ACCCGGGAAT ACCAGATAAG GCCCGTCAGC GGCGGCTTCC TGGTCCAGGA GCCGGACTAC
CATGTCCTCG AGCCGGAGAG CCTCAAGGTG GTAACCGCCC GTAAACCTGA AGCCAAAGAG
ATGGCCGACC TGCTCTTCGC CTGGCAGGTA GTCAAACACG TCAAGTCCAA CGCCATCGTG
GTGGCCAGGG ACGGCGTCAC CCTGGGTATC GGCGCCGGCC AGATGAACCG GGTGGGGGCG
GCTCGCATCG CCCTGGAGCA GGCCGGGGCC CGGGCTAAAG GCGCCGTTCT GGCTTCCGAC
GCCTTCTTCC CCTTCGGCGA CACCGTGGAA CTGGCGGCCG GGGCCGGGAT CACAGCCATC
ATCCAGCCCG GCGGCTCTAT CCGCGACGAG GAGTCGATCA GGGCTGCCGA TGCCGCGGGT
ATAGCCATGG TTTTCACCGG CATCCGCCAC TTCCGGCATT AA
 
Protein sequence
MSKRALISVS DKTGLVELAR GLVELGWELL STGGTARTLI AAGLPVTEVA AVTGFPEILD 
GRVKTLHPKI HGGILARPTP EHLAQLQEQG IQPIDLVVVN LYPFRETIAR PGVTPAEAIE
NIDIGGPAMV RAAAKNHERV GIVVDPASYN EVLTELREKG SLSPETRRRL AAAAFAHTAA
YDAAIAAYFQ RLMRNEEPFP ASFVLSGEKV QDLRYGENPH QGAAFYRLPA PPPGTLAGAR
QLQGKELSYN NLMDLDAAWN LACDFKEPVV AIIKHTNPCG VARASTPAAA YRLAYAADPV
SAFGGIVACN RPVDGEMAGA MTEIFLEAVI APSFTPEAMA ILKSKSNLRL LAAGERAGCR
TREYQIRPVS GGFLVQEPDY HVLEPESLKV VTARKPEAKE MADLLFAWQV VKHVKSNAIV
VARDGVTLGI GAGQMNRVGA ARIALEQAGA RAKGAVLASD AFFPFGDTVE LAAGAGITAI
IQPGGSIRDE ESIRAADAAG IAMVFTGIRH FRH
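
The relationship between the gene sequence and the protein sequence above can be spot-checked by translating a few codons. The following is a minimal sketch, not the database's own method. It builds the codon table by hand, relying on the fact that translation table 11 assigns the same amino acids to codons as the standard code (the two differ in permitted start codons), and it translates only the first and last fragments of the gene sequence. The `translate` helper is hypothetical.

```python
from itertools import product

# Codon-to-amino-acid assignments shared by NCBI translation tables 1 and 11,
# listed in TCAG order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate in frame 1, stopping at the first stop codon."""
    dna = dna.replace(" ", "").upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 30 nt and last 12 nt of the gene sequence, copied from the record.
print(translate("ATGTCTAAAA GGGCTCTAAT CAGCGTCTCC"))  # MSKRALISVS
print(translate("TTCCGGCATT AA"))                     # FRH
```

Both fragments agree with the start (MSKRALISVS) and end (FRH) of the listed protein sequence.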