Gene Moth_2027 details

Gene Information

Locus tag: Moth_2027
Symbol:
ID: 3831402
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Moorella thermoacetica ATCC 39073
Kingdom: Bacteria
Replicon accession: NC_007644
Strand:
Start bp: 2116223
End bp: 2117422
Gene Length: 1200 bp
Protein Length: 399 aa
Translation table: 11
GC content: 59%
IMG OID: 637829956
Product: aerolysin
Protein accession: YP_430866
Protein GI: 83590857
COG category: [O] Posttranslational modification, protein turnover, chaperones
COG ID: [COG1404] Subtilisin-like serine proteases
TIGRFAM ID:
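
The listed lengths can be cross-checked against the coordinates with a little arithmetic. The sketch below assumes the start and end positions are inclusive and that the protein length excludes the terminal stop codon. Both assumptions are consistent with the values in this record, but the record does not state them. The variable names are illustrative only.

```python
# Values copied from the Gene Information fields of this record.
start_bp = 2116223
end_bp = 2117422

# Assumption: both coordinates are inclusive, so the span is end - start + 1.
gene_length = end_bp - start_bp + 1

# Assumption: one residue per 3 bp codon, with the final stop codon not counted.
protein_length = gene_length // 3 - 1

print(gene_length, protein_length)  # 1200 399
```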


Plasmid Coverage information

Num covering plasmid clones: 52
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 27
Fosmid unclonability p-value:
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
GTGTGGCCGG TTTTACTGGC TTATTATACC GGCGTAGCGA CGGCGGCAGG TGCCATGATC 
GCCTGGTCCG GCAACAAAAA AGTAGGTAAG CGGAAGATTA TCCTTTTTCA CCAGGAGAAA
CCCCTACCCA GTTGCCGCGC CCTGTTGCAA AAGAGGGGCG GCCAGGTATT AAAGGAATTA
CCCCTGGTGC ATGCCCTTGT GGCCCGACTA CCTGCAAAGG GAAAAGTAAT TGAAGAACTG
GGTTTACACC CTGACATTCG CCTGATTGAA GACGATTTTG AAGTTCACAC GGTGGAGTTG
CCGGCCGCCC GGATACGGCA GGAGAAACAG ATAGTTCCCT GGGGCGTTGA GCGCATCGGC
GCTCCGAGGG CCTGGCAGGT GGCTGCCGGG GAAAAGGTGA AGGTGGCGGT CCTAGACACA
GGCCTTGATG CCGGGCATCC CGACCTGGCG GCCAACGTCC GCGGTACCCA GAATATAAAA
TTTCCCGGCT GGCGGGCCGG AGACGGGAAC GGCCACGGTA CCCATGTAGC AGGGATTATT
GCCGCCCTGA ACAACAGCTT CGGGGTGGTA GGGGTTGCGC CCCGGGCCGA GATTTATGGA
GTAAAGATTT TTAACCGTCA AGGTGACGGT TATATATCCG ATATCGTAGC CGGCCTGGAC
TGGGCGCTAA AAAATAAGAT GCAGGTAGTA AACATGAGCT TTGGCACCAG CCAACCCAGC
CAGGCCCTGG AGGAGGCCGT CCGCAAATGT GTCCAGGCGG GAATGGTGCT GGTTGCGGCG
GCCGGAAACG AAGGCAGGGA CGATAGTGTT CTATATCCGG CCCGCTATCC GGGGGTCATC
GCCGTTTCGG CCGTCGATAA GAAGGATAAC CTGGCCAGCT TTAGCAGCCG GGGAACGGAG
GTAACGGTCA CTGCTCCCGG AGTCGATATC CTATCTACTT ACCCGGGGGG CAAATACCGG
ACTATGAGCG GAACTTCCAT GGCCTGCCCC CACGCCGCCG GGGTGGCGGC CCTGATTCTG
GCCCAGGATA GGCGCCTATC CGGCCGGCAG GTGGCCAGGA TAATCTGCCG CACGGCCATT
AAGTTACCCG ATCTGTCACC CCGGGAACAG GGGGACGGTT TGGTCAACGC CACCGCCCTG
GCGGCCGTAG CCCGCTTTAT TGGGGAAACG GGAACGGCAG GCGAGGAGTC GGTCGGCTAA
 
Protein sequence
MWPVLLAYYT GVATAAGAMI AWSGNKKVGK RKIILFHQEK PLPSCRALLQ KRGGQVLKEL 
PLVHALVARL PAKGKVIEEL GLHPDIRLIE DDFEVHTVEL PAARIRQEKQ IVPWGVERIG
APRAWQVAAG EKVKVAVLDT GLDAGHPDLA ANVRGTQNIK FPGWRAGDGN GHGTHVAGII
AALNNSFGVV GVAPRAEIYG VKIFNRQGDG YISDIVAGLD WALKNKMQVV NMSFGTSQPS
QALEEAVRKC VQAGMVLVAA AGNEGRDDSV LYPARYPGVI AVSAVDKKDN LASFSSRGTE
VTVTAPGVDI LSTYPGGKYR TMSGTSMACP HAAGVAALIL AQDRRLSGRQ VARIICRTAI
KLPDLSPREQ GDGLVNATAL AAVARFIGET GTAGEESVG
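
The gene sequence begins with GTG, while the protein sequence begins with M. Under translation table 11 (bacterial, archaeal and plant plastid code), GTG is one of the permitted initiator codons and is read as methionine at the start of a coding sequence. The following is a minimal sketch of how the two sequence blocks relate, written in plain Python with no external libraries. The helper names (`clean`, `gc_percent`, `translate_cds`) are hypothetical and are not part of any tool used to build this record.

```python
# Codon table built from the standard base order T, C, A, G.
# Amino acid assignments for table 11 are the same as the standard code.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Initiator codons allowed by translation table 11.
START_CODONS = {"TTG", "CTG", "ATT", "ATC", "ATA", "ATG", "GTG"}


def clean(seq: str) -> str:
    """Remove the spaces and line breaks used to group bases in blocks of 10."""
    return "".join(seq.split()).upper()


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in the sequence."""
    s = clean(seq)
    return 100.0 * sum(base in "GC" for base in s) / len(s)


def translate_cds(seq: str) -> str:
    """Translate a coding sequence, reading a valid first codon as Met."""
    s = clean(seq)
    codons = [s[i:i + 3] for i in range(0, len(s) - len(s) % 3, 3)]
    residues = [CODON_TABLE[c] for c in codons]
    if codons and codons[0] in START_CODONS:
        residues[0] = "M"
    # Keep only the residues before the first stop codon.
    return "".join(residues).split("*")[0]


# First line of the gene sequence above.
first_line = "GTGTGGCCGG TTTTACTGGC TTATTATACC GGCGTAGCGA CGGCGGCAGG TGCCATGATC"
print(translate_cds(first_line))  # MWPVLLAYYTGVATAAGAMI
```

Passing the full gene sequence to `translate_cds` and `gc_percent` would allow a comparison with the protein sequence and the GC content field listed in this record.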