Gene Moth_0736 details


Gene Information

Locus tag: Moth_0736
Symbol:
ID: 3831128
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Moorella thermoacetica ATCC 39073
Kingdom: Bacteria
Replicon accession: NC_007644
Strand:
Start bp: 768423
End bp: 769583
Gene Length: 1161 bp
Protein Length: 386 aa
Translation table: 11
GC content: 60%
IMG OID: 637828667
Product: peptidase S1 and S6, chymotrypsin/Hap
Protein accession: YP_429597
Protein GI: 83589588
COG category: [O] Posttranslational modification, protein turnover, chaperones
COG ID: [COG0265] Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain
TIGRFAM ID: [TIGR02037] periplasmic serine protease, Do/DeqQ family
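
The coordinate and length fields above are consistent with each other. As a minimal sketch, assuming the start and end coordinates are 1-based and inclusive (the record does not state this, but the listed gene length agrees with it), the check can be written as follows. The variable names are illustrative only.

```python
# Values copied from the Gene Information fields.
start_bp = 768423
end_bp = 769583

# Assumption: coordinates are 1-based and inclusive, so both ends count.
gene_length = end_bp - start_bp + 1

# A CDS is read in codons of 3 bp; the final codon is the stop codon
# and does not contribute an amino acid.
codon_count = gene_length // 3
protein_length = codon_count - 1

print(gene_length, protein_length)
```

Under that assumption the result matches the listed Gene Length (1161 bp) and Protein Length (386 aa).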


Plasmid Coverage information

Num covering plasmid clones:
Plasmid unclonability p-value: 0.00000000182436
Plasmid hitchhiking: No
Plasmid clonability: decreased coverage
 

Fosmid Coverage information

Num covering fosmid clones: 17
Fosmid unclonability p-value: 0.877655
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGTGGCGAC GAGATGGACG AGCAAGGATT TCCGCGCTGC TGGTCTTATT AATTTTCCTG 
GCCGGGGTAG CGGCTACGGC GGGTTTCTAT CATATCACCG GCGCCACAGC CGGGCCCCAA
CAGGCGTACC AGAATGCTGT TAGCACAGCC CAGCCGGCCT CGGCCAGCAT CCCGGCGGGA
CTGGGACCGG AGACCATTGC CGATATCGTT GACAAAACCG GCCCGGCGGT AGTGCGTATC
GACACGGTGA CGGAAACCCA GGGCAGCAGC CCCTTCAATG ACCCCTTCTT CCGGCAGTTT
TTCGGCGACC AGTTTAATAC CGGCCCTCAG GTCCAGCGAG CCCTGGGTTC GGGCTTTATT
ATCTCCAGCG ACGGTTATAT CCTGACCAAC CAGCACGTCG TCGAGGGCGC CAGGCAGGTC
AAGGTAACTA TCGTCGGTTT TGACAAACCC CTGAATGCCC AGGTGATCGG CGCCGACAGT
TCTCTGGACC TGGCGGTTTT GAAGGTCGAT GCGGGTAAAC CCCTGCCTTA CCTGGCCTTG
GGGGATACCA ACAAGGTACG GGTTGGGGAC TGGGCCATCG CCATCGGCAA TCCTGACGGA
CTGGACCATA CCGTCACCGT CGGTGTAATT AGCGCCAAGG GACGGCCCAT AGACGTCCAG
AACCGCCATT ATGAAAACCT GCTGCAGACG GACGCCGCTA TTAACCCCGG CAACAGCGGC
GGTCCCCTCC TGAACCTTAA AGGCGAAGTA ATCGGCATCA ATACCGCCGT TAACGCCGAC
GCCCAGGGAA TCGGCTTCGC CATTCCCAGC AGCACGGTCC AGCCGGTCCT CAAGGACCTC
ATGACCAAGG GCAAGATTAG CCGGCCCTGG CTGGGGGTGG CCCTGCAACA GGTAACCCCG
GACGTGGCCG ACATCCTGGG CCTCCAGGGC CAGGAAGGCG CCGTGGTAGT CCAGGTGGTG
AGCGGTAGCC CGGCCGCCAA AGCGGGCCTC CAAAAATATG ACGTGATCCT GCAGGTTGAT
GGCCAGGCAG TAAAGGACGC CAGTGACCTG GTGAATAAGA TCCAGAGTAT GAAGATTGGC
CAGCAGGTAC AGCTCCAGGT CTTCCGCCGC GGTCAGACCT TAAATATCAG CGTAGTCCTG
GGGGAAAAGC CGGCCCAGTA G
 
Protein sequence
MWRRDGRARI SALLVLLIFL AGVAATAGFY HITGATAGPQ QAYQNAVSTA QPASASIPAG 
LGPETIADIV DKTGPAVVRI DTVTETQGSS PFNDPFFRQF FGDQFNTGPQ VQRALGSGFI
ISSDGYILTN QHVVEGARQV KVTIVGFDKP LNAQVIGADS SLDLAVLKVD AGKPLPYLAL
GDTNKVRVGD WAIAIGNPDG LDHTVTVGVI SAKGRPIDVQ NRHYENLLQT DAAINPGNSG
GPLLNLKGEV IGINTAVNAD AQGIGFAIPS STVQPVLKDL MTKGKISRPW LGVALQQVTP
DVADILGLQG QEGAVVVQVV SGSPAAKAGL QKYDVILQVD GQAVKDASDL VNKIQSMKIG
QQVQLQVFRR GQTLNISVVL GEKPAQ
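
The relationship between the gene sequence and the protein sequence can be illustrated with a short translation sketch. This is not the tool the database used. The helper names (`clean`, `translate`, `gc_fraction`) are hypothetical, and the codon table is the standard genetic code, whose amino acid assignments translation table 11 shares (table 11 differs in its permitted start codons, which does not matter here because this gene starts with ATG).

```python
# Standard genetic code laid out in TCAG order (first, second, third base).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def clean(seq):
    """Remove the spaces and line breaks used for display; upper-case the rest."""
    return "".join(seq.split()).upper()

def translate(dna):
    """Translate codon by codon, stopping at the first stop codon."""
    dna = clean(dna)
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[pos:pos + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

def gc_fraction(dna):
    """Fraction of G and C bases in a non-empty sequence."""
    dna = clean(dna)
    return (dna.count("G") + dna.count("C")) / len(dna)

# First 30 bases and last 21 bases of the gene sequence listed above.
first_bases = "ATGTGGCGAC GAGATGGACG AGCAAGGATT"
last_bases = "GGGGAAAAGC CGGCCCAGTA G"

print(translate(first_bases))  # MWRRDGRARI
print(translate(last_bases))   # GEKPAQ
```

The two printed fragments correspond to the first ten and last six residues of the protein sequence. Passing the full gene sequence to `gc_fraction` would give the value behind the GC content field.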