Gene Information |
Locus tag | Moth_0736 |
Symbol | |
ID | 3831128 |
Type | CDS |
Is gene spliced | No |
Is pseudo gene | No |
Organism name | Moorella thermoacetica ATCC 39073 |
Kingdom | Bacteria |
Replicon accession | NC_007644 |
Strand | + |
Start bp | 768423 |
End bp | 769583 |
Gene Length | 1161 bp |
Protein Length | 386 aa |
Translation table | 11 |
GC content | 60% |
IMG OID | 637828667 |
Product | peptidase S1 and S6, chymotrypsin/Hap |
Protein accession | YP_429597 |
Protein GI | 83589588 |
COG category | [O] Posttranslational modification, protein turnover, chaperones |
COG ID | [COG0265] Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain |
TIGRFAM ID | [TIGR02037] periplasmic serine protease, Do/DegQ family |
|
|
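The coordinate and length fields above are mutually consistent, which can be checked with a little arithmetic. The sketch below assumes the start and end positions are 1-based and inclusive, and that the protein length excludes the terminal stop codon; the function names are illustrative and not part of any database API.

```python
def gene_length_bp(start_bp: int, end_bp: int) -> int:
    """Length of a feature given 1-based, inclusive coordinates (assumed)."""
    return end_bp - start_bp + 1


def protein_length_aa(gene_len_bp: int) -> int:
    """Residue count for an unspliced CDS, excluding the stop codon."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_len_bp // 3 - 1


# Values taken from the record: Start bp 768423, End bp 769583.
length = gene_length_bp(768423, 769583)
print(length)                     # 1161
print(protein_length_aa(length))  # 386
```

Both results agree with the Gene Length (1161 bp) and Protein Length (386 aa) fields listed in the record.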
Plasmid Coverage information |
Num covering plasmid clones | 2 |
Plasmid unclonability p-value | 0.00000000182436 |
Plasmid hitchhiking | No |
Plasmid clonability | decreased coverage |
| |
Fosmid Coverage information |
Num covering fosmid clones | 17 |
Fosmid unclonability p-value | 0.877655 |
Fosmid Hitchhiker | No |
Fosmid clonability | normal |
| |
Sequence |
Gene sequence | ATGTGGCGAC GAGATGGACG AGCAAGGATT TCCGCGCTGC TGGTCTTATT AATTTTCCTG GCCGGGGTAG CGGCTACGGC GGGTTTCTAT CATATCACCG GCGCCACAGC CGGGCCCCAA CAGGCGTACC AGAATGCTGT TAGCACAGCC CAGCCGGCCT CGGCCAGCAT CCCGGCGGGA CTGGGACCGG AGACCATTGC CGATATCGTT GACAAAACCG GCCCGGCGGT AGTGCGTATC GACACGGTGA CGGAAACCCA GGGCAGCAGC CCCTTCAATG ACCCCTTCTT CCGGCAGTTT TTCGGCGACC AGTTTAATAC CGGCCCTCAG GTCCAGCGAG CCCTGGGTTC GGGCTTTATT ATCTCCAGCG ACGGTTATAT CCTGACCAAC CAGCACGTCG TCGAGGGCGC CAGGCAGGTC AAGGTAACTA TCGTCGGTTT TGACAAACCC CTGAATGCCC AGGTGATCGG CGCCGACAGT TCTCTGGACC TGGCGGTTTT GAAGGTCGAT GCGGGTAAAC CCCTGCCTTA CCTGGCCTTG GGGGATACCA ACAAGGTACG GGTTGGGGAC TGGGCCATCG CCATCGGCAA TCCTGACGGA CTGGACCATA CCGTCACCGT CGGTGTAATT AGCGCCAAGG GACGGCCCAT AGACGTCCAG AACCGCCATT ATGAAAACCT GCTGCAGACG GACGCCGCTA TTAACCCCGG CAACAGCGGC GGTCCCCTCC TGAACCTTAA AGGCGAAGTA ATCGGCATCA ATACCGCCGT TAACGCCGAC GCCCAGGGAA TCGGCTTCGC CATTCCCAGC AGCACGGTCC AGCCGGTCCT CAAGGACCTC ATGACCAAGG GCAAGATTAG CCGGCCCTGG CTGGGGGTGG CCCTGCAACA GGTAACCCCG GACGTGGCCG ACATCCTGGG CCTCCAGGGC CAGGAAGGCG CCGTGGTAGT CCAGGTGGTG AGCGGTAGCC CGGCCGCCAA AGCGGGCCTC CAAAAATATG ACGTGATCCT GCAGGTTGAT GGCCAGGCAG TAAAGGACGC CAGTGACCTG GTGAATAAGA TCCAGAGTAT GAAGATTGGC CAGCAGGTAC AGCTCCAGGT CTTCCGCCGC GGTCAGACCT TAAATATCAG CGTAGTCCTG GGGGAAAAGC CGGCCCAGTA G
|
Protein sequence | MWRRDGRARI SALLVLLIFL AGVAATAGFY HITGATAGPQ QAYQNAVSTA QPASASIPAG LGPETIADIV DKTGPAVVRI DTVTETQGSS PFNDPFFRQF FGDQFNTGPQ VQRALGSGFI ISSDGYILTN QHVVEGARQV KVTIVGFDKP LNAQVIGADS SLDLAVLKVD AGKPLPYLAL GDTNKVRVGD WAIAIGNPDG LDHTVTVGVI SAKGRPIDVQ NRHYENLLQT DAAINPGNSG GPLLNLKGEV IGINTAVNAD AQGIGFAIPS STVQPVLKDL MTKGKISRPW LGVALQQVTP DVADILGLQG QEGAVVVQVV SGSPAAKAGL QKYDVILQVD GQAVKDASDL VNKIQSMKIG QQVQLQVFRR GQTLNISVVL GEKPAQ
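The protein sequence can be derived from the gene sequence by translating it codon by codon. The following is a minimal sketch, assuming the space-separated 10-base blocks are only display formatting and that translation table 11 assigns the same amino acids to codons as the standard code (the two differ in permitted start codons, which this sketch does not model). The helper names are hypothetical.

```python
from itertools import product

# Codon table in the conventional T, C, A, G ordering for each position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def clean(seq: str) -> str:
    """Remove the whitespace used to group bases into blocks of ten."""
    return "".join(seq.split()).upper()


def translate(dna: str) -> str:
    """Translate a coding sequence, stopping at the first stop codon."""
    dna = clean(dna)
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def gc_fraction(dna: str) -> float:
    """Fraction of G and C bases in the sequence."""
    dna = clean(dna)
    return (dna.count("G") + dna.count("C")) / len(dna)


# First three blocks of the gene sequence in the record.
print(translate("ATGTGGCGAC GAGATGGACG AGCAAGGATT"))  # MWRRDGRARI
```

The output matches the first ten residues of the protein sequence in the record. Passing the full gene sequence to `translate` and `gc_fraction` would allow the Protein Length and GC content fields to be checked in the same way.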