Gene Moth_2157 details


Gene Information

Locus tag: Moth_2157
Symbol:
ID: 3833006
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Moorella thermoacetica ATCC 39073
Kingdom: Bacteria
Replicon accession: NC_007644
Strand:
Start bp: 2261054
End bp: 2262082
Gene Length: 1029 bp
Protein Length: 342 aa
Translation table: 11
GC content: 63%
IMG OID: 637830079
Product: O-sialoglycoprotein endopeptidase
Protein accession: YP_430989
Protein GI: 83590980
COG category: [O] Posttranslational modification, protein turnover, chaperones
COG ID: [COG0533] Metal-dependent proteases with possible chaperone activity
TIGRFAM ID: [TIGR00329] metallohydrolase, glycoprotease/Kae1 family
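
The coordinate and length fields in this record can be cross-checked with simple arithmetic. The sketch below is illustrative only. It assumes the start and end coordinates are 1-based and inclusive, and that the final codon is a stop codon that is not counted in the protein length. The variable names are made up for this example and do not come from the database.

```python
# Values copied from the Gene Information record.
start_bp = 2261054
end_bp = 2262082
gene_length_bp = 1029
protein_length_aa = 342

# Assumption: coordinates are 1-based and inclusive, so the span is end - start + 1.
span_bp = end_bp - start_bp + 1

# Assumption: the last codon is a stop codon and is not translated.
codons = span_bp // 3
expected_protein_aa = codons - 1

print(span_bp == gene_length_bp)                 # span matches the listed gene length
print(span_bp % 3 == 0)                          # span is a whole number of codons
print(expected_protein_aa == protein_length_aa)  # codons minus stop matches protein length
```

Under these assumptions all three checks print True: the span is 1029 bp, which is 343 codons, or 342 residues once the stop codon is excluded.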


Plasmid Coverage information

Num covering plasmid clones: 52
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones:
Fosmid unclonability p-value: 0.0185524
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGGTAAAGG AATTGGCGAC AGAAACGAAT ATCCTGGCCA TCGAGAGTTC CTGTGATGAG 
ACGGCGGCGG CCATTGTCAG CGACGGCACC AGGGTCCGGG CCAACATCAT CGCCTCCCAG
ATCGCCGTTC ACCGCCGCTT TGGCGGCGTG GTGCCGGAAA TAGCTTCCCG CCACCATATG
GAGAATATAG TACCGGTGGT ATCGGAGGCC CTGGCTACAG CCGGCCTGGC CTTTAGCGAT
GTGGACGCCG TGGCGGTGAC CTATGGTCCC GGACTGGTAG GGGCCCTGCT GGTGGGTGTC
GCTTACGCCA AGAGCCTGGC CTACGCCCTG GGTAAGCCCC TCATCGGTGT CCACCACCTC
CTGGGGCATA TCTATGCCGG TTTTCTGGCC TACCCTGGCC TGCCCTTGCC GGCGGTCTCC
CTGGTGGTCT CGGGCGGGCA TACCAACCTG GTCTACCTGG AGGATCACAC CACCCGTCGT
ATCCTGGGGT CAACCCGGGA TGACGCCGCC GGGGAAGCCT TCGACAAGGT GGCCAGGGTC
CTGGGGTTGC CCTATCCGGG CGGGCCGGAG CTGGAAAAAC TGGCCCGGGA AGGCAATCCC
CGGGCCATTC CTTTCCCCCG GGCCTGGCTG GAGGAAAACA GCCTTGATTT CAGCTTTAGC
GGCCTGAAAT CTGCGGTCAT CAACTACCTG CACCACGCCC GCCAGGTGGG CCAGGAGGTT
AACCGGGCCG ACGTGGCTGC CAGTTTCCAG GCGGCGGTGG CCGAGGTCCT GGTGACCAAG
ACCCTGCTGG CGGCTACCAG CTACCGGGCC AGGTCTATCC TTCTCGCCGG TGGGGTGGCG
GCCAATTCGG TCCTGCGCCG GGAACTTCGT TCAGCCGGGG AGCAGGCGGG CCTCCCGGTC
TTTTTTCCAC CGCGGGAACT CTGCACCGAC AACGCGGCCA TGATCGGCTG TGCCGCTTAT
TACCAGTACC TGCGCCGGGA TTTTGCCCCT TTAAGCCTCA ACGCTATCCC CGATTTACCC
CTTAATTGA
 
Protein sequence
MVKELATETN ILAIESSCDE TAAAIVSDGT RVRANIIASQ IAVHRRFGGV VPEIASRHHM 
ENIVPVVSEA LATAGLAFSD VDAVAVTYGP GLVGALLVGV AYAKSLAYAL GKPLIGVHHL
LGHIYAGFLA YPGLPLPAVS LVVSGGHTNL VYLEDHTTRR ILGSTRDDAA GEAFDKVARV
LGLPYPGGPE LEKLAREGNP RAIPFPRAWL EENSLDFSFS GLKSAVINYL HHARQVGQEV
NRADVAASFQ AAVAEVLVTK TLLAATSYRA RSILLAGGVA ANSVLRRELR SAGEQAGLPV
FFPPRELCTD NAAMIGCAAY YQYLRRDFAP LSLNAIPDLP LN
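
The relationship between the gene sequence and the protein sequence can be sketched in a few lines of Python. This is a minimal illustration, not the tool that produced this record. It assumes the amino acid assignments of translation table 11, which are the same as the standard genetic code (the tables differ in which start codons are allowed). The names `translate` and `gc_fraction` are hypothetical helpers written for this example.

```python
from itertools import product

# Codon table in the conventional TCAG ordering; the third base varies fastest.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def clean(dna: str) -> str:
    """Remove the spaces and line breaks used to group bases in blocks of ten."""
    return "".join(dna.split()).upper()


def translate(dna: str) -> str:
    """Translate from the first base and stop at the first stop codon."""
    seq = clean(dna)
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def gc_fraction(dna: str) -> float:
    """Fraction of bases that are G or C."""
    seq = clean(dna)
    return (seq.count("G") + seq.count("C")) / len(seq)


# First 30 bases and last 9 bases of the gene sequence in this record.
print(translate("ATGGTAAAGG AATTGGCGAC AGAAACGAAT"))  # MVKELATETN
print(translate("CTTAATTGA"))                          # LN
```

The two printed fragments match the first ten and last two residues of the protein sequence. Passing the full gene sequence to `translate` and `gc_fraction` would allow the whole protein and the listed GC content to be compared in the same way.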