Gene EcSMS35_3530 details

Gene Information

Locus tag: EcSMS35_3530
Symbol: degQ
ID: 6142665
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Escherichia coli SMS-3-5
Kingdom: Bacteria
Replicon accession: NC_010498
Strand:
Start bp: 3607668
End bp: 3609035
Gene Length: 1368 bp
Protein Length: 455 aa
Translation table: 11
GC content: 52%
IMG OID: 641618359
Product: serine endoprotease
Protein accession: YP_001745506
Protein GI: 170682050
COG category: [O] Posttranslational modification, protein turnover, chaperones
COG ID: [COG0265] Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain
TIGRFAM ID: [TIGR02037] periplasmic serine protease, Do/DeqQ family
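
The coordinates and lengths listed in this record can be cross-checked with simple arithmetic. The following is a minimal Python sketch, assuming 1-based inclusive coordinates and assuming the stop codon is counted in the gene length but not in the protein length. The variable names are illustrative and not part of the database record.

```python
# Values copied from the Gene Information fields of this record.
start_bp = 3607668
end_bp = 3609035

# Assumption: coordinates are 1-based and inclusive at both ends.
gene_length_bp = end_bp - start_bp + 1

# Assumption: the final codon is a stop codon and is not translated.
protein_length_aa = gene_length_bp // 3 - 1

print(gene_length_bp)     # 1368
print(protein_length_aa)  # 455
```

Both values agree with the Gene Length and Protein Length fields.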


Plasmid Coverage information

Num covering plasmid clones:
Plasmid unclonability p-value: 0.00269541
Plasmid hitchhiking: Yes
Plasmid clonability: hitchhiker
 

Fosmid Coverage information

Num covering fosmid clones: 59
Fosmid unclonability p-value:
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGAAAAAAC AAACCCAGCT GTTGAGTGCA TTAGCGTTAA GTGTCGGGTT AACTCTCTCG 
GCGTCATTTC AGGCCGTTGC GTCGATTCCA GGCCAGGTTG CCGATCAGGC CCCTCTCCCC
AGTCTGGCCC CAATGCTGGA AAAAGTGCTT CCGGCAGTGG TGAGCGTACG GGTGGAAGGA
ACGGCCAGTC AGGGACAGAA AATCCCGGAA GAATTCAAAA AGTTTTTTGG TGATGATTTA
CCGGATCAAC CTGCACAACC CTTCGAAGGT TTAGGCTCCG GTGTCATCAT CAACGCCAGT
AAAGGCTATG TGCTGACCAA CAACCATGTG ATTAATCAGG CACAGAAAAT CAGTATTCAG
CTCAATGATG GGCGCGAGTT TGATGCAAAA CTGATTGGTA GCGATGACCA GAGCGATATC
GCCCTGTTGC AAATTCAAAA CCCGAGCAAA TTAACGCAAA TCGCTATTGC CGACTCCGAT
AAATTGCGCG TCGGTGATTT TGCCGTGGCG GTCGGTAACC CATTTGGCCT TGGGCAAACC
GCCACCTCTG GCATTGTTTC CGCATTAGGC CGCAGCGGGT TGAATCTTGA AGGTCTTGAA
AACTTTATCC AGACAGATGC TTCCATTAAC CGCGGTAACT CCGGCGGTGC ACTGTTAAAC
CTTAACGGTG AGTTAATTGG CATCAACACT GCAATCCTTG CGCCTGGCGG TGGGAGCGTC
GGGATTGGAT TTGCCATCCC CAGTAATATG GCGCGAACAC TGGCGCAGCA GCTTATCGAC
TTCGGTGAAA TCAAACGCGG TCTGTTAGGC ATCAAAGGCA CAGAGATGAG TGCCGATATC
GCCAAAGCCT TCAACCTTGA CGTGCAGCGT GGCGCGTTTG TCAGCGAAGT GTTGCCAGGT
TCTGGCTCGG CAAAAGCGGG CATCAAAGCG GGCGATATTA TTACCAGCCT CAACGGCAAA
CCGCTGAATA GCTTTGCTGA GTTGCGCTCT CGTATCGCGA CCACCGAGCC GGGCACAAAA
GTGAAACTTG GCCTGCTGCG TAACGGCAAA CCACTGGAAG TAGAAGTGAC GCTTGATACC
AGCACCTCTT CGTCGGCCAG CGCTGAAATG ATCACGCCAG CGCTGGAAGG TGCAACGTTG
AGCGATGGTC AGCTAAAAGA TGGCGGCAAA GGTATTAAGA TCGATGAGGT TGTCAAAGGA
AGCCCAGCTG CTCAGGCTGG CTTGCAAAAA GACGATGTGA TCATTGGCGT CAACCGCGAT
CGGGTGAACT CGATTGCTGA AATGCGTAAA GTGCTGGCGG CAAAACCGGC CATCATCGCC
CTGCAAATTG TACGCGGCAA TGAAAGCATC TATCTGCTGA TGCGTTAA
 
Protein sequence
MKKQTQLLSA LALSVGLTLS ASFQAVASIP GQVADQAPLP SLAPMLEKVL PAVVSVRVEG 
TASQGQKIPE EFKKFFGDDL PDQPAQPFEG LGSGVIINAS KGYVLTNNHV INQAQKISIQ
LNDGREFDAK LIGSDDQSDI ALLQIQNPSK LTQIAIADSD KLRVGDFAVA VGNPFGLGQT
ATSGIVSALG RSGLNLEGLE NFIQTDASIN RGNSGGALLN LNGELIGINT AILAPGGGSV
GIGFAIPSNM ARTLAQQLID FGEIKRGLLG IKGTEMSADI AKAFNLDVQR GAFVSEVLPG
SGSAKAGIKA GDIITSLNGK PLNSFAELRS RIATTEPGTK VKLGLLRNGK PLEVEVTLDT
STSSSASAEM ITPALEGATL SDGQLKDGGK GIKIDEVVKG SPAAQAGLQK DDVIIGVNRD
RVNSIAEMRK VLAAKPAIIA LQIVRGNESI YLLMR
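
As an illustrative consistency check, parts of the gene sequence can be translated and compared with the protein sequence. The sketch below assumes that translation table 11 uses the same codon-to-amino-acid assignments as the standard genetic code (the two tables differ in permitted start codons), and it translates only the first and last rows of the gene sequence shown above. The names `CODON_TABLE`, `translate`, `first_row`, and `last_row` are hypothetical helpers, not part of the record.

```python
from itertools import product

# Codon table in the conventional TCAG ordering (TTT, TTC, TTA, TTG, TCT, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(dna):
    """Translate a DNA string codon by codon, stopping at the first stop codon."""
    seq = "".join(dna.split()).upper()  # drop the spaces between 10-base blocks
    protein = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":  # stop codon
            break
        protein.append(aa)
    return "".join(protein)

# First and last rows of the gene sequence, copied from this record.
# The last row starts at base 1321, so it is in frame.
first_row = "ATGAAAAAAC AAACCCAGCT GTTGAGTGCA TTAGCGTTAA GTGTCGGGTT AACTCTCTCG"
last_row = "CTGCAAATTG TACGCGGCAA TGAAAGCATC TATCTGCTGA TGCGTTAA"

print(translate(first_row))  # MKKQTQLLSALALSVGLTLS
print(translate(last_row))   # LQIVRGNESIYLLMR
```

The two outputs match the first 20 and last 15 residues of the protein sequence listed above.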