Gene EcolC_0472 details


Gene Information

Locus tag: EcolC_0472
Symbol:
ID: 6068402
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Escherichia coli ATCC 8739
Kingdom: Bacteria
Replicon accession: NC_010468
Strand:
Start bp: 513587
End bp: 514954
Gene Length: 1368 bp
Protein Length: 455 aa
Translation table: 11
GC content: 52%
IMG OID: 641599877
Product: serine endoprotease
Protein accession: YP_001723476
Protein GI: 170018522
COG category: [O] Posttranslational modification, protein turnover, chaperones
COG ID: [COG0265] Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain
TIGRFAM ID: [TIGR02037] periplasmic serine protease, Do/DeqQ family
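
The coordinate and length fields above can be cross-checked with simple arithmetic. The following is a minimal sketch, assuming the start and end positions are 1-based and inclusive (the record does not state this, but the listed gene length is consistent with it) and that the final codon is a stop codon that is not counted in the protein length. The variable names are illustrative only.

```python
# Values copied from the record above.
start_bp = 513587
end_bp = 514954

# Assumption: coordinates are 1-based and inclusive, so both ends count.
gene_length = end_bp - start_bp + 1

# A CDS is read in triplets; the last codon is assumed to be the stop codon.
codon_count = gene_length // 3
protein_length = codon_count - 1

print(gene_length, "bp")      # 1368 bp
print(protein_length, "aa")   # 455 aa
```

Under these assumptions the computed values agree with the Gene Length and Protein Length fields.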


Plasmid Coverage information

Num covering plasmid clones:
Plasmid unclonability p-value: 0.0649674
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 11
Fosmid unclonability p-value: 0.0819783
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGAAAAAAC AAACCCAGCT GTTGAGTGCA TTAGCGTTAA GTGTCGGGTT AACTCTCTCG 
GCGTCATTTC AGGCCGTCGC GTCGATTCCA GGCCAGGTTG CCGATCAGGC CCCTCTCCCC
AGTCTGGCTC CAATGCTGGA AAAAGTGCTT CCGGCAGTGG TGAGCGTACG GGTGGAAGGA
ACGGCCAGTC AGGGACAGAA AATCCCGGAA GAATTCAAAA AGTTTTTTGG TGATGATTTA
CCGGATCAAC CTGCACAACC CTTCGAAGGT TTAGGCTCCG GTGTCATCAT CAACGCCAGT
AAAGGCTATG TGCTGACCAA CAACCATGTG ATTAATCAGG CACAGAAAAT CAGTATTCAG
CTCAATGATG GGCGCGAGTT TGATGCAAAA CTGATTGGTA GCGATGACCA GAGCGATATC
GCCCTGTTAC AAATTCAAAA CCCGAGCAAA TTAACGCAAA TCGCTATTGC CGACTCCGAT
AAATTGCGCG TCGGTGATTT TGCCGTAGCG GTCGGTAACC CATTTGGCCT TGGGCAAACC
GCCACCTCTG GCATTGTTTC CGCATTAGGC CGCAGCGGGT TGAATCTTGA AGGTCTGGAA
AACTTTATCC AGACAGATGC TTCCATTAAC CGCGGTAACT CCGGCGGTGC ACTATTAAAC
CTTAACGGTG AGTTAATTGG CATCAACACT GCAATCCTTG CGCCTGGCGG CGGGAGCGTC
GGGATTGGAT TTGCCATCCC CAGTAATATG GCGCGAACAC TGGCGCAGCA GCTTATCGAC
TTTGGTGAAA TCAAACGCGG TTTGTTAGGC ATCAAAGGCA CCGAGATGAG TGCCGATATC
GCCAAAGCCT TCAACCTTGA CGTGCAGCGT GGCGCGTTTG TCAGCGAAGT GTTGCCAGGT
TCTGGCTCGG CAAAAGCGGG CGTCAAAGCG GGCGATATTA TTACCAGCCT CAACGGCAAA
CCGCTGAATA GCTTTGCTGA GTTGCGCTCT CGTATCGCGA CCACCGAGCC GGGCACGAAA
GTGAAGCTTG GCCTGCTGCG TAACGGCAAA CCACTGGAAG TAGAAGTGAC GCTCGATACC
AGCACCTCTT CGTCGGCCAG CGCTGAAATG ATCACGCCAG CGCTGGAAGG TGCAACGTTG
AGCGATGGTC AGCTAAAAGA TGGCGGCAAA GGTATTAAAA TCGATGAAGT TGTCAAAGGA
AGCCCAGCTG CTCAGGCTGG CTTGCAAAAA GACGATGTGA TCATTGGCGT CAACCGCGAT
CGGGTGAACT CGATTGCTGA AATGCGTAAA GTGCTGGCGG CAAAACCGGC CATCATCGCC
CTGCAAATTG TACGCGGCAA TGAAAGCATC TATCTGCTGA TGCGTTAA
 
Protein sequence
MKKQTQLLSA LALSVGLTLS ASFQAVASIP GQVADQAPLP SLAPMLEKVL PAVVSVRVEG 
TASQGQKIPE EFKKFFGDDL PDQPAQPFEG LGSGVIINAS KGYVLTNNHV INQAQKISIQ
LNDGREFDAK LIGSDDQSDI ALLQIQNPSK LTQIAIADSD KLRVGDFAVA VGNPFGLGQT
ATSGIVSALG RSGLNLEGLE NFIQTDASIN RGNSGGALLN LNGELIGINT AILAPGGGSV
GIGFAIPSNM ARTLAQQLID FGEIKRGLLG IKGTEMSADI AKAFNLDVQR GAFVSEVLPG
SGSAKAGVKA GDIITSLNGK PLNSFAELRS RIATTEPGTK VKLGLLRNGK PLEVEVTLDT
STSSSASAEM ITPALEGATL SDGQLKDGGK GIKIDEVVKG SPAAQAGLQK DDVIIGVNRD
RVNSIAEMRK VLAAKPAIIA LQIVRGNESI YLLMR
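
The gene and protein sequences above can be compared programmatically. The following is a minimal sketch, not the method used to produce this record: it strips the spacing used in the listing, translates codon by codon until the first stop codon, and computes a GC fraction. It assumes the sense-strand sequence is given as shown. The amino-acid assignments of translation table 11 are the same as those of the standard code (the tables differ in permitted start codons), so a standard codon table is used. The helper names `clean`, `translate`, and `gc_fraction` are hypothetical.

```python
# Build the codon table in NCBI order (bases T, C, A, G).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def clean(seq: str) -> str:
    """Remove the spaces and line breaks used in the listing; uppercase."""
    return "".join(seq.split()).upper()


def translate(dna: str) -> str:
    """Translate codon by codon, stopping at the first stop codon."""
    dna = clean(dna)
    protein = []
    for pos in range(0, len(dna) - len(dna) % 3, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def gc_fraction(dna: str) -> float:
    """Fraction of bases that are G or C."""
    dna = clean(dna)
    return (dna.count("G") + dna.count("C")) / len(dna)


# First line of the gene sequence, pasted with its original spacing.
first_line = ("ATGAAAAAAC AAACCCAGCT GTTGAGTGCA "
              "TTAGCGTTAA GTGTCGGGTT AACTCTCTCG")
print(translate(first_line))  # MKKQTQLLSALALSVGLTLS
```

The printed residues match the first 20 residues of the protein sequence listed above. Passing the full gene sequence to `translate` and `gc_fraction` would allow the whole protein and the GC content field to be checked in the same way.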