Gene EcSMS35_4228 details

Gene Information

Locus tag: EcSMS35_4228
Symbol: pepQ
ID: 6143415
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Escherichia coli SMS-3-5
Kingdom: Bacteria
Replicon accession: NC_010498
Strand:
Start bp: 4326207
End bp: 4327538
Gene Length: 1332 bp
Protein Length: 443 aa
Translation table: 11
GC content: 54%
IMG OID: 641619051
Product: proline dipeptidase
Protein accession: YP_001746179
Protein GI: 170683404
COG category: [E] Amino acid transport and metabolism
COG ID: [COG0006] Xaa-Pro aminopeptidase
TIGRFAM ID:
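
The coordinate and length fields in this record agree with each other if Start bp and End bp are read as 1-based inclusive positions and the last codon is taken to be a stop codon. Both are assumptions, since the record does not state its coordinate convention. A minimal Python sketch of that arithmetic, using hypothetical helper names:

```python
def gene_length(start_bp: int, end_bp: int) -> int:
    """Length in bp, assuming 1-based inclusive coordinates (an assumption)."""
    return end_bp - start_bp + 1


def protein_length(gene_len_bp: int) -> int:
    """Residues encoded, assuming an unspliced CDS that ends in one stop codon."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_len_bp // 3 - 1  # drop the stop codon


length_bp = gene_length(4326207, 4327538)
print(length_bp, protein_length(length_bp))  # prints: 1332 443
```

The printed values match the Gene Length (1332 bp) and Protein Length (443 aa) fields.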


Plasmid Coverage information

Num covering plasmid clones:
Plasmid unclonability p-value: 0.000144079
Plasmid hitchhiking: Yes
Plasmid clonability: hitchhiker
 

Fosmid Coverage information

Num covering fosmid clones: 32
Fosmid unclonability p-value: 0.0133545
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGGAATCAC TGGCCTCGCT CTATAAAAAT CATATAGCTA CCTTACAGGA ACGGACTCGC 
GATGCGCTGG CGCGCTTCAA GCTGGATGCG TTACTTATTC ACTCCGGCGA ACTGTTCAAC
GTTTTTCTCG ACGATCATCC CTATCCGTTT AAAGTGAACC CGCAATTCAA AGCGTGGGTG
CCGGTAACTC AGGTGCCAAA CTGCTGGCTG CTGGTGGATG GCGTGAACAA GCCGAAACTG
TGGTTTTATC TGCCGGTTGA TTACTGGCAC AACGTCGAAC CGCTGCCGAA CTCCTTCTGG
ACTGAAGATG TGGAAGTGAT CGCGCTGCCG AAAGCCGATG GCATTGGTAG CCTGCTGCCT
GCTGCGCGCG GCAATATCGG TTATATCGGT CCGGTGCCGG AACGTGCGCT GCAACTGGGT
ATTGAGGCCA GCAACATCAA TCCGAAAGGG GTGATCGACT ACCTGCATTA CTACCGCTCC
TTCAAAACCG AGTACGAACT GGCCTGTATG CGTGAAGCGC AGAAAATGGC GGTCAACGGT
CATCGCGCGG CAGAAGAAGC GTTCCGTTCT GGCATGAGCG AGTTTGATAT CAATATTGCC
TATCTGACCG CGACCGGTCA TCGTGATACC GACGTACCTT ACAGCAACAT TGTGGCGCTT
AACGAACACG CTGCGGTGCT GCATTACACC AAATTGGACC ACCAGGCACC GGAAGAGATG
CGCAGCTTCC TGCTGGATGC CGGGGCCGAA TATAACGGCT ATGCGGCTGA CCTGACCCGT
ACCTGGTCGG CAAAAAGCGA CAACGACTAC GCACAGCTGG TGAAAGACGT AAATGATGAA
CAACTTGCGC TGATCGCCAC CATGAAAGCT GGCGTCAGCT ATGTGGATTA CCACATCCAG
TTCCATCAGC GCATCGCCAA ATTGCTGCGT AAACATCAAA TCATCACCGA TATGAGTGAA
GAAGCGATGG TCGAAAACGA TCTCACCGGA CCGTTTATGC CGCACGGTAT CGGTCATCCG
CTGGGCCTGC AGGTGCATGA CGTAGCTGGC TTTATGCAAG ATGATAGCGG TACGCACCTC
GCGGCACCGG CAAAATATCC GTACCTGCGC TGCACCCGTA TTCTCCAGCC AGGCATGGTG
TTAACCATCG AACCGGGTAT CTACTTCATC GAATCGCTGC TGGCGCCGTG GCGTGAAGGG
CAGTTCAGCA AACACTTCAA CTGGCAGAAA ATTGAAGCAC TGAAACCGTT CGGCGGCATT
CGTATCGAAG ACAACGTGGT GATCCACGAA AACAACGTGG AAAACATGAC CCGGGATCTG
AAACTGGCGT GA
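
The GC content field can be recomputed from a sequence laid out in 10-base blocks like the one above. The sketch below is one simple way to do it, not necessarily how the database derived its 54% figure. The function name is hypothetical, and only the final line of the gene sequence is used as input, so the full-gene value is not recomputed here.

```python
def gc_content(seq: str) -> float:
    """Percent G+C in a sequence, ignoring whitespace and letter case."""
    bases = "".join(seq.split()).upper()  # remove block spacing and newlines
    if not bases:
        raise ValueError("empty sequence")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


# Final line of the gene sequence as printed in this record (12 bases).
print(gc_content("AAACTGGCGT GA"))
```

Passing all lines of the gene sequence as one string would give the figure for the whole gene.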
 
Protein sequence
MESLASLYKN HIATLQERTR DALARFKLDA LLIHSGELFN VFLDDHPYPF KVNPQFKAWV 
PVTQVPNCWL LVDGVNKPKL WFYLPVDYWH NVEPLPNSFW TEDVEVIALP KADGIGSLLP
AARGNIGYIG PVPERALQLG IEASNINPKG VIDYLHYYRS FKTEYELACM REAQKMAVNG
HRAAEEAFRS GMSEFDINIA YLTATGHRDT DVPYSNIVAL NEHAAVLHYT KLDHQAPEEM
RSFLLDAGAE YNGYAADLTR TWSAKSDNDY AQLVKDVNDE QLALIATMKA GVSYVDYHIQ
FHQRIAKLLR KHQIITDMSE EAMVENDLTG PFMPHGIGHP LGLQVHDVAG FMQDDSGTHL
AAPAKYPYLR CTRILQPGMV LTIEPGIYFI ESLLAPWREG QFSKHFNWQK IEALKPFGGI
RIEDNVVIHE NNVENMTRDL KLA
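
The protein sequence can be checked against the gene sequence by translating codon by codon. The sketch below builds a standard codon table by hand. NCBI translation table 11 assigns the same amino acids to codons as the standard code and differs in which start codons are permitted, so the standard assignments are assumed to be sufficient here. The names `CODON_TABLE` and `translate` are hypothetical, and only the first and last lines of the gene sequence are used.

```python
BASES = "TCAG"
# Amino acids for all 64 codons in TCAG order; '*' marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO))


def translate(seq: str) -> str:
    """Translate a DNA string in frame 1, stopping at the first stop codon."""
    dna = "".join(seq.split()).upper()  # remove block spacing and newlines
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")  # 'X' for unrecognised codons
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


first_line = "ATGGAATCAC TGGCCTCGCT CTATAAAAAT CATATAGCTA CCTTACAGGA ACGGACTCGC"
last_line = "AAACTGGCGT GA"
print(translate(first_line))  # MESLASLYKNHIATLQERTR
print(translate(last_line))   # KLA
```

Both outputs match the corresponding ends of the protein sequence above: the first 20 residues and the final three before the stop codon.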