Gene ECH74115_5286 details

Gene Information

Locus tag: ECH74115_5286
Symbol: pepQ
ID: 6970191
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Escherichia coli O157:H7 str. EC4115
Kingdom: Bacteria
Replicon accession: NC_011353
Strand:
Start bp: 4930396
End bp: 4931727
Gene Length: 1332 bp
Protein Length: 443 aa
Translation table: 11
GC content: 53%
IMG OID: 643388950
Product: proline dipeptidase
Protein accession: YP_002273364
Protein GI: 209398417
COG category: [E] Amino acid transport and metabolism
COG ID: [COG0006] Xaa-Pro aminopeptidase
TIGRFAM ID:
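
The coordinate and length fields above are mutually consistent, which a short calculation can confirm. The sketch below assumes the start and end positions are 1-based and inclusive (an assumption that matches the listed gene length) and that the unspliced CDS ends in a single stop codon. The function names are illustrative and not part of the database.

```python
def gene_length(start_bp: int, end_bp: int) -> int:
    """Length in bp of a feature, assuming 1-based inclusive coordinates."""
    return end_bp - start_bp + 1


def protein_length(gene_len_bp: int) -> int:
    """Residue count of an unspliced CDS, assuming the last codon is a stop."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_len_bp // 3 - 1  # drop the stop codon


length = gene_length(4930396, 4931727)
print(length, protein_length(length))  # prints: 1332 443
```

The 1332 bp span corresponds to 444 codons, one of which is the stop, giving the 443 aa listed for the protein.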


Plasmid Coverage information

Num covering plasmid clones:
Plasmid unclonability p-value: 0.0321906
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 40
Fosmid unclonability p-value: 0.0291317
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGGAATCAC TGGCCTCGCT CTATAAAAAT CATATAGCTA CATTGCAGGA ACGAACTCGC 
GATGCGCTGA CGCGCTTCAA GCTGGATGCG TTACTTATTC ACTCCGGCGA ACTGTTCAAC
GTTTTTCTCG ACGATCATCC CTATCCGTTT AAAGTGAACC CGCAATTCAA AGCGTGGGTG
CCGGTAACTC AGGTGCCAAA CTGCTGGCTG CTGGTGGATG GCGTGAATAA GCCGAAACTG
TGGTTCTATC TGCCGGTTGA TTACTGGCAC AACGTCGAAC CGCTGCCGAC CTCCTTCTGG
ACTGAAGATG TAGAAGTGAT CGCGCTGCCG AAAGCCGATG GCATTGGTAG TCTGTTGCCT
GCTGCGCGCG GCAATATCGG TTATATCGGT CCGGTGCCGG AACGTGCGCT GCAACTGGGT
ATTGAGGCCA GCAATATCAA CCCGAAAGGG GTTATCGACT ACCTGCATTA CTACCGCTCC
TTCAAAACCG AGTACGAGCT GGCCTGTATG CGTGAAGCGC AGAAAATGGC GGTCAACGGT
CATCGCGCGG CAGAAGAAGC GTTCCGTTCT GGCATGAGCG AGTTCGATAT CAATATTGCC
TATCTGACTG CGACCGGTCA TCGTGATACC GACGTACCTT ACAGCAACAT TGTGGCACTC
AACGAACACG CTGCGGTGCT GCATTACACC AAACTGGATC ATCAGGCGTC GGAAGAGATG
CGCAGCTTCC TGCTGGATGC CGGGGCCGAA TATAACGGCT ATGCCGCTGA CCTGACCCGT
ACCTGGTCGG CAAAAAGTGA CAACGATTAC GCACAGCTGG TGAAAGACGT AAATGATGAA
CAACTGGCGC TGATCGCGAC CATGAAAGCT GGCGTTAGCT ATGTGGATTA CCACATCCAG
TTCCATCAGC GCATCGCCAA ATTGCTGCGT AAACATCAAA TCATCACCGA TATGAGTGAA
GAGGCGATGG TCGAAAACGA TCTTACCGGG CCGTTTATGC CGCATGGTAT CGGCCATCCG
CTGGGCCTGC AGGTGCATGA CGTCGCCGGT TTTATGCAGG ATGATAGCGG TACGCACCTC
GCGGCACCGG CAAAATATCC GTACCTGCGC TGCACCCGTA TTCTCCAGCC GGGCATGGTG
TTAACCATCG AACCGGGTAT CTACTTCATT GAATCGCTGC TGGCACCGTG GCGTGAAGGG
CAGTTCAGCA AGCACTTCAA CTGGCAGAAA ATTGAAGCAC TGAAACCGTT CGGCGGCATT
CGTATCGAAG ACAACGTGGT GATCCACGAA AATAACGTGG AAAACATGAC CCGGGATCTG
AAACTGGCGT GA
 
Protein sequence
MESLASLYKN HIATLQERTR DALTRFKLDA LLIHSGELFN VFLDDHPYPF KVNPQFKAWV 
PVTQVPNCWL LVDGVNKPKL WFYLPVDYWH NVEPLPTSFW TEDVEVIALP KADGIGSLLP
AARGNIGYIG PVPERALQLG IEASNINPKG VIDYLHYYRS FKTEYELACM REAQKMAVNG
HRAAEEAFRS GMSEFDINIA YLTATGHRDT DVPYSNIVAL NEHAAVLHYT KLDHQASEEM
RSFLLDAGAE YNGYAADLTR TWSAKSDNDY AQLVKDVNDE QLALIATMKA GVSYVDYHIQ
FHQRIAKLLR KHQIITDMSE EAMVENDLTG PFMPHGIGHP LGLQVHDVAG FMQDDSGTHL
AAPAKYPYLR CTRILQPGMV LTIEPGIYFI ESLLAPWREG QFSKHFNWQK IEALKPFGGI
RIEDNVVIHE NNVENMTRDL KLA
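
The two sequences above are related by translation table 11, whose codon-to-amino-acid assignments are the same as those of the standard genetic code (the tables differ in which start codons are permitted). The following is a minimal sketch, not the database's own tooling, of how the blocked gene sequence could be cleaned, translated, and checked for GC content. The helper names `clean`, `gc_content`, and `translate` are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Amino acids for codons in TCAG order (TTT, TTC, TTA, TTG, TCT, ...);
# '*' marks a stop codon. Standard assignments, shared by table 11.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def clean(seq: str) -> str:
    """Remove the spaces and line breaks used to block the sequence."""
    return "".join(seq.split()).upper()


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in the cleaned sequence."""
    s = clean(seq)
    return (s.count("G") + s.count("C")) / len(s)


def translate(seq: str) -> str:
    """Translate from the first base, stopping at the first stop codon."""
    s = clean(seq)
    protein = []
    for i in range(0, len(s) - 2, 3):
        aa = CODON_TABLE[s[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 30 bases of the gene sequence listed above.
print(translate("ATGGAATCAC TGGCCTCGCT CTATAAAAAT"))  # prints: MESLASLYKN
```

Passing the full gene sequence to `translate` should reproduce the protein sequence listed above, and `gc_content` can be compared against the GC content field in the Gene Information section.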