Gene EcE24377A_4366 details

Gene Information       Plasmid Coverage information       Fosmid Coverage information       Sequence       

Gene Information

Locus tagEcE24377A_4366 
SymbolpepQ 
ID5589989 
TypeCDS 
Is gene splicedNo 
Is pseudo geneNo 
Organism nameEscherichia coli E24377A 
KingdomBacteria 
Replicon accessionNC_009801 
Strand
Start bp4356546 
End bp4357877 
Gene Length1332 bp 
Protein Length443 aa 
Translation table11 
GC content53% 
IMG OID640927982 
Productproline dipeptidase 
Protein accessionYP_001465331 
Protein GI157157716 
COG category[E] Amino acid transport and metabolism 
COG ID[COG0006] Xaa-Pro aminopeptidase 
TIGRFAM ID 


Plasmid Coverage information

Num covering plasmid clones10 
Plasmid unclonability p-value0.000311805 
Plasmid hitchhikingYes 
Plasmid clonabilityhitchhiker 
 

Fosmid Coverage information

Num covering fosmid clonesn/a 
Fosmid unclonability p-valuen/a 
Fosmid Hitchhikern/a 
Fosmid clonabilityn/a 
 

Sequence

Gene sequence
ATGGAATCAC TGGCCTCGCT CTATAAAAAT CATATAGCTA CCTTACAAGA ACGGACTCGC 
GATGCGCTGG CGCGCTTCAA GCTGGATGCG TTACTTATTC ACTCCGGCGA GCTGTTCAAT
GTTTTTCTCG ACGATCATCC CTATCCGTTT AAAGTGAACC CGCAATTCAA AGCGTGGGTG
CCGGTAACTC AGGTGCCAAA CTGCTGGTTG CTGGTGGATG GCGTGAACAA GCCGAAACTG
TGGTTTTATC TGCCGGTTGA TTACTGGCAC AACGTCGAAC CGCTGCCGAC CTCCTTCTGG
ACTGAAGATG TGGAAGTGAT CGCGCTGCCG AAAGCCGATG GCATTGGTAG CCTGCTGCCT
GCTGCGCGCG GCAATATCGG TTATATCGGT CCGGTGCCGG AGCGTGCGCT GCAACTGGGT
ATTGAGGCCA GCAACATCAA CCCGAAAGGG GTTATCGACT ACCTGCATTA CTATCGCTCC
TTCAAAACCG AGTACGAACT GGCCTGTATG CGTGAAGCGC AGAAAATGGC GGTCAACGGT
CATCGTGCGG CAGAAGAAGC GTTCCGTTCT GGCATGAGCG AGTTCGATAT CAACATTGCC
TATCTGACCG CGACCGGTCA TCGTGATACC GACGTACCTT ACAGCAACAT TGTGGCGCTT
AACGAACACG CTGCGGTGCT GCATTACACC AAACTGGATC ATCAGGCGCC GGAAGAGATG
CGCAGCTTCC TGCTGGATGC CGGGGCCGAA TATAACGGCT ATGCGGCTGA CCTGACTCGT
ACCTGGTCGG CAAAAAGCGA CAACGACTAC GCACAGCTGG TGAAAGACGT AAATGATGAA
CAACTGGCGC TGATCGCCAC CATGAAAGCA GGCGTTAGCT ATGTGGATTA CCACATCCAG
TTCCATCAGC GCATCGCCAA ACTGCTGCGT AAACATCAAA TCATCACCGA TATGAGTGAA
GAAGCGATGG TCGAAAACGA TCTCACCGGA CCGTTTATGC CGCACGGTAT CGGTCATCCG
CTGGGCCTGC AGGTGCATGA CGTAGCCGGT TTTATGCAGG ATGATAGCGG TACACACCTC
GCGGCACCGG CAAAATATCC GTACCTGCGC TGCACCCGTA TTCTCCAGCC GGGCATGGTG
TTAACCATCG AACCGGGTAT CTACTTCATT GAATCGCTAC TGGCACCGTG GCGTGAAGGG
CAGTTCAGCA AGCACTTCAA CTGGCAGAAA ATTGAAGCAC TGAAACCGTT CGGCGGCATT
CGTATCGAAG ACAACGTGGT GATCCACGAA AACAACGTGG AAAACATGAC CCGGGATCTG
AAACTGGCGT GA
 
Protein sequence
MESLASLYKN HIATLQERTR DALARFKLDA LLIHSGELFN VFLDDHPYPF KVNPQFKAWV 
PVTQVPNCWL LVDGVNKPKL WFYLPVDYWH NVEPLPTSFW TEDVEVIALP KADGIGSLLP
AARGNIGYIG PVPERALQLG IEASNINPKG VIDYLHYYRS FKTEYELACM REAQKMAVNG
HRAAEEAFRS GMSEFDINIA YLTATGHRDT DVPYSNIVAL NEHAAVLHYT KLDHQAPEEM
RSFLLDAGAE YNGYAADLTR TWSAKSDNDY AQLVKDVNDE QLALIATMKA GVSYVDYHIQ
FHQRIAKLLR KHQIITDMSE EAMVENDLTG PFMPHGIGHP LGLQVHDVAG FMQDDSGTHL
AAPAKYPYLR CTRILQPGMV LTIEPGIYFI ESLLAPWREG QFSKHFNWQK IEALKPFGGI
RIEDNVVIHE NNVENMTRDL KLA