Gene EcSMS35_1998 details


Gene Information

Locus tag: EcSMS35_1998
Symbol: pepT
ID: 6144694
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Escherichia coli SMS-3-5
Kingdom: Bacteria
Replicon accession: NC_010498
Strand:
Start bp: 2019440
End bp: 2020708
Gene Length: 1269 bp
Protein Length: 422 aa
Translation table: 11
GC content: 50%
IMG OID: 641616874
Product: peptidase T
Protein accession: YP_001744050
Protein GI: 170683561
COG category: [E] Amino acid transport and metabolism
COG ID: [COG2195] Di- and tripeptidases
TIGRFAM ID: [TIGR01882] peptidase T
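
The coordinates and lengths listed above are mutually consistent under two assumptions that the record itself does not state: that Start bp and End bp are 1-based inclusive positions, and that Protein Length excludes the terminal stop codon. A minimal Python sketch of that arithmetic follows; the function names are illustrative only.

```python
def gene_length(start_bp: int, end_bp: int) -> int:
    """Feature length in bp, assuming 1-based inclusive coordinates."""
    return end_bp - start_bp + 1


def protein_length(gene_len_bp: int) -> int:
    """Residue count, assuming the final codon is a stop and is not counted."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_len_bp // 3 - 1


length = gene_length(2019440, 2020708)
print(length)                  # 1269
print(protein_length(length))  # 422
```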


Plasmid Coverage information

Num covering plasmid clones: 26
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 46
Fosmid unclonability p-value: 0.668207
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
TTGCTTCTTA ATAATGTTGT CACAAAAAGT GAGGGTGACT ACATGGATAA ACTACTTGAG 
CGATTTTTGA ACTACGTGTC TCTGGATACC CAATCAAAAG CAGGGGTGAG ACAGGTTCCC
AGCACGGAAG GCCAATGGAA GTTATTGCAT CTGCTGAAAG AGCAGCTCGA AGAGATGGGG
CTTATCAATG TGACCTTAAG TGAGAAGGGC ACTTTGATGG CGACGTTACC GGCTAACGTC
CCTGGCGATA TCCCGGCGAT TGGCTTTATT TCTCATGTGG ATACCTCACC GGATTGCAGC
GGCAAAAATG TGAATCCGCA AATTGTTGAA AACTATCGCG GTGGCGATAT TGCGCTGGGT
ATCGGCGATG AAGTTTTATC ACCGGTTATG TTCCCGGTGC TTCATCAGCT ACTGGGTCAG
ACGCTGATTA CCACCGATGG TAAAACCTTG TTAGGTGCCG ATGACAAAGC AGGTATTGCA
GAAATCATGA CCGCGCTGGC GGTATTGCAA CAGAAAAACA TTCCGCATGG TGATATTCGC
GTCGCCTTTA CCCCGGATGA AGAAGTGGGC AAAGGGGCGA AGCATTTTGA TGTTGACGCC
TTCGATGCCC GCTGGGCTTA CACCGTTGAT GGTGGTGGCG TAGGCGAACT GGAGTTTGAA
AACTTCAACG CCGCGTCGGT CAATATCAAA ATTGTCGGTA ACAATGTTCA CCCGGGCACG
GCGAAAGGCG TGATGGTAAA TGCGCTGTCG CTGGCGGCAC GTATTCATGC GGAAGTTCCG
GCGGATGAAA GCCCGGAAAT GACAGAAGGC TATGAAGGTT TCTATCACCT GGCGAGCATG
AAAGGCACCG TTGAACGGGC CGATATGCAC TACATCATCC GTGATTTCGA CCGTAAACAG
TTTGAAGCGC GTAAACGTAA AATGATGGAG ATCGCCAAAA AAGTGGGCAA AGGGTTACAT
CCTGATTGCT ACATTGAACT GGTGATTGAA GACAGTTACT ACAATATGCG CGAGAAAGTG
GTTGAGCATC CGCATATTCT CGATATCGCC CAGCAGGCGA TGCGTGACTG CGATATTGAA
CCGGAACTGA AACCGATCCG CGGCGGTACC GACGGCGCGC AGTTGTCGTT TATGGGATTA
CCGTGCCCGA ACCTGTTCAC TGGCGGTTAC AACTATCATG GTAAGCATGA GTTTGTGACT
CTGGAAGGTA TGGAAAAAGC GGTGCAGGTG ATCGTCCGTA TTGCCGAGTT GACGGCGCAA
CGGAAGTAA
 
Protein sequence
MLLNNVVTKS EGDYMDKLLE RFLNYVSLDT QSKAGVRQVP STEGQWKLLH LLKEQLEEMG 
LINVTLSEKG TLMATLPANV PGDIPAIGFI SHVDTSPDCS GKNVNPQIVE NYRGGDIALG
IGDEVLSPVM FPVLHQLLGQ TLITTDGKTL LGADDKAGIA EIMTALAVLQ QKNIPHGDIR
VAFTPDEEVG KGAKHFDVDA FDARWAYTVD GGGVGELEFE NFNAASVNIK IVGNNVHPGT
AKGVMVNALS LAARIHAEVP ADESPEMTEG YEGFYHLASM KGTVERADMH YIIRDFDRKQ
FEARKRKMME IAKKVGKGLH PDCYIELVIE DSYYNMREKV VEHPHILDIA QQAMRDCDIE
PELKPIRGGT DGAQLSFMGL PCPNLFTGGY NYHGKHEFVT LEGMEKAVQV IVRIAELTAQ
RK
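
The protein sequence can be reproduced from the gene sequence with a short translation routine. The sketch below is illustrative and is not the database's own method. It assumes the listed gene sequence is already in coding orientation, uses the standard codon assignments (which translation table 11 shares), and reads an initial ATG, GTG, or TTG as methionine, which is only a subset of the start codons that table 11 permits. All function and variable names are hypothetical.

```python
from itertools import product

# Codon -> amino acid map in NCBI order (bases T, C, A, G; first base varies slowest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Assumption: only the most common bacterial initiator codons are handled here.
START_CODONS = {"ATG", "GTG", "TTG"}


def translate_cds(dna: str) -> str:
    """Translate a coding sequence, stopping at the first stop codon."""
    seq = "".join(dna.split()).upper()  # drop the spaces and line breaks used for display
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    protein = []
    for i in range(0, len(seq), 3):
        codon = seq[i:i + 3]
        if i == 0 and codon in START_CODONS:
            protein.append("M")  # an initiator codon is read as methionine
            continue
        aa = CODON_TABLE[codon]
        if aa == "*":
            break  # stop codon ends translation
        protein.append(aa)
    return "".join(protein)


print(translate_cds("TTGCTTCTTA ATAATGTTGT CACAAAAAGT"))  # MLLNNVVTKS
```

Applied to the first 30 bases of the gene sequence, the routine yields the first ten residues of the protein sequence; the leading TTG is read as M because it is the initiator codon.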