Gene EcSMS35_1343 details


Gene Information

Locus tag: EcSMS35_1343
Symbol: prtB
ID: 6145994
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Escherichia coli SMS-3-5
Kingdom: Bacteria
Replicon accession: NC_010498
Strand:
Start bp: 1331254
End bp: 1333314
Gene Length: 2061 bp
Protein Length: 686 aa
Translation table: 11
GC content: 50%
IMG OID: 641616220
Product: protease 2
Protein accession: YP_001743400
Protein GI: 170682850
COG category: [E] Amino acid transport and metabolism
COG ID: [COG1770] Protease II
TIGRFAM ID:
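
The length fields above are consistent with the listed coordinates. The following is a minimal sketch of that arithmetic, assuming the start and end positions are 1-based and inclusive and that the CDS ends with a single stop codon that is not counted in the protein length. The function names are hypothetical and chosen only for illustration.

```python
def gene_length_bp(start_bp: int, end_bp: int) -> int:
    """Length of a feature, assuming 1-based inclusive coordinates."""
    return end_bp - start_bp + 1


def protein_length_aa(gene_length: int) -> int:
    """Protein length, assuming the CDS includes one terminal stop codon."""
    if gene_length % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_length // 3 - 1  # drop the stop codon


length = gene_length_bp(1331254, 1333314)
print(length)                     # 2061
print(protein_length_aa(length))  # 686
```

Both results match the Gene Length and Protein Length fields of this record.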


Plasmid Coverage information

Num covering plasmid clones: 16
Plasmid unclonability p-value: 0.790431
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 40
Fosmid unclonability p-value: 0.26509
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGCTACCAA AAGCCGCCCG CATTCCCCAC GCCATGACGC TTCATGGCGA TACGCGCATC 
GATAATTACT ACTGGCTGCG GGACGATACG CGTTCTCAGC CGAAAGTCCT GGACTACCTG
CAGCAAGAAA ATAGTTACGG TCATCGGGTG ATGGCCTCGC AACAAGCCTT GCAGGATCGC
ATCTTAAAGG AAATCATCGA CCGCATTCCG CAAAGAGAAG TTTCTGCGCC CTACATAAAA
AATGGCTACC GCTATCGGCA GATTTATGAA CCAGGCTGTG AATATGCTAT CTACCAGCGT
CAATCGGCAT TCAGTGAAGA GTGGGATGAG TGGGAAACAT TGCTCGATGC CAATAAGCGC
GCGGCGCACA GTGAGTTTTA TTCGATGGGC GGAATGGCGA TTACGCCCGA TAACACCATT
ATGGCGCTGG CAGAAGATTT TCTTTCCCGA CGCCAGTACG GCATTCGTTT TCGTAATCTG
GAAACTGGTA ACTGGTACCC GGAACTGCTG GATAACGTTG AACCCAGCTT TGTCTGGGCA
AATGACTCCT GGACTTTCTA CTATGTACGC AAGCATCCAG TTACGCTGCT GCCTTATCAG
GTCTGGCGTC ACGCCATCGG TACGCCAGCA TCGCAAGATA CACTGATCTA CGAAGAAAAA
GACGATACCT ATTACGTCAG CCTGCATAAA ACGACCTCGA AGCACTATGT AGTCATTCAT
CTGGCCAGCG CCACCACCAG TGAAGTTCGC CTGCTGGACG CGGAATTGGC CGATGCCGAG
CCGTTTGTTT TTCTGCCGCG CCGCAAAGAT CACGAATACA GCCTTGATCA CTACCAGCAT
CGTTTTTATC TGCGTTCCAA CCGCCACGGC AAAAACTTTG GCTTATACCG TACCCGTATG
CGTGATGAGC AACAGTGGGA AGAGTTAATT CCGCCACGTG ATAACATTAT GCTGGAAGGG
TTTACGCTGT TTACCGACTG GCTGGTGGTT GAAGAGCGTC AGCGCGGGTT AACCAGTTTG
CGGCAAATTA ACCGCAAGAC CCGGGAAGTC ATTGGCATTG CCTTTGATGA TCCGGCCTAT
GTGACCTGGA TTGCCTACAA TCCAGAACCT GAAACCGCGC GATTGCGTTA TGGTTATTCT
TCCATGACCA CACCAGACAC TCTGTTTGAA CTGGATATGG ATACCGGTGA GCGTCGTGTA
TTAAAACAAA CGGAAGTTCC TGGTTTTGAT GCGGCGAATT ACCGCAGTGA ACACCTGTGG
ATAGTCGCCC GGGATGGCGT CGAAGTTCCG GTTTCGTTGG TCTACCATCG CAAACATTTT
CGCAAAGGAC ACAACCCGCT GCTGGTGTAT GGCTACGGTT CTTACGGCGC AAGTATTGAT
GCCGATTTCA GTTTTAGCCG CTTGAGTTTG TTAGATCGTG GCTTTGTCTA CGCCATTGTC
CATGTTCGCG GCGGTGGTGA GCTGGGGCAA CAATGGTACG AAGACGGAAA ATTCCTGAAG
AAGAAAAATA CGTTTAATGA TTATCTTGAT GCCTGCGATG CATTGTTAAA ACTGGGCTAT
GGATCACCTT CGCTCTGTTA TGCGATGGGC GGGAGTGCCG GGGGCATGTT GATGGGCGTT
GCAATTAATC AACGCCCGGA ATTATTCCAC GGCGTTATCG CCCAGGTACC GTTTGTTGAT
GTTGTAACAA CGATGCTTGA TGAATCAATT CCTCTTACCA CTGGTGAGTT TGAAGAGTGG
GGGAATCCGC AGGATCCGCA ATATTACGAG TACATGAAAA GCTACAGCCC GTATGACAAC
GTCACCGCAC AGGCTTATCC GCATTTACTG GTAACGACCG GTTTACACGA TTCTCAGGTG
CAATATTGGG AACCGGCAAA ATGGGTGGCT AAATTGCGCG AGCTGAAAAC CGATAACCAT
CTTTTATTGC TCTGTACCGA CATGGACTCA GGCCATGGCG GTAAATCTGG CCGCTTTAAA
TCGTACGAAG GCGTAGCGAT GGAATATGCT TTTCTGGTCG CGCTGGCGCA GGGAACATTA
CCCGCTACGC CTGAGGATTA A
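
The gene sequence is printed in space-separated blocks of 10 bases, so it has to be joined before any computation. The following is a rough sketch of how the blocks could be cleaned and how a GC percentage like the one in the Gene Information section could be computed. The helper names are hypothetical, and the rounding used by the source database is not known.

```python
def clean_sequence(raw: str) -> str:
    """Remove spaces and line breaks from a block-formatted sequence."""
    return "".join(raw.split()).upper()


def gc_percent(raw: str) -> float:
    """Percentage of G and C bases in the cleaned sequence."""
    seq = clean_sequence(raw)
    if not seq:
        raise ValueError("empty sequence")
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


# First line of the gene sequence above, used only as sample input.
first_line = "ATGCTACCAA AAGCCGCCCG CATTCCCCAC GCCATGACGC TTCATGGCGA TACGCGCATC"
print(len(clean_sequence(first_line)))  # 60
print(gc_percent(first_line))
```

Passing the full gene sequence instead of the first line gives the value to compare against the GC content field.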
 
Protein sequence
MLPKAARIPH AMTLHGDTRI DNYYWLRDDT RSQPKVLDYL QQENSYGHRV MASQQALQDR 
ILKEIIDRIP QREVSAPYIK NGYRYRQIYE PGCEYAIYQR QSAFSEEWDE WETLLDANKR
AAHSEFYSMG GMAITPDNTI MALAEDFLSR RQYGIRFRNL ETGNWYPELL DNVEPSFVWA
NDSWTFYYVR KHPVTLLPYQ VWRHAIGTPA SQDTLIYEEK DDTYYVSLHK TTSKHYVVIH
LASATTSEVR LLDAELADAE PFVFLPRRKD HEYSLDHYQH RFYLRSNRHG KNFGLYRTRM
RDEQQWEELI PPRDNIMLEG FTLFTDWLVV EERQRGLTSL RQINRKTREV IGIAFDDPAY
VTWIAYNPEP ETARLRYGYS SMTTPDTLFE LDMDTGERRV LKQTEVPGFD AANYRSEHLW
IVARDGVEVP VSLVYHRKHF RKGHNPLLVY GYGSYGASID ADFSFSRLSL LDRGFVYAIV
HVRGGGELGQ QWYEDGKFLK KKNTFNDYLD ACDALLKLGY GSPSLCYAMG GSAGGMLMGV
AINQRPELFH GVIAQVPFVD VVTTMLDESI PLTTGEFEEW GNPQDPQYYE YMKSYSPYDN
VTAQAYPHLL VTTGLHDSQV QYWEPAKWVA KLRELKTDNH LLLLCTDMDS GHGGKSGRFK
SYEGVAMEYA FLVALAQGTL PATPED
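
The protein sequence can be reproduced from the gene sequence by codon-wise translation. The following is a minimal sketch that assumes translation table 11 assigns the same amino acid to each codon as the standard genetic code and differs only in which codons may act as start codons. That difference does not matter here because the gene begins with ATG. The names `CODON_TABLE` and `translate` are hypothetical and not part of any library.

```python
from itertools import product

# Standard codon assignments, listed in TCAG order (first base varies slowest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(raw: str) -> str:
    """Translate a block-formatted DNA sequence, stopping at the first stop codon."""
    seq = "".join(raw.split()).upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        amino_acid = CODON_TABLE[seq[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# First 30 bases and final 21 bases of the gene sequence above.
print(translate("ATGCTACCAA AAGCCGCCCG CATTCCCCAC"))  # MLPKAARIPH
print(translate("CCCGCTACGC CTGAGGATTA A"))           # PATPED
```

The two outputs match the first and last residues of the protein sequence listed in this record.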