Gene EcSMS35_4762 details

Gene Information

Locus tag: EcSMS35_4762
Symbol:
ID: 6144539
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Escherichia coli SMS-3-5
Kingdom: Bacteria
Replicon accession: NC_010498
Strand:
Start bp: 4860810
End bp: 4864046
Gene Length: 3237 bp
Protein Length: 1078 aa
Translation table: 11
GC content: 48%
IMG OID: 641619575
Product: type I restriction-modification system endonuclease
Protein accession: YP_001746682
Protein GI: 170682576
COG category: [V] Defense mechanisms
COG ID: [COG0610] Type I site-specific restriction-modification system, R (restriction) subunit and related helicases
TIGRFAM ID:
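
The coordinate and length fields in this record are mutually consistent, which can be checked with a little arithmetic. The following is a minimal Python sketch, assuming the start and end coordinates are 1-based and inclusive (a convention that matches the listed values but is not stated in the record) and that the CDS ends in a single stop codon. The function names are illustrative only and are not part of any database API.

```python
def gene_length(start_bp: int, end_bp: int) -> int:
    """Length of a feature given 1-based, inclusive coordinates (assumed convention)."""
    return end_bp - start_bp + 1


def protein_length(cds_length_bp: int) -> int:
    """Amino-acid count for an unspliced CDS that ends in one stop codon."""
    if cds_length_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return cds_length_bp // 3 - 1  # the stop codon is not translated


length_bp = gene_length(4860810, 4864046)
print(length_bp)                  # 3237
print(protein_length(length_bp))  # 1078
```

Both results agree with the Gene Length and Protein Length fields listed for this gene.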


Plasmid Coverage information

Num covering plasmid clones: 15
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 51
Fosmid unclonability p-value:
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGGCTGCAC ATCATGAAGT GCATTTTGAA CAATATATTA CCGAGCAGTT GGTGAACAAT 
GGCTGGAAAG AAGGCCATTC CTGCCACTAT GACCAGCAAC GCGCCCTTTA CCCGCAAGAT
GTCATGGATT GGGTAAAGGC CACCCAGCCA GATGCCTGGC ACAAACTGAA TAAACTGCAT
GGTGTGAATG CCGAGCAGGT GCTGTTGGAG CGTTTGGAAA AAGCCCTCTC TTCCCACAAA
GCTGGCAGTA AAACAGGTGG CACCATTAAT GTGCTGCGTA AAGGCTTTAC CATTGCAGGT
TGCGGCACGA TTGCCATGGG CCAACGCAAA CCCGAAGATG CGCGCAATGA AGAGCGTATA
GCCCGTTATG AGCACAACAT TTTGCGCGTG GTGCGCCAAC TGAAATATTG CCCGACTCGC
GAGTGGGAAA TTGACTTGGT GTTCTTTATC AACGGTCTAC CCGTTGCCAC CGCCGAGCTA
AAAACCGATT TTACCCAGTC GCTAGAATCG GCGATTGCGC AGTACAAAGA GGATCGTCTG
CCCGTTGACC CAAAAACCAA GCGTAAAGAG CCGCTGTTAA CCTTTAAACG CGGTGCAGTG
GTACATTTTG CCATGTCTGA TACAGAAATC GCCATGGCGA CCAAACTGGC GGGAGAAAAT
ACCTATTTTC TGCCCTTCAA TAAAGGCCAT GATGGTTCTA CCTCTGATTT GCCCGTACAC
ATCAACTTGC CGGGTGAAGA TGGAGAGTAT CCGGTCAGCT ATTTTTGGCA AGAAATTGCC
CAAAAAGATA ATTGGCTACG CATTTTCTAC AGCTTTGTGT ATGTCGAAAC CAAAGAGGTT
GCCGATAGCA ATGGCAAGGT TGAGCGTAAG GAAACACAGA TTTTTCCGCG TTATCATCAG
TTAGCCGCAG TCAACAAGAT GATTGATGAT GCCCGCCAGA ACGGCGCAGG CATGCAGTAT
CTTTGTGAGC ACAGCGCCGG TTCGGGAAAA ACCTCTACCA TAGCCTGGAC GGCCCATGAT
TTGATTCGTT TACGCAGCAC CGATGGTAAA GCGGTCTTTA AATCAGTGAT CATTGTGACC
GACCGCACCG TGCTGGATTC GCAGTTGCAA GATGCGGTGC AACAGCTGGA TCATCAATAT
GGGGTGATTA AGGCGATTGA TCGCGAAAAG AGCAGCGAGT CGAAATCAAA ACAGCTGACT
GAAGCGCTAT TGACTGGCAC GCCGATCATT GTAGTGACGA TTCAAACCTT CCCATATGCG
TTAGAAGCGA TTCTGACCAA TCAATCTTTG GCACAAAGTA ACTTTGCGGT GATCATCGAT
GAAGCACACA CCTCGCAAAC AGGCTCTACA GCCAAAGGGT TACGCGCAGC GCTTACTCTA
AACCTTAGCC CTCAAGAGTT GGAGCAAATG AGCATTGAGG ACATTCTTAC TAAGGTGCAA
GAAGCGCGCG CCATGCCCAA GAACGTTTCG CACTTTGCTT TTACGGCAAC ACCCAAGCAC
AGCACCAAGA TGTTGTTTGG CCGCCCGAAA GACCCTAACC AACCAGTGTC AGATGACAAT
CTTCCTGAAT CGTTCCACCT CTACACCCAA CGTCAGGCCA TTGATGAGGG TTTTATTCTT
GATGTGCTGG AGAATTACAC CCACTACGAT ACAGCCTACA AAATTGGTGA GAACAATCTG
GATGAGAAGC GCGTAGACAG CAAACAAGCT CGCCGCGCGT TGGCGCGCTG GATGTCACTG
CACCCAACCA CGGTCAGCCA AAAAGTCGAA TTTATTATTA AGCACTTCAA GGCCAATATC
GCTCACCTAC TGGAAGGTGA AGCCAAAGCC ATGGTGGTCA CATCGGGCAG GCCACAGGCG
GTGAAATACA AGCTTGCTTT TGATAAATAC ATTAAAAAGC ATGGTATTGA AGGTATCCAA
GCCTTGGTTG CCTTCTCGGG CAAAGTCCCC GGCAAAGATT TGGGTGATGA AGACAGCCAA
GACCCGCTAG AAATTGATTT CGATAAAGAG TACACCGAGT ACAACTTAAA CCCAGACACG
CACGGGGCTG ACCTGCGCCA TGAGTTTGAG AAAACCGAAT ATCGCGTCAT GTTAGTCGCC
AACAAGTTCC AAACGGGCTT TAATCAACCA AAACTGGTTG CCATGTATTT GGATAAGAAA
ATATCCGATG TAGAAGCGGT GCAAACGCTG TCTCGCCTAA ACCGCACCTA TCCAGGCAAA
GACACAACCT TCGTGATTGA TTTTGTTAAC GATCCACAAA CCATTCTCAA TGCGTTCAAA
AAGTACGACA AAGGCGCGCA GCTTAATGAA GTGCAAGATG TGAACGTCAT CTATGACATC
AAAGATATTT TGGATGAACA GAACATATAC AACCATCAAG ATTTAGAGTT GTTTAAGCAA
GCGCGTGGTA AATCGATTCT CGGCCAATCA CCTGACAAAA AGTCACACGC GCACAAGAAA
CTGCTAGCTG CCACACAACG CCCTACCGAT GTGTTCAACG TTAAGCTCAA AGAGCTGGTT
GATGCTGCAA ACCATTGGGA TCAGCAATAC AACAAAGCGC ACCTTGCGGG TGATGAAAAA
GCGGCGAACT ACGCAGAATC TCAGCGCAGT GATTTCACCA AGCAGCGCGA AGCACTAATG
CGTTTTAAGT CAGATTTAGC TCGATTTGTG AAGCACTATA GCTATATGGC GCAATTGATT
GAGTTTGGTG ATCCTGAACT GGAAAACTTT GCCGCGTTTG CTCATTTGCT ATCACGTCGA
TTGAAAGGGG TAACACCAGA AAACATCGAT CTCAGTGCAC TGGTACTAGA AAAATTCAAA
ATCAAATACG ACAAAGAGCC GCTTCCAGAA GCGTTTAATC AGGTACTTGA ACCTATCCGC
CCCAACTATA ACGACCCTGC GGATCGTGAA CAGGCATTTC TTGCCGATAT CATTCGCCGC
CTAAACGAAC TGTTTGGCGA TGTCGGCGAC GAACCCGGTC GCCGAAACTT TGCCAACGGT
ACTATTACGC GTGTCACCCA AAATCCAATT GTGGTTGAGC AGATAGAAAA GCACGACAAG
TCGATTGCCC TCAAAGGCGA CTTGCCGCAA GCGGTGAAGC AAGCCGTTGT GCAGGCTCTG
CTGAAAGAAG GTGATATAGC CCGAACGCTA CTCAAAGACC CACAAGTTAT GGCAAGTTAT
GTTGAGTTGA TTTTTGACAT GATGAAGCAA GGTGCAGAAC AGGCGGCGGT GAAATAA
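
The gene sequence is displayed in blocks of 10 bases, six blocks per line, so the spacing has to be removed before the sequence is measured or analysed. The sketch below, with hypothetical helper names, shows one way to do that in Python and to compute a GC percentage comparable to the GC content field. It is run here on the first displayed line only, so its result is not expected to equal the whole-gene value.

```python
def clean_sequence(raw: str) -> str:
    """Remove the spaces and line breaks used to group the sequence into blocks."""
    return "".join(raw.split()).upper()


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a nucleotide sequence."""
    if not seq:
        raise ValueError("empty sequence")
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


# First line of the gene sequence as displayed in this record.
first_line = "ATGGCTGCAC ATCATGAAGT GCATTTTGAA CAATATATTA CCGAGCAGTT GGTGAACAAT "
seq = clean_sequence(first_line)
print(len(seq))                    # 60
print(round(gc_percent(seq), 1))   # GC percentage of this line only
```

Applying the same two functions to the full gene sequence would give its total length and overall GC percentage.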
 
Protein sequence
MAAHHEVHFE QYITEQLVNN GWKEGHSCHY DQQRALYPQD VMDWVKATQP DAWHKLNKLH 
GVNAEQVLLE RLEKALSSHK AGSKTGGTIN VLRKGFTIAG CGTIAMGQRK PEDARNEERI
ARYEHNILRV VRQLKYCPTR EWEIDLVFFI NGLPVATAEL KTDFTQSLES AIAQYKEDRL
PVDPKTKRKE PLLTFKRGAV VHFAMSDTEI AMATKLAGEN TYFLPFNKGH DGSTSDLPVH
INLPGEDGEY PVSYFWQEIA QKDNWLRIFY SFVYVETKEV ADSNGKVERK ETQIFPRYHQ
LAAVNKMIDD ARQNGAGMQY LCEHSAGSGK TSTIAWTAHD LIRLRSTDGK AVFKSVIIVT
DRTVLDSQLQ DAVQQLDHQY GVIKAIDREK SSESKSKQLT EALLTGTPII VVTIQTFPYA
LEAILTNQSL AQSNFAVIID EAHTSQTGST AKGLRAALTL NLSPQELEQM SIEDILTKVQ
EARAMPKNVS HFAFTATPKH STKMLFGRPK DPNQPVSDDN LPESFHLYTQ RQAIDEGFIL
DVLENYTHYD TAYKIGENNL DEKRVDSKQA RRALARWMSL HPTTVSQKVE FIIKHFKANI
AHLLEGEAKA MVVTSGRPQA VKYKLAFDKY IKKHGIEGIQ ALVAFSGKVP GKDLGDEDSQ
DPLEIDFDKE YTEYNLNPDT HGADLRHEFE KTEYRVMLVA NKFQTGFNQP KLVAMYLDKK
ISDVEAVQTL SRLNRTYPGK DTTFVIDFVN DPQTILNAFK KYDKGAQLNE VQDVNVIYDI
KDILDEQNIY NHQDLELFKQ ARGKSILGQS PDKKSHAHKK LLAATQRPTD VFNVKLKELV
DAANHWDQQY NKAHLAGDEK AANYAESQRS DFTKQREALM RFKSDLARFV KHYSYMAQLI
EFGDPELENF AAFAHLLSRR LKGVTPENID LSALVLEKFK IKYDKEPLPE AFNQVLEPIR
PNYNDPADRE QAFLADIIRR LNELFGDVGD EPGRRNFANG TITRVTQNPI VVEQIEKHDK
SIALKGDLPQ AVKQAVVQAL LKEGDIARTL LKDPQVMASY VELIFDMMKQ GAEQAAVK
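
The protein sequence corresponds to a codon-by-codon translation of the gene sequence under translation table 11. The following Python sketch illustrates this under stated assumptions: it uses the amino-acid assignments of the standard genetic code, which table 11 shares (the two differ in which codons may act as start codons), and it does not handle alternative start codons, which is sufficient here because this gene begins with ATG. The names `CODON_TABLE` and `translate` are illustrative.

```python
from itertools import product

# Amino acids for the 64 codons, with bases ordered T, C, A, G at each position.
# "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a CDS codon by codon, stopping at the first stop codon."""
    cds = "".join(cds.split()).upper()  # drop block spacing and line breaks
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First three blocks of the gene sequence shown in this record.
print(translate("ATGGCTGCAC ATCATGAAGT GCATTTTGAA"))  # MAAHHEVHFE
```

The output matches the first block of the protein sequence. The final line of the gene sequence starts in frame, so it can be translated the same way to reproduce the last 18 residues.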