Gene EcSMS35_4876 details


Gene Information

Locus tag: EcSMS35_4876
Symbol:
ID: 6142902
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Escherichia coli SMS-3-5
Kingdom: Bacteria
Replicon accession: NC_010498
Strand:
Start bp: 4986675
End bp: 4991915
Gene Length: 5241 bp
Protein Length: 1746 aa
Translation table: 11
GC content: 49%
IMG OID: 641619680
Product: putative invasin
Protein accession: YP_001746787
Protein GI: 170684166
COG category:
COG ID:
TIGRFAM ID:
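
The coordinate and length fields in this record agree with one another, and a short check makes the arithmetic explicit. The sketch below assumes 1-based inclusive coordinates and a CDS that ends in a single stop codon; the function names are illustrative and not part of the source record.

```python
def gene_length(start_bp: int, end_bp: int) -> int:
    """Length in bp of a feature, assuming 1-based inclusive coordinates."""
    return end_bp - start_bp + 1


def protein_length(gene_len_bp: int) -> int:
    """Residue count for a CDS, assuming exactly one terminal stop codon."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_len_bp // 3 - 1


length = gene_length(4986675, 4991915)
print(length)                  # 5241
print(protein_length(length))  # 1746
```

Both values match the Gene Length and Protein Length fields listed in the record.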


Plasmid Coverage information

Num covering plasmid clones:
Plasmid unclonability p-value: 0.847065
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 47
Fosmid unclonability p-value:
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGGCTTCAA CAAATGCGCA TCAAATAAAA AATAATGATC AAAACTCTTT ATGTGGTCTG 
GGTGACAAAA TACGCCGCCT GACCGCCGGA GTTTGTTTAT TTACACAAAT TTTTTTCCCC
GTCATGGCGA CGGCACAAAA TGTGGTACAC GCTAAGCCTC AGACGACAGT TTCATCTCCG
TCCCCTCTAA TAGAAAATAA TACTGTGCCT TACACGCTTG GTGCGCTGGA ATCAGCGCAA
AGCGTTGCTG ATCGATTCGG TATTTCACTG GAGGAGCTTC GTCGTCTTAA TCAGTTCCGT
ACTTTTGCTC GCGGCTTTGA TAACGTGCGC CAGGGTGAAG AGCTGGATGT TCCGGCAACA
ACCTTGCAGA AAAGTCATGA GCAACAAAAT GCCGTACCGC CTGCGAATGG CGAAAACACG
CTGGAGAATC AAATAGCCAG CACTTCGCAG CGAGTTGGCC CCCTGCTTTC ACAAGATATG
AATAGTGAGC AGGCCAGCGG CATGGCGCGT GGTTGGGCGT CTTCAGAAGC CTCAGGCGCG
ATGACTGATT GGTTAAATAA CTTTGGCACT GCGAAAATCT CTCTGGGTGT GGATGAAGAT
TTCAGCCTGA AAAATTCGCA ATTCGACTTC CTGCATCCGT GGTATGACAC ACCCGATTAT
CTGCTCTTCA GCCAGCATAC CCTTCACCGA ACAGACGATC GTACCCAGAT CAACACCGGT
TTGGGCTGGC GTCATTTCAC CCCCAGCTGG ATGTCAGGCA TCAACCTGTT TTTTGACCAC
GACCTGAGCC GCTATCACTC CCGCGCGGGG CTTGGCGCAG AATACTGGCG TGATTATCTG
AAGTTGAGCA GCAACGCTTA TATCGGCCTG ACCGGCTGGC GTAGCGCACC GGAATTGGAT
AATGACTACG AAGCCCGCCC GGCCAACGGC TGGGATTTAC GCGCGGAAGG CTGGTTACCA
GCCTGGCCAC AACTGGGTGG AAAACTGGTC TATGAACAAT ACTATGGTGA TGAAGTGGCG
CTGTTTGACA AGAATGATCG CCAAAGTAAT CCCCATGCTA TTACCGCGGG CCTCAACTAT
ACCCCCTTCC CGCTTCTGAC TCTCAGTGCG GAACAGCGTC AGGGGAAACA AGGTGAAAAT
GACACACGTT TTGCCGTTGA TCTGACCTGG CAACCCAGCA GTTCAATGCA GAAACAGCTT
AACCCGGACG AAGTGGCCGG ACGGCGCAGT CTGGCCGGTA GTCGTTATGA CCTGATTGAT
CGCAACAACA ACATCGTTCT GGAATATCGC AAGAAAGAGC TGATTCACCT GAGTCTGCAG
GATCCGGTGA AAGGAAAGTC TGGAGAAATA AAACCGCTGG TTTCCTCGAT ACAGACCAAA
TATGCCCTGA AAGGCTATAA CATCGAAGCC GCTGCGCTGG AAGCTGCCGG AGGTAAAGTC
AGGACGTCTG GAAAAGATAT CACGGTCACG CTGCCAGGTT ACCGCTTCAC TAACACTCCA
GAAACCGATA ATACATGGTC GATAGACGTT ACCGCCGAGG ATGTAAAAGG TAACCTGTCA
CGTCATGAAC AAAGCATGGT CGTCATTCAG GCTCCGACAT TAAGCCAGAA AGATTCTCTG
TTATCCGTTA ATCCGCTAAC CGTAGCTGCA GATAAAAAAT CGACGACCAC ATTGACCGTT
ACTGCACACG ATTCCGACGG AACTCCGGTG CCGGGGCTGG CGCTGCAAAC CCGCAGTGAA
GGCGTTCAGG ATATCACCCT GTCTGACTGG ACAGATAACG GAGATGGTAG TTACACACAG
ATGCTGACCG CCGGAACGAC ATCAGGTTCA GTAACACTGA CGCCGCAAAT TAACGGTGAG
AGTGCGGTAA AAGAATCCAT CGTCGTTAAT ATCGTCCCTG TTGTCTCATC CCGCGACCAT
TCATCAATAA CAATAGATAA CGTATCGTAT TATGCCGGAG ACGACATCAA GGTTAGGGTG
GAACTGAAAG ACGACAGCAA TCAACCGGTT GCTTATCAAA AAGAAGAATT GGTAAAAGCC
GTTACTGTCG AAAACAGCAA ACCTGGCGCC ACGATTGTCT GGCACGAAGA GCAGCCGGGC
GTTTATGCCG CGAATTATCC GGCCCATAAG CAAGGGACAG CACTAAGGGC ACAACTTAGC
CTTCACAACT GGAATGCTCC ACTGCAATCG CATATTTATA ACATTGAGGC AAACCAGAAT
AAGGCTCGCG TCGCCACATT ATCAGCGACA AATAATGACG TTTACGCCGA TAAAAAGACA
TTTAATACCC TCACGATCAA CGTCACTGAT GAGAGTGATA ATCCCCTGAC AAATCATCAG
GTCACCTTTA AGAATGAAAA AGGAAGCGCT GAATTTGTCG AACCGCCGCA GCAAAATACG
GATGCATATG GTGTTGCCAC AATCAACATG GTAAGTCAGG TTGCGGAAGA AAATACGATT
AGTGCCACGC TGCCAAATGG TTTCTCACAA CGGATAATTG CGAAATTCGT TAGCGATTCG
AGTACGCCAA AATTCAAACA ACTGGTTGCC GATCCAGATA CCATTATTGC TGGCAACAGC
CAGGGCAGTA CTCTGACCGC CACCGTCACA GACTTTCATA ACAACCCGTT AAAAGATATG
AAAGTGAATT TTGTGGCACC TGGTGGCTCG CAACTGGACA ACACGACCGC CACAACAGAC
CAGTCCGGTA TTGTGCGAGT GCACCTGACC AGTTCAAAAG CTGGTAGCTA TTCCGTCGAT
GCCTCGCTTG AGGCGGATAA AAATATTCAC CAGTCGGTCA CGATCACCGT GGTCCCAAAC
AGGGAACAAT CGGTAATGAC CTTAAATGCC GGGTCGGGCA GTGCGATCGC TAACAATACA
AATACCGTTA TCCTGACAGC CAGTGTGAAA GATGTTTATG GACACCCGTT GCCGGATGAG
GATGTGAAAT TTACCTTGCC AGCCTCCATG ACCGGGAACT TCACGCTAAG TAGTGAGACC
GCCCGCACCG ATGCAAACGG TGATGCCGTG GTTACATTGC GAGGCACAAA AGCGGGTGAG
TTTACAGTTA CGGCGACGCT GACCAGAAAT AACACCGTTG CTCATCAGCA AGTCACTTTT
ATTGGGGATA CAAACAGTGC GCAGCTCCAG CCGCTGACTG CCTCATTAAA TACCATTGTT
GCGGGTGACA GTACGGGGAG TACCCTGACG GCAACGATCC TGGACGCTTA CCAAAATCCG
CTTAAAGACC AGCTGGTCAC TTTCCAGAGT AACGATGTCA CTCTAAGTGG AACAGAAGTC
ACCACCAATA CGCTGGGTCA GGCGACGGTA ACAATGACCA GCAATATTGC CGGGCAACAT
AACGTCGTGG TGAGCCGGAA AGCGCAAGTC TCCGATAATA AAACGTTTAG TTTATCAGTG
CTGCCGGATG AAAGTTCGGC GAAGGTAATA AGTATAACCG GAGCCGAAAA AACGATAACG
GTTGGCGAAA ACATCACGCT ACGGATACTC GTCCAGGACG CGTTTAACAA TGTAATATCA
GGTCAACGCG TCAGATTAAG TGCGCAGCCA ACAGCTAACA TTACGATAGG CGATACGGCT
TACACCGATA ATAACGGTTA TGCATACGTT AACCTTATTA GCACCCAGCC TGGGGTTTAT
CAGGTGACGG CAACGCTGGA CAATAACAGT AGTAGTAAGG TTGACGTGAA TGTGGCAAAT
GGCAAGCTCG AGTTAACATC ATCGAAGCCA GAAACCACGG TCCATAACAG TGAAGGTATT
ACGCTTACCG CAACGGCGAG AAATGCGCGG GATGAATTGA TGCCTGGGCA AATTATCACC
TTTAGCGTAA CGCCTGAAGG TGCAACGCTA AGCAATACAG GGGAAATCCT TACTGACCAG
TCTGGTCAGG CCAAAGTGAC GCTGACCAGT AACAAAGTGA ATGTCTATAC CGTTACGGCC
ACAATGGGCA AAGATGTTCC CGTTCAGAGC CAGGTAACGG TTGCGGTTAA GGCAGATGCT
AAAACGGCAC ATGTTGTGAG CGTCGTGGCT TCTCCTGACA CCATAACCGC CGACGGCGTT
GATAGCAGCA CCATCACGTC ACGAGTAGAA GATGATTACG GATTCCCGGT TGAAGGTGTC
GATGTTAGTT ATGCCTTAGA CACCAAAGGC AGCCCGGTAG TTAATATTCC AACTACGCGT
ACCGATCAGT CCGGGCAAGT CACGGCGACA ATAACCAGTA CATTGGCAGA AACCTTAATA
GTCAATGTGC AAGTTCCTGG TACAGCCAAC CAATCCGCAA CCATTACATT GGTTGCCGGC
ACGGCCGATG AAAGTAAGTC AATTTTGAAA TCCGACGTTG ACACTCTGAA GGCTGACTAC
CAGCAGAGCG CAAAACTTAC GCTAACATTG CAAGACAAGT ACGGTAACCC GATAGTGACG
TCTGATCATC TGGAATTTGT CCAGTCAGGC CCCTTCGTGA ACTTTCTCAA GTTGAGCGAT
ATTGATTACA GCCAAAGAAA TTATGGCGAG TACACCGTGA CTGTCACTGG CGGAAAAGAG
GGAACAGCGA CTCTCATTCC CATGCTGAAC GGGGTTCATC AGGCAAACTT AAGCGTATCG
CTGAATCTCA TCCGCTCGAT AAAAGAAATG TCCGGTCATG TCACTGCAAA CAACCATACC
TTCTCCACGG CTAAATTCCC GAGCGAAGGT TTTGCAGGAG CGTATTACAC ACTCAACAAT
GATAACTTTG AAGCGGGTAA AACCGTTGAT GATTATATGT TTTCAAGTTC ACAGAGTTGG
GTGTCGGTCG ATGCGTCAGG TAAGGTTTCT TTCGCAAATA TCGGCGATCA AACGTCAGTC
ACAATAAGCG CTGTTCCCCG ACAAGGAGGT ACAACCTACC AGACCTTAAT TAAGCTGAAA
GGCTGGTGGG TGAATAATGG AAATCATACC AATATCTGGC TAGCTGCCAA TGCGCTCTGT
CATGCTAAAA ATGATGGATA TACTCTTCCT GCCATCACAC ATTTGACGTC TGGAGAAAAC
CAACGCACAC AGGGATCGCT ATATGGTGAA TGGGGGAACG TTGGAGCGTT TTCCAGTAAT
TCGCAATTTA CACAAGGTGC TTACTGGACA AGTGAATCTG ATGATTACAA TCGGCACTAC
TATGTGCAGA TGCTAACGGG TATGACCGGA AGCGACGCTG ATTCCAGCCC CCAACTGACC
GCCTGCCGTA AATCACTTTA A
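
The gene sequence is printed in space-separated blocks of 10 bases, so it needs cleaning before any computation. The following is a minimal sketch of that step and of a GC-fraction calculation like the one behind the GC content field; the helper names are hypothetical, and only the first line of the sequence is used as sample input.

```python
def clean_sequence(text: str) -> str:
    """Remove all whitespace (block spaces, newlines) and upper-case the bases."""
    return "".join(text.split()).upper()


def gc_fraction(seq: str) -> float:
    """Fraction of bases in seq that are G or C."""
    if not seq:
        raise ValueError("empty sequence")
    gc_count = sum(1 for base in seq if base in "GC")
    return gc_count / len(seq)


# First line of the gene sequence as printed in this record.
first_line = "ATGGCTTCAA CAAATGCGCA TCAAATAAAA AATAATGATC AAAACTCTTT ATGTGGTCTG"
seq = clean_sequence(first_line)
print(len(seq))  # 60
print(round(gc_fraction(seq), 2))
```

Running the same two functions over the full gene sequence gives a value that can be compared with the reported GC content.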
 
Protein sequence
MASTNAHQIK NNDQNSLCGL GDKIRRLTAG VCLFTQIFFP VMATAQNVVH AKPQTTVSSP 
SPLIENNTVP YTLGALESAQ SVADRFGISL EELRRLNQFR TFARGFDNVR QGEELDVPAT
TLQKSHEQQN AVPPANGENT LENQIASTSQ RVGPLLSQDM NSEQASGMAR GWASSEASGA
MTDWLNNFGT AKISLGVDED FSLKNSQFDF LHPWYDTPDY LLFSQHTLHR TDDRTQINTG
LGWRHFTPSW MSGINLFFDH DLSRYHSRAG LGAEYWRDYL KLSSNAYIGL TGWRSAPELD
NDYEARPANG WDLRAEGWLP AWPQLGGKLV YEQYYGDEVA LFDKNDRQSN PHAITAGLNY
TPFPLLTLSA EQRQGKQGEN DTRFAVDLTW QPSSSMQKQL NPDEVAGRRS LAGSRYDLID
RNNNIVLEYR KKELIHLSLQ DPVKGKSGEI KPLVSSIQTK YALKGYNIEA AALEAAGGKV
RTSGKDITVT LPGYRFTNTP ETDNTWSIDV TAEDVKGNLS RHEQSMVVIQ APTLSQKDSL
LSVNPLTVAA DKKSTTTLTV TAHDSDGTPV PGLALQTRSE GVQDITLSDW TDNGDGSYTQ
MLTAGTTSGS VTLTPQINGE SAVKESIVVN IVPVVSSRDH SSITIDNVSY YAGDDIKVRV
ELKDDSNQPV AYQKEELVKA VTVENSKPGA TIVWHEEQPG VYAANYPAHK QGTALRAQLS
LHNWNAPLQS HIYNIEANQN KARVATLSAT NNDVYADKKT FNTLTINVTD ESDNPLTNHQ
VTFKNEKGSA EFVEPPQQNT DAYGVATINM VSQVAEENTI SATLPNGFSQ RIIAKFVSDS
STPKFKQLVA DPDTIIAGNS QGSTLTATVT DFHNNPLKDM KVNFVAPGGS QLDNTTATTD
QSGIVRVHLT SSKAGSYSVD ASLEADKNIH QSVTITVVPN REQSVMTLNA GSGSAIANNT
NTVILTASVK DVYGHPLPDE DVKFTLPASM TGNFTLSSET ARTDANGDAV VTLRGTKAGE
FTVTATLTRN NTVAHQQVTF IGDTNSAQLQ PLTASLNTIV AGDSTGSTLT ATILDAYQNP
LKDQLVTFQS NDVTLSGTEV TTNTLGQATV TMTSNIAGQH NVVVSRKAQV SDNKTFSLSV
LPDESSAKVI SITGAEKTIT VGENITLRIL VQDAFNNVIS GQRVRLSAQP TANITIGDTA
YTDNNGYAYV NLISTQPGVY QVTATLDNNS SSKVDVNVAN GKLELTSSKP ETTVHNSEGI
TLTATARNAR DELMPGQIIT FSVTPEGATL SNTGEILTDQ SGQAKVTLTS NKVNVYTVTA
TMGKDVPVQS QVTVAVKADA KTAHVVSVVA SPDTITADGV DSSTITSRVE DDYGFPVEGV
DVSYALDTKG SPVVNIPTTR TDQSGQVTAT ITSTLAETLI VNVQVPGTAN QSATITLVAG
TADESKSILK SDVDTLKADY QQSAKLTLTL QDKYGNPIVT SDHLEFVQSG PFVNFLKLSD
IDYSQRNYGE YTVTVTGGKE GTATLIPMLN GVHQANLSVS LNLIRSIKEM SGHVTANNHT
FSTAKFPSEG FAGAYYTLNN DNFEAGKTVD DYMFSSSQSW VSVDASGKVS FANIGDQTSV
TISAVPRQGG TTYQTLIKLK GWWVNNGNHT NIWLAANALC HAKNDGYTLP AITHLTSGEN
QRTQGSLYGE WGNVGAFSSN SQFTQGAYWT SESDDYNRHY YVQMLTGMTG SDADSSPQLT
ACRKSL
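
The protein sequence can be related back to the gene sequence by translating codons. The sketch below builds a codon table from the standard amino acid assignments, which translation table 11 shares with the standard code for sense and stop codons. Table 11's alternative start codons are not handled here, which is a simplifying assumption; this gene begins with ATG. The `translate` helper is illustrative, not a tool used by the source database.

```python
from itertools import product

# Codons enumerated in T, C, A, G order, paired with the standard
# amino acid assignments ('*' marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding sequence, stopping at the first stop codon."""
    seq = "".join(cds.split()).upper()
    residues = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        residues.append(aa)
    return "".join(residues)


# First 21 bases and last 21 bases of the gene sequence in this record.
print(translate("ATGGCTTCAA CAAATGCGCA T"))  # MASTNAH
print(translate("GCCTGCCGTA AATCACTTTA A"))  # ACRKSL
```

The two outputs match the first seven and last six residues of the protein sequence listed above.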