Gene Information |
Locus tag | EcSMS35_4876 |
Symbol | |
ID | 6142902 |
Type | CDS |
Is gene spliced | No |
Is pseudo gene | No |
Organism name | Escherichia coli SMS-3-5 |
Kingdom | Bacteria |
Replicon accession | NC_010498 |
Strand | - |
Start bp | 4986675 |
End bp | 4991915 |
Gene Length | 5241 bp |
Protein Length | 1746 aa |
Translation table | 11 |
GC content | 49% |
IMG OID | 641619680 |
Product | putative invasin |
Protein accession | YP_001746787 |
Protein GI | 170684166 |
COG category | |
COG ID | |
TIGRFAM ID | |
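The coordinate and length fields above can be cross-checked with simple arithmetic. The sketch below assumes the start and end positions are 1-based and inclusive, and that the final codon of the CDS is a stop codon that is not counted in the protein length. Neither convention is stated in the record, but both agree with its numbers. The function names are illustrative, not part of any database API.

```python
# Values copied from the Gene Information fields of this record.
START_BP = 4986675
END_BP = 4991915
GENE_LENGTH_BP = 5241
PROTEIN_LENGTH_AA = 1746


def span_length(start: int, end: int) -> int:
    """Length of a closed interval [start, end] (assumed 1-based, inclusive)."""
    return end - start + 1


def expected_protein_length(cds_length_bp: int) -> int:
    """Residue count for an unspliced CDS, assuming the last codon is a stop."""
    if cds_length_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return cds_length_bp // 3 - 1


if __name__ == "__main__":
    print("span length (bp):", span_length(START_BP, END_BP))
    print("expected protein length (aa):", expected_protein_length(GENE_LENGTH_BP))
```

Under these assumptions the span length equals the listed gene length, and the derived residue count equals the listed protein length.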
Plasmid Coverage information |
Num covering plasmid clones | 8 |
Plasmid unclonability p-value | 0.847065 |
Plasmid hitchhiking | No |
Plasmid clonability | normal |
| |
Fosmid Coverage information |
Num covering fosmid clones | 47 |
Fosmid unclonability p-value | 1 |
Fosmid hitchhiking | No |
Fosmid clonability | normal |
| |
Sequence |
Gene sequence | ATGGCTTCAA CAAATGCGCA TCAAATAAAA AATAATGATC AAAACTCTTT ATGTGGTCTG GGTGACAAAA TACGCCGCCT GACCGCCGGA GTTTGTTTAT TTACACAAAT TTTTTTCCCC GTCATGGCGA CGGCACAAAA TGTGGTACAC GCTAAGCCTC AGACGACAGT TTCATCTCCG TCCCCTCTAA TAGAAAATAA TACTGTGCCT TACACGCTTG GTGCGCTGGA ATCAGCGCAA AGCGTTGCTG ATCGATTCGG TATTTCACTG GAGGAGCTTC GTCGTCTTAA TCAGTTCCGT ACTTTTGCTC GCGGCTTTGA TAACGTGCGC CAGGGTGAAG AGCTGGATGT TCCGGCAACA ACCTTGCAGA AAAGTCATGA GCAACAAAAT GCCGTACCGC CTGCGAATGG CGAAAACACG CTGGAGAATC AAATAGCCAG CACTTCGCAG CGAGTTGGCC CCCTGCTTTC ACAAGATATG AATAGTGAGC AGGCCAGCGG CATGGCGCGT GGTTGGGCGT CTTCAGAAGC CTCAGGCGCG ATGACTGATT GGTTAAATAA CTTTGGCACT GCGAAAATCT CTCTGGGTGT GGATGAAGAT TTCAGCCTGA AAAATTCGCA ATTCGACTTC CTGCATCCGT GGTATGACAC ACCCGATTAT CTGCTCTTCA GCCAGCATAC CCTTCACCGA ACAGACGATC GTACCCAGAT CAACACCGGT TTGGGCTGGC GTCATTTCAC CCCCAGCTGG ATGTCAGGCA TCAACCTGTT TTTTGACCAC GACCTGAGCC GCTATCACTC CCGCGCGGGG CTTGGCGCAG AATACTGGCG TGATTATCTG AAGTTGAGCA GCAACGCTTA TATCGGCCTG ACCGGCTGGC GTAGCGCACC GGAATTGGAT AATGACTACG AAGCCCGCCC GGCCAACGGC TGGGATTTAC GCGCGGAAGG CTGGTTACCA GCCTGGCCAC AACTGGGTGG AAAACTGGTC TATGAACAAT ACTATGGTGA TGAAGTGGCG CTGTTTGACA AGAATGATCG CCAAAGTAAT CCCCATGCTA TTACCGCGGG CCTCAACTAT ACCCCCTTCC CGCTTCTGAC TCTCAGTGCG GAACAGCGTC AGGGGAAACA AGGTGAAAAT GACACACGTT TTGCCGTTGA TCTGACCTGG CAACCCAGCA GTTCAATGCA GAAACAGCTT AACCCGGACG AAGTGGCCGG ACGGCGCAGT CTGGCCGGTA GTCGTTATGA CCTGATTGAT CGCAACAACA ACATCGTTCT GGAATATCGC AAGAAAGAGC TGATTCACCT GAGTCTGCAG GATCCGGTGA AAGGAAAGTC TGGAGAAATA AAACCGCTGG TTTCCTCGAT ACAGACCAAA TATGCCCTGA AAGGCTATAA CATCGAAGCC GCTGCGCTGG AAGCTGCCGG AGGTAAAGTC AGGACGTCTG GAAAAGATAT CACGGTCACG CTGCCAGGTT ACCGCTTCAC TAACACTCCA GAAACCGATA ATACATGGTC GATAGACGTT ACCGCCGAGG ATGTAAAAGG TAACCTGTCA CGTCATGAAC AAAGCATGGT CGTCATTCAG GCTCCGACAT TAAGCCAGAA AGATTCTCTG TTATCCGTTA ATCCGCTAAC CGTAGCTGCA GATAAAAAAT CGACGACCAC ATTGACCGTT ACTGCACACG ATTCCGACGG AACTCCGGTG CCGGGGCTGG CGCTGCAAAC CCGCAGTGAA GGCGTTCAGG ATATCACCCT GTCTGACTGG ACAGATAACG GAGATGGTAG TTACACACAG 
ATGCTGACCG CCGGAACGAC ATCAGGTTCA GTAACACTGA CGCCGCAAAT TAACGGTGAG AGTGCGGTAA AAGAATCCAT CGTCGTTAAT ATCGTCCCTG TTGTCTCATC CCGCGACCAT TCATCAATAA CAATAGATAA CGTATCGTAT TATGCCGGAG ACGACATCAA GGTTAGGGTG GAACTGAAAG ACGACAGCAA TCAACCGGTT GCTTATCAAA AAGAAGAATT GGTAAAAGCC GTTACTGTCG AAAACAGCAA ACCTGGCGCC ACGATTGTCT GGCACGAAGA GCAGCCGGGC GTTTATGCCG CGAATTATCC GGCCCATAAG CAAGGGACAG CACTAAGGGC ACAACTTAGC CTTCACAACT GGAATGCTCC ACTGCAATCG CATATTTATA ACATTGAGGC AAACCAGAAT AAGGCTCGCG TCGCCACATT ATCAGCGACA AATAATGACG TTTACGCCGA TAAAAAGACA TTTAATACCC TCACGATCAA CGTCACTGAT GAGAGTGATA ATCCCCTGAC AAATCATCAG GTCACCTTTA AGAATGAAAA AGGAAGCGCT GAATTTGTCG AACCGCCGCA GCAAAATACG GATGCATATG GTGTTGCCAC AATCAACATG GTAAGTCAGG TTGCGGAAGA AAATACGATT AGTGCCACGC TGCCAAATGG TTTCTCACAA CGGATAATTG CGAAATTCGT TAGCGATTCG AGTACGCCAA AATTCAAACA ACTGGTTGCC GATCCAGATA CCATTATTGC TGGCAACAGC CAGGGCAGTA CTCTGACCGC CACCGTCACA GACTTTCATA ACAACCCGTT AAAAGATATG AAAGTGAATT TTGTGGCACC TGGTGGCTCG CAACTGGACA ACACGACCGC CACAACAGAC CAGTCCGGTA TTGTGCGAGT GCACCTGACC AGTTCAAAAG CTGGTAGCTA TTCCGTCGAT GCCTCGCTTG AGGCGGATAA AAATATTCAC CAGTCGGTCA CGATCACCGT GGTCCCAAAC AGGGAACAAT CGGTAATGAC CTTAAATGCC GGGTCGGGCA GTGCGATCGC TAACAATACA AATACCGTTA TCCTGACAGC CAGTGTGAAA GATGTTTATG GACACCCGTT GCCGGATGAG GATGTGAAAT TTACCTTGCC AGCCTCCATG ACCGGGAACT TCACGCTAAG TAGTGAGACC GCCCGCACCG ATGCAAACGG TGATGCCGTG GTTACATTGC GAGGCACAAA AGCGGGTGAG TTTACAGTTA CGGCGACGCT GACCAGAAAT AACACCGTTG CTCATCAGCA AGTCACTTTT ATTGGGGATA CAAACAGTGC GCAGCTCCAG CCGCTGACTG CCTCATTAAA TACCATTGTT GCGGGTGACA GTACGGGGAG TACCCTGACG GCAACGATCC TGGACGCTTA CCAAAATCCG CTTAAAGACC AGCTGGTCAC TTTCCAGAGT AACGATGTCA CTCTAAGTGG AACAGAAGTC ACCACCAATA CGCTGGGTCA GGCGACGGTA ACAATGACCA GCAATATTGC CGGGCAACAT AACGTCGTGG TGAGCCGGAA AGCGCAAGTC TCCGATAATA AAACGTTTAG TTTATCAGTG CTGCCGGATG AAAGTTCGGC GAAGGTAATA AGTATAACCG GAGCCGAAAA AACGATAACG GTTGGCGAAA ACATCACGCT ACGGATACTC GTCCAGGACG CGTTTAACAA TGTAATATCA GGTCAACGCG TCAGATTAAG TGCGCAGCCA ACAGCTAACA TTACGATAGG CGATACGGCT TACACCGATA 
ATAACGGTTA TGCATACGTT AACCTTATTA GCACCCAGCC TGGGGTTTAT CAGGTGACGG CAACGCTGGA CAATAACAGT AGTAGTAAGG TTGACGTGAA TGTGGCAAAT GGCAAGCTCG AGTTAACATC ATCGAAGCCA GAAACCACGG TCCATAACAG TGAAGGTATT ACGCTTACCG CAACGGCGAG AAATGCGCGG GATGAATTGA TGCCTGGGCA AATTATCACC TTTAGCGTAA CGCCTGAAGG TGCAACGCTA AGCAATACAG GGGAAATCCT TACTGACCAG TCTGGTCAGG CCAAAGTGAC GCTGACCAGT AACAAAGTGA ATGTCTATAC CGTTACGGCC ACAATGGGCA AAGATGTTCC CGTTCAGAGC CAGGTAACGG TTGCGGTTAA GGCAGATGCT AAAACGGCAC ATGTTGTGAG CGTCGTGGCT TCTCCTGACA CCATAACCGC CGACGGCGTT GATAGCAGCA CCATCACGTC ACGAGTAGAA GATGATTACG GATTCCCGGT TGAAGGTGTC GATGTTAGTT ATGCCTTAGA CACCAAAGGC AGCCCGGTAG TTAATATTCC AACTACGCGT ACCGATCAGT CCGGGCAAGT CACGGCGACA ATAACCAGTA CATTGGCAGA AACCTTAATA GTCAATGTGC AAGTTCCTGG TACAGCCAAC CAATCCGCAA CCATTACATT GGTTGCCGGC ACGGCCGATG AAAGTAAGTC AATTTTGAAA TCCGACGTTG ACACTCTGAA GGCTGACTAC CAGCAGAGCG CAAAACTTAC GCTAACATTG CAAGACAAGT ACGGTAACCC GATAGTGACG TCTGATCATC TGGAATTTGT CCAGTCAGGC CCCTTCGTGA ACTTTCTCAA GTTGAGCGAT ATTGATTACA GCCAAAGAAA TTATGGCGAG TACACCGTGA CTGTCACTGG CGGAAAAGAG GGAACAGCGA CTCTCATTCC CATGCTGAAC GGGGTTCATC AGGCAAACTT AAGCGTATCG CTGAATCTCA TCCGCTCGAT AAAAGAAATG TCCGGTCATG TCACTGCAAA CAACCATACC TTCTCCACGG CTAAATTCCC GAGCGAAGGT TTTGCAGGAG CGTATTACAC ACTCAACAAT GATAACTTTG AAGCGGGTAA AACCGTTGAT GATTATATGT TTTCAAGTTC ACAGAGTTGG GTGTCGGTCG ATGCGTCAGG TAAGGTTTCT TTCGCAAATA TCGGCGATCA AACGTCAGTC ACAATAAGCG CTGTTCCCCG ACAAGGAGGT ACAACCTACC AGACCTTAAT TAAGCTGAAA GGCTGGTGGG TGAATAATGG AAATCATACC AATATCTGGC TAGCTGCCAA TGCGCTCTGT CATGCTAAAA ATGATGGATA TACTCTTCCT GCCATCACAC ATTTGACGTC TGGAGAAAAC CAACGCACAC AGGGATCGCT ATATGGTGAA TGGGGGAACG TTGGAGCGTT TTCCAGTAAT TCGCAATTTA CACAAGGTGC TTACTGGACA AGTGAATCTG ATGATTACAA TCGGCACTAC TATGTGCAGA TGCTAACGGG TATGACCGGA AGCGACGCTG ATTCCAGCCC CCAACTGACC GCCTGCCGTA AATCACTTTA A
|
Protein sequence | MASTNAHQIK NNDQNSLCGL GDKIRRLTAG VCLFTQIFFP VMATAQNVVH AKPQTTVSSP SPLIENNTVP YTLGALESAQ SVADRFGISL EELRRLNQFR TFARGFDNVR QGEELDVPAT TLQKSHEQQN AVPPANGENT LENQIASTSQ RVGPLLSQDM NSEQASGMAR GWASSEASGA MTDWLNNFGT AKISLGVDED FSLKNSQFDF LHPWYDTPDY LLFSQHTLHR TDDRTQINTG LGWRHFTPSW MSGINLFFDH DLSRYHSRAG LGAEYWRDYL KLSSNAYIGL TGWRSAPELD NDYEARPANG WDLRAEGWLP AWPQLGGKLV YEQYYGDEVA LFDKNDRQSN PHAITAGLNY TPFPLLTLSA EQRQGKQGEN DTRFAVDLTW QPSSSMQKQL NPDEVAGRRS LAGSRYDLID RNNNIVLEYR KKELIHLSLQ DPVKGKSGEI KPLVSSIQTK YALKGYNIEA AALEAAGGKV RTSGKDITVT LPGYRFTNTP ETDNTWSIDV TAEDVKGNLS RHEQSMVVIQ APTLSQKDSL LSVNPLTVAA DKKSTTTLTV TAHDSDGTPV PGLALQTRSE GVQDITLSDW TDNGDGSYTQ MLTAGTTSGS VTLTPQINGE SAVKESIVVN IVPVVSSRDH SSITIDNVSY YAGDDIKVRV ELKDDSNQPV AYQKEELVKA VTVENSKPGA TIVWHEEQPG VYAANYPAHK QGTALRAQLS LHNWNAPLQS HIYNIEANQN KARVATLSAT NNDVYADKKT FNTLTINVTD ESDNPLTNHQ VTFKNEKGSA EFVEPPQQNT DAYGVATINM VSQVAEENTI SATLPNGFSQ RIIAKFVSDS STPKFKQLVA DPDTIIAGNS QGSTLTATVT DFHNNPLKDM KVNFVAPGGS QLDNTTATTD QSGIVRVHLT SSKAGSYSVD ASLEADKNIH QSVTITVVPN REQSVMTLNA GSGSAIANNT NTVILTASVK DVYGHPLPDE DVKFTLPASM TGNFTLSSET ARTDANGDAV VTLRGTKAGE FTVTATLTRN NTVAHQQVTF IGDTNSAQLQ PLTASLNTIV AGDSTGSTLT ATILDAYQNP LKDQLVTFQS NDVTLSGTEV TTNTLGQATV TMTSNIAGQH NVVVSRKAQV SDNKTFSLSV LPDESSAKVI SITGAEKTIT VGENITLRIL VQDAFNNVIS GQRVRLSAQP TANITIGDTA YTDNNGYAYV NLISTQPGVY QVTATLDNNS SSKVDVNVAN GKLELTSSKP ETTVHNSEGI TLTATARNAR DELMPGQIIT FSVTPEGATL SNTGEILTDQ SGQAKVTLTS NKVNVYTVTA TMGKDVPVQS QVTVAVKADA KTAHVVSVVA SPDTITADGV DSSTITSRVE DDYGFPVEGV DVSYALDTKG SPVVNIPTTR TDQSGQVTAT ITSTLAETLI VNVQVPGTAN QSATITLVAG TADESKSILK SDVDTLKADY QQSAKLTLTL QDKYGNPIVT SDHLEFVQSG PFVNFLKLSD IDYSQRNYGE YTVTVTGGKE GTATLIPMLN GVHQANLSVS LNLIRSIKEM SGHVTANNHT FSTAKFPSEG FAGAYYTLNN DNFEAGKTVD DYMFSSSQSW VSVDASGKVS FANIGDQTSV TISAVPRQGG TTYQTLIKLK GWWVNNGNHT NIWLAANALC HAKNDGYTLP AITHLTSGEN QRTQGSLYGE WGNVGAFSSN SQFTQGAYWT SESDDYNRHY YVQMLTGMTG SDADSSPQLT ACRKSL
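The gene and protein sequences above can be related with a short translation sketch. It is a minimal illustration, not the pipeline that produced this record. It relies on the fact that translation table 11 assigns the same amino acids to codons as the standard genetic code (the two differ only in which codons may act as start codons). The helper names `clean`, `translate`, and `gc_fraction` are hypothetical. The sequence is assumed to be given in coding orientation, as it is here since it begins with ATG, so no reverse complement is taken despite the minus strand.

```python
# Codon table in NCBI order (bases T, C, A, G). Table 11 uses the same
# amino-acid assignments as the standard code.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def clean(seq: str) -> str:
    """Remove the spaces and line breaks used to group the sequence in blocks of 10."""
    return "".join(seq.split()).upper()


def translate(seq: str) -> str:
    """Translate a coding-orientation DNA string, stopping at the first stop codon."""
    dna = clean(seq)
    residues = []
    for pos in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[pos:pos + 3]]
        if aa == "*":
            break
        residues.append(aa)
    return "".join(residues)


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    dna = clean(seq)
    return (dna.count("G") + dna.count("C")) / len(dna)


if __name__ == "__main__":
    # First 30 bases of the gene sequence listed above.
    print(translate("ATGGCTTCAA CAAATGCGCA TCAAATAAAA"))
```

Translating the first three blocks of the gene sequence gives the first ten residues of the listed protein, and the final four codons (AAA TCA CTT TAA) give the C-terminal residues KSL followed by a stop. Applying `gc_fraction` to the full gene sequence would be the way to compare against the listed GC content; that value is not recomputed here.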
|