Gene EcSMS35_0331 details

Gene Information

Locus tag: EcSMS35_0331
Symbol:
ID: 6146291
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Escherichia coli SMS-3-5
Kingdom: Bacteria
Replicon accession: NC_010498
Strand:
Start bp: 337036
End bp: 341289
Gene Length: 4254 bp
Protein Length: 1417 aa
Translation table: 11
GC content: 52%
IMG OID: 641615227
Product: putative invasin
Protein accession: YP_001742435
Protein GI: 170680117
COG category:
COG ID:
TIGRFAM ID:
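
The coordinate and length fields above are related by simple arithmetic. The sketch below shows one way to cross-check them. It assumes the start and end positions are 1-based and inclusive, and that the gene ends in a single stop codon that is not counted in the protein length. Both assumptions are consistent with the values in this record but are not stated by it. The function names are illustrative only.

```python
def gene_length_bp(start_bp: int, end_bp: int) -> int:
    """Length of a feature given 1-based, inclusive coordinates (assumed)."""
    if end_bp < start_bp:
        raise ValueError("end_bp must not be smaller than start_bp")
    return end_bp - start_bp + 1


def protein_length_aa(gene_len_bp: int) -> int:
    """Protein length for an unspliced CDS, assuming one trailing stop codon."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_len_bp // 3 - 1  # drop the stop codon


# Values taken from the record above.
length = gene_length_bp(337036, 341289)
print(length)                     # 4254
print(protein_length_aa(length))  # 1417
```

With the values from this record, the computed lengths agree with the listed Gene Length and Protein Length.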


Plasmid Coverage information

Num covering plasmid clones: 22
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 47
Fosmid unclonability p-value:
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGTCACGTT ATAAAACAGG TCATAAACAA CCACAATTTC GTTATTCAGT TCTGGCCCGC 
TGCGTGGCGT GGGCAAATAT CTCTGTTCAG GTTCTTTTTC CACTCGCTGT CACCTTTACC
CCAGTAATGG CGGCACGTGC GCAGCATGCG GTTCAGCCAC GGTTGAGCAT GGGAAATACT
ACGGTAACTG CTGATAGTAA CGTGGAGAAA AATGTCGCGT CGTTTGCCGC AAATGCCGGG
ACATTTTTAA GCAGTCAGCC AGATAGCGAT GCGACACGTA ATTTTATTAC CGGAATGGCC
ACCGCTAAAG CTAACCAGGA AATACAGGAG TGGCTCGGGA AATATGGTAC AGCGCGCGTC
AAACTGAATG TCGATAAAGA TTTCTCGCTG AAGGATTCTT CGCTGGAAAT GCTTTATCCG
ATTTATGATA CGCCGACAAA TATGTTGTTC ACTCAGGGGG CAATACATCG TACAGACGAT
CGTACTCAGT CAAATATTGG TTTTGGCTGG CGTCATTTTT CAGGAAATGA CTGGATGGCG
GGGGTGAATA CTTTTATCGA TCATGATTTA TCCCGTAGTC ATACCCGCAT TGGTGTTGGT
GCGGAATACT GGCGCGATTA TTTGAAACTG AGCGCCAATG GTTATATTCG GGCTTCTGGC
TGGAAAAAAT CGCCGGATAT TGAGGATTAT CAGGAACGCC CGGCGAATGG TTGGGATATC
CGCGCAGAGG GCTATTTACC TGCCTGGCCG CAGCTTGGCG CAAGCCTGAT GTATGAACAG
TATTATGGCG ATGAAGTCGG GCTGTTTGGT AAAGATAAAC GCCAGAAAGA CCCGCATGCT
ATTTCTGCCG AGGTAACCTA TACACCAGTG CCTCTTCTGA CACTGAGCGC CGGGCATAAG
CAGGGCAAGA GCGGTGAGAA TGACACTCGC TTTGGCCTGG AAGTTAACTA CCGAATTGGC
GAACCTTTGG CGAAACAACT CGATACGGAT AGCATTCGCG AGCGTCGAAT GCTGGCAGGC
AGCCGCTATG ACCTGGTTGA GCGTAATAAC AACATCGTTC TTGAGTATCG CAAATCTGAA
GTGATCCGTA TTGCTCTGCC TGACCGTATT GCAGGTAAGG GCGGGCAGAC GGTTTCCCTG
GGACTTGTGG TGAGTAAAGC AACTCACGGT CTGAAAAATG TTCAATGGGA AGCGCCGTCT
TTGCTGGCCG CAGGCGGAAA AATTACGGGG CAGGGTAATC AGTGGCAAGT GACGCTCCCG
GCTTACCAGG CAGGCAAAGA CAATTATTAT GCGATTTCTG CGGTTGCCTA CGATAACAAA
GGCAATGCCT CAAAACGCGT GCAGACAGAG GTGGTCATTA CCGGAGCAGG TATGAGTGCC
GAGCGCACGG CGTTAACGCT TGACGGTCAG AGCCGTATTC AAATGCTTGC TAACGGTAGT
GAGCAAAAGC CACTGGTGCT TTCTCTGCGA GACGCCGAGG GGCAGCCAGT CACGGGCATG
AAAGATCAGA TCAAGACTGA GCTGACGTTC AAGCCAGCTG GAAATATTGT GACTCGTACC
CTGAAGGCCA CTAAATCACA GGCACAGCCA ACACTGGGTG AGTTCACCGA AACTGAAGCA
GGGGTGTATC AGTCTGTCTT TACTACCGGA ACGCAGTCAG GTGAGGCAAC GATTACTGTT
AGCGTTGATG ACATGAGCAA AACTGTCACT GCAGAACTGC GGGCCACGAT GATGGATGTG
GCAAACTCCA CCCTGAGCGC TAACGAGCCG TCAGGTGATG TGGTTGCTGA TGGTCAGCAA
TCCCACACGC TGACGCTTAC TGCGGTGGAT ACTGATGGTA ACCCGGTGAC GGGAGAAGCC
AGCCGCCTGC GACTTGTTCC GCAAGACACT AATGGTGTAA CCGTTGGTGC CATTTCGGAA
ATAAAACCAG GGGTTTACAG CGCCACGGTT TCTTCGACCC GTGCCGGAAA CGTTGTTGTG
CGTGCCTTCA GCGAGCAGTA TCAGCTGGGC ACATTACAAC AAACGCTGAA GTTTGTTGCC
GGGCCGCTTG ATGCAGCACA TTCGTCCATC ACCCTGAATC CTGATAAACC GGTGGTTGGC
GGTACAGTTA CGGCAATCTG GACGGCAAAA GATGCCTATG ACAACCCTGT GACCAGCCTC
ACGCCGGAAG CGCCGTCATT AGCGGGTGCT GCGGCTGTAG GTTCTACGGC ATCTGGCTGG
ACAAATAATG GTGATGGAAC GTGGACTGCG CAGATTACTC TCGGCTCTAC GGCGGGTGAA
TTAGACGTTA TGCCGAAGCT AAATGGACAG GATGCGGCAG CAAATGCGGC AAAAGTAACC
GTGGTGGCTG ATGCATTATC TTCAAACCAG TCGAAAGTCT CTGTCGCAGA AGATCACGTA
AAAGCCGGTG AAAGCACAAC CGTAACGCTT ATTGCAAAAG ATGCACATGG CAACGCTATC
AGTGGTCTTT CCCTGTCGGC AAGCCTGACG GGTGCTGCGT CTGAAGGGGC GACTGTTTCC
AGTTGGACCG AAAAAGGTGA TGGTTCCTAT GTCGCTACGC TGACAACAGG TGGAAAGACG
GGTGAGCTTC TCGTCATGCC GCTATTCAAC GGCCAGCCAG CAGCCACCGA AGCCGCGCAG
CTGACTGTTA TTGCCGGAGA GATGTCATCA GCGAACTCTA CGCTTGTTGC GGACAATAAG
GCTCCGACCG TCAAAGCGAC GACGGAACTT ACCTTCACCG CGAAGGATGC GTATGGGAAC
CCTGTTAGTG GCCTGAAGCT CGATGCACCA GTGTTTAGCG GTGCCGCCAG CACGGGATCA
GAGCGACCTT CAGCAGGAAG CTGGACAGAG CAAAGTAATG GGGTCTACGT GGCGACCTTA
ACGCTGGGAT CGGCTGCAGG CCAGCTATCA GTCATGCCGC GCGTGAACGG CCAAAATGCC
GTTGCTCAAC CACTGGTGCT GAATGTTGCT GGTGACGCAT CTAAGGCTGA GATTCGTGAT
ATGACAGTGA AGGTTGATAA CCAGCTGGCA AATGGACAGT CGACTAACCA GGTAACCCTG
ACCGTTGTGG ACACCTATGG TAACCCGTTG CAGGGGCAAG AAGTTACGCT GACTTTACCG
CAGGGTGTGA CCAGCAAGAC GGGGAATACA GTAACAACCA ATGCGGCAGG TAAAGCGGAT
ATTGAGTTGA TATCAACCGT TGCAGGGGAA CTTGAAATCG CGGCCGCGGT GAAAAACTCT
CAGAAGACGG TCACGGTGAA ATTCAACGCG GATGCCAGCA CCGGTCAGGC AAACCTGCAG
GTAGACGCCG CTGCTCAAAA AGTGGCAAAC GGCAAAGATG CCTTTACGCT GACGGCGAAC
GTTGAGGATA AAAATGGTAA CCCTGTTCCA GGGAGCCTGG TGACCTTTAA TCTGCCCCGG
GGTGTCAAGC CGCTTACAGG CGATAATGTC TGGGTGAAAG CCAACGATGA GGGGAAAGCA
GAGTTGCAGG TGGTTTCCGT GACTGCCGGA ACCTATGAGA TCACGGCATC GGCAGGAAAT
GACCAGCCTT CGGATGCGCA GACTATAACG TTTGTAGCCG ATAAGACTAC CGCAACCGTC
TCCGGTATTG AGGTGATTGG CAACTATGCT CTGGCGGACG GCAAAGCCAA ACAGACGTAT
AAAGTTACGG TGACTGATGC CAATAACAAC CTGTTGAAAG ATAGCGAGGT GACGCTGACT
GCCAGCCCGG CAAATTTAGC TCTGGATCCC GATGGGACGG CGAAAACTAA TGAGCAAGGG
CAGGCTATTT TCACCGCCAC GACCACTGTC GCGGCGAAAT ATACACTCAC GGCGAAAGTG
GAACAGGCCA ACGGTCAGGA ATCGACGAAA ACTGCTGAAT CTAAATTCGT CGCGGATGAT
AAAAACGCGG TGCTCGCCGC ATCATCTGAT GTGACTTCTC TGGTGGCGGA TGGGGTACAG
ACCGCAACAA TGACGGTTAC CCTGTTCTCG GCAAATAACC CTGTTGGGGG GAATGTGTGG
GTCGACATTG AGGCTCCGGA AGGAGTGACG GAGAAGGATT ATCAATTCCT GCCGTCGAAA
AATGACCATT TCGTGAGCGG AAAAATCACG CGTACATTTA GTACCAACAA GCCAGGTACA
TACACATTCA CATTCAACTC TTTGACATAT GGAGGGTATG AAATGAAACC AGTGACTGTG
ACCATTACCG CGGTGGATGC CAATACGGCA ACGGGCGAGG AGGCGATGAA ATAA
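
The GC content field can in principle be recomputed from a sequence laid out like the one above, in blocks of ten bases separated by spaces and line breaks. The following is a minimal sketch, assuming GC content means the fraction of G and C bases over all bases. The record does not say how its 52% figure was derived or rounded, and `gc_percent` is a hypothetical helper name.

```python
def gc_percent(sequence: str) -> float:
    """Percentage of G and C bases in a DNA string, ignoring whitespace."""
    # Remove the spaces and newlines used to group the sequence into blocks.
    bases = "".join(sequence.split()).upper()
    if not bases:
        raise ValueError("empty sequence")
    gc_count = sum(1 for base in bases if base in "GC")
    return 100.0 * gc_count / len(bases)


# Small illustrative input, not the full gene sequence.
print(gc_percent("ATGC GC\nAT"))  # 50.0
```

Pasting the full gene sequence into a string and passing it to `gc_percent` would give a value to compare against the listed GC content.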
 
Protein sequence
MSRYKTGHKQ PQFRYSVLAR CVAWANISVQ VLFPLAVTFT PVMAARAQHA VQPRLSMGNT 
TVTADSNVEK NVASFAANAG TFLSSQPDSD ATRNFITGMA TAKANQEIQE WLGKYGTARV
KLNVDKDFSL KDSSLEMLYP IYDTPTNMLF TQGAIHRTDD RTQSNIGFGW RHFSGNDWMA
GVNTFIDHDL SRSHTRIGVG AEYWRDYLKL SANGYIRASG WKKSPDIEDY QERPANGWDI
RAEGYLPAWP QLGASLMYEQ YYGDEVGLFG KDKRQKDPHA ISAEVTYTPV PLLTLSAGHK
QGKSGENDTR FGLEVNYRIG EPLAKQLDTD SIRERRMLAG SRYDLVERNN NIVLEYRKSE
VIRIALPDRI AGKGGQTVSL GLVVSKATHG LKNVQWEAPS LLAAGGKITG QGNQWQVTLP
AYQAGKDNYY AISAVAYDNK GNASKRVQTE VVITGAGMSA ERTALTLDGQ SRIQMLANGS
EQKPLVLSLR DAEGQPVTGM KDQIKTELTF KPAGNIVTRT LKATKSQAQP TLGEFTETEA
GVYQSVFTTG TQSGEATITV SVDDMSKTVT AELRATMMDV ANSTLSANEP SGDVVADGQQ
SHTLTLTAVD TDGNPVTGEA SRLRLVPQDT NGVTVGAISE IKPGVYSATV SSTRAGNVVV
RAFSEQYQLG TLQQTLKFVA GPLDAAHSSI TLNPDKPVVG GTVTAIWTAK DAYDNPVTSL
TPEAPSLAGA AAVGSTASGW TNNGDGTWTA QITLGSTAGE LDVMPKLNGQ DAAANAAKVT
VVADALSSNQ SKVSVAEDHV KAGESTTVTL IAKDAHGNAI SGLSLSASLT GAASEGATVS
SWTEKGDGSY VATLTTGGKT GELLVMPLFN GQPAATEAAQ LTVIAGEMSS ANSTLVADNK
APTVKATTEL TFTAKDAYGN PVSGLKLDAP VFSGAASTGS ERPSAGSWTE QSNGVYVATL
TLGSAAGQLS VMPRVNGQNA VAQPLVLNVA GDASKAEIRD MTVKVDNQLA NGQSTNQVTL
TVVDTYGNPL QGQEVTLTLP QGVTSKTGNT VTTNAAGKAD IELISTVAGE LEIAAAVKNS
QKTVTVKFNA DASTGQANLQ VDAAAQKVAN GKDAFTLTAN VEDKNGNPVP GSLVTFNLPR
GVKPLTGDNV WVKANDEGKA ELQVVSVTAG TYEITASAGN DQPSDAQTIT FVADKTTATV
SGIEVIGNYA LADGKAKQTY KVTVTDANNN LLKDSEVTLT ASPANLALDP DGTAKTNEQG
QAIFTATTTV AAKYTLTAKV EQANGQESTK TAESKFVADD KNAVLAASSD VTSLVADGVQ
TATMTVTLFS ANNPVGGNVW VDIEAPEGVT EKDYQFLPSK NDHFVSGKIT RTFSTNKPGT
YTFTFNSLTY GGYEMKPVTV TITAVDANTA TGEEAMK
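
The protein sequence corresponds to a codon-by-codon translation of the gene sequence. The sketch below illustrates this under two assumptions: translation table 11 assigns the same amino acids to codons as the standard code (it differs mainly in which codons may act as start codons), and the gene starts with ATG, as it does here. An alternative start codon such as GTG or TTG would need special handling that this sketch omits. `translate` and the table names are illustrative, not part of any database tool.

```python
from itertools import product

# Codon table in the conventional T, C, A, G base order; '*' marks stop codons.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(sequence: str) -> str:
    """Translate a DNA string codon by codon, stopping at the first stop codon."""
    dna = "".join(sequence.split()).upper()  # drop block spacing and newlines
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# First seven codons and last four codons of the gene sequence above.
print(translate("ATGTCACGTT ATAAAACAGG T"))  # MSRYKTG
print(translate("GCGATGAAATAA"))             # AMK
```

The two outputs match the beginning and the end of the protein sequence listed above.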