Gene EcSMS35_0019 details


Gene Information

Locus tag: EcSMS35_0019
Symbol:
ID: 6145809
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Escherichia coli SMS-3-5
Kingdom: Bacteria
Replicon accession: NC_010498
Strand:
Start bp: 22139
End bp: 24178
Gene Length: 2040 bp
Protein Length: 679 aa
Translation table: 11
GC content: 50%
IMG OID: 641614920
Product: glycosyl hydrolase family protein
Protein accession: YP_001742136
Protein GI: 170681287
COG category: [G] Carbohydrate transport and metabolism
COG ID: [COG1501] Alpha-glucosidases, family 31 of glycosyl hydrolases
TIGRFAM ID:
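
The coordinate and length fields above can be cross-checked with a little arithmetic. The sketch below is illustrative only: it assumes the start and end coordinates are 1-based and inclusive (the record does not say so, but the numbers agree with that reading) and that the final codon of the CDS is a stop codon. The variable names are made up for this example.

```python
# Values copied from the Gene Information fields of this record.
start_bp = 22139
end_bp = 24178
gene_length_bp = 2040
protein_length_aa = 679

# Assumption: coordinates are 1-based and inclusive,
# so the span covers end - start + 1 bases.
span = end_bp - start_bp + 1

# A CDS of N bases holds N / 3 codons; assuming the last codon is a stop,
# the protein has one residue fewer than the number of codons.
codons = span // 3
coded_residues = codons - 1

print(span, codons, coded_residues)
```

Under these assumptions the computed span and residue count agree with the listed Gene Length and Protein Length.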


Plasmid Coverage information

Num covering plasmid clones:
Plasmid unclonability p-value: 0.0278482
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 58
Fosmid unclonability p-value:
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGTCGTTTA TTAAGCAAGA TCCCCACCGT CTGGTCTGGC AACAAGACGA TCGTTATCTC 
TGGATTGAAG CCTGGGGCGA AAACAGCTTA CGCGTGCGAA GCGGCCGCCA TTTACCGGTA
ATGCGCAATG AAAACTGGGC GTTAACTCAG GATCCTGGCG ATGCTGTCGC TCATATCACC
TGGGATGAAA AACAGGCAAC GCTGAGAAAT GGTAAAATCA CTGCCATCGT AAATCTTCAG
GGCCAGCTCT CTTTCTGGCG TAATGATGAT AAATGCCTGC TGCAAGAGTT CTGGCGTCAA
CGTGGCGAAA TTGGTGAAGA TGAGTCAGCA CACGGCCAAT ATGTGAGCGC TCTTAATCTA
CAGGCCCGTG AATTCAAGCC TATTCCTGGC GGGAAATTTA CGATTAAGGC ACGATTTGAA
GCCAACGATG GCGAGAAACT CTTTGGCATG GGTCAGTATC AGCAACCAAA CCTGGATTTA
AAAGGTTGCA TGCTGGAGCT GGCACAACGT AACTCGCAGG CTTCTGTTCC CTTCCTGCTA
TCAAATCAGG GTTATGGTTT TCTGTGGAAT AACCCAGCAA TTGGCCGCGT CACCTTCGCG
CAAAACGGTA CGGAATGGGT AGCAGAAGTC AGTGAACAAC TGGACTACTG GATCACTGCG
GGCGATACTC CGGCTGAGAT CAGTGAAGCC TATGCACGAG TAACCGGTAC GCCACCGATG
ATGCCTGATT ACGCAATGGG GTTCTGGCAA TGCAAGCTTC GCTATCGTAA TCAACAAGAA
TTGTTAGAAG TCGCACGCGG CTACAAGCAA CGCAATCTTC CGATCTCGGT GATTGTCATC
GATTTCTTCC ATTGGCCGAA TCAGGGTGAC TGGATGTTCG ATCTGCGTGA CTGGCCCGAT
CCCGATGCGA TGATCGCCGA ATTAAAAGAA ATGGGTATCG AACTGATGGT TTCATTCTGG
CCTACCGTAG ATAACCGTAC CGAGAGTTAT CGGGAGATGA AAGAAAACGG CTGGCTAGTC
CACACCGAGC GTGGTTTGCC TATCAATATG GATTTCCTTG GCAACACCAC GTTCTTTGAT
GCGACACATC CTGGCGCACG GGAATATGTC TGGAACAAAG CCAAACGCAA TTATTACGAT
AAAGGCGTGA AACTGTTCTG GCTGGATGAA GCCGAGCCAG AGTTCGGTGT TTACGATTAT
GACAACTATC GTTACTACGC AGGGCCAGTG CTGGAAGTCG GTAATATCTA TCCGCGTATG
TACGCCAAAA CGTTCTTCGA TGGCATGCAG GCAGCCGGAG AAAAGCAGGT GATCAATTTG
CTACGCTGCG CCTGGGCCGG TAGTCAGAAA TATGGTGCCC TGGTGTGGTC AGGCGATATC
CACTCCTCCT TCCGTTCTCT GCGCAATCAG TTTGCTGCGG GGCTGAACAT GGGAATTGCC
GGGATCCCGT GGTGGACGAC GGATATTGGC GGTTTCCACG GTGGCAACAT ACATGATCCA
AAATTTCATG AACTGCTTAT TCGCTGGTTC CAGTGGGGGG TATTCAGCCC AGTAATGCGT
TTACATGGCA ACCGCGACCC GCAAGTTTTG CCTGAACAAC CCTATCGTGA TGGCATCGCA
CAATGTCCGA CTGGCGCGCC GAATGAAGTC TGGAGCTATG GTGAAGAAAC CGGTGAAATT
CTCACTTCCT GCCTGCAGTT GCGTGAGAAG CTGAAACCTT ATATCAGTAA AATTATGGCT
GAGACGCACG AGAAGAACAG CCCGGTAATG CGCACGATGT TCTTTGAGTT TCCAGACCAG
GCAGAAAGCT GGAATATTTA TGATCAGTAC TGCTTTGGCC CGGACCTTCT CGTTGCCCCT
GTAATGCACG AAGGTACACG TTCACGTGAT GTCTGGTTGC CTGCCGGAGA GACATGGATC
GATTTTTATA CCCATGAACA CTACGCGGGT GGGCAAAAGC TTCATCATTG CGCGCCGCTG
CACCAGATAC CGGTATTCAT TCGTGAGAAA GGGAAATATC GCCAGTTATT ACCCGTTTAA
 
Protein sequence
MSFIKQDPHR LVWQQDDRYL WIEAWGENSL RVRSGRHLPV MRNENWALTQ DPGDAVAHIT 
WDEKQATLRN GKITAIVNLQ GQLSFWRNDD KCLLQEFWRQ RGEIGEDESA HGQYVSALNL
QAREFKPIPG GKFTIKARFE ANDGEKLFGM GQYQQPNLDL KGCMLELAQR NSQASVPFLL
SNQGYGFLWN NPAIGRVTFA QNGTEWVAEV SEQLDYWITA GDTPAEISEA YARVTGTPPM
MPDYAMGFWQ CKLRYRNQQE LLEVARGYKQ RNLPISVIVI DFFHWPNQGD WMFDLRDWPD
PDAMIAELKE MGIELMVSFW PTVDNRTESY REMKENGWLV HTERGLPINM DFLGNTTFFD
ATHPGAREYV WNKAKRNYYD KGVKLFWLDE AEPEFGVYDY DNYRYYAGPV LEVGNIYPRM
YAKTFFDGMQ AAGEKQVINL LRCAWAGSQK YGALVWSGDI HSSFRSLRNQ FAAGLNMGIA
GIPWWTTDIG GFHGGNIHDP KFHELLIRWF QWGVFSPVMR LHGNRDPQVL PEQPYRDGIA
QCPTGAPNEV WSYGEETGEI LTSCLQLREK LKPYISKIMA ETHEKNSPVM RTMFFEFPDQ
AESWNIYDQY CFGPDLLVAP VMHEGTRSRD VWLPAGETWI DFYTHEHYAG GQKLHHCAPL
HQIPVFIREK GKYRQLLPV
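
The relationship between the gene sequence and the protein sequence can be illustrated with a short translation sketch. It is a minimal example, not the method used to produce this record. It relies on the fact that translation table 11 assigns the same amino acids to codons as the standard code (the two differ in which codons may act as start codons). The `translate` helper and the variable names are hypothetical. Only the first and last 60-base lines of the gene sequence are used, to keep the example short.

```python
from itertools import product

# Amino acids in the usual NCBI ordering: bases T, C, A, G,
# with the first codon position varying slowest. "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product("TCAG", repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 0, stopping at the first stop codon.

    Whitespace is ignored, so the 10-base blocks used in this record can be
    pasted in directly. Codons with letters other than A, C, G, T raise KeyError.
    """
    seq = "".join(dna.split()).upper()
    protein = []
    # Step through complete codons only; a trailing partial codon is ignored.
    for i in range(0, len(seq) - len(seq) % 3, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First and last lines of the gene sequence listed in this record.
first_block = "ATGTCGTTTA TTAAGCAAGA TCCCCACCGT CTGGTCTGGC AACAAGACGA TCGTTATCTC"
last_block = "CACCAGATAC CGGTATTCAT TCGTGAGAAA GGGAAATATC GCCAGTTATT ACCCGTTTAA"

print(translate(first_block))
print(translate(last_block))
```

The two printed strings correspond to the first 20 and the last 19 residues of the protein sequence; the final TAA codon is the stop codon and contributes no residue.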