Gene EcSMS35_3757 details

Gene Information

Locus tag: EcSMS35_3757
Symbol:
ID: 6144980
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Escherichia coli SMS-3-5
Kingdom: Bacteria
Replicon accession: NC_010498
Strand:
Start bp: 3822264
End bp: 3823514
Gene Length: 1251 bp
Protein Length: 416 aa
Translation table: 11
GC content: 56%
IMG OID: 641618583
Product: major facilitator superfamily transporter
Protein accession: YP_001745723
Protein GI: 170681523
COG category: [G] Carbohydrate transport and metabolism
COG ID: [COG2814] Arabinose efflux permease
TIGRFAM ID:
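
The length fields above can be cross-checked against the coordinates. The following is a minimal sketch, assuming the Start bp and End bp values are 1-based and inclusive and that the final codon of the CDS is a stop codon; the variable names are illustrative and not part of the record.

```python
# Values copied from the Gene Information fields above.
start_bp = 3822264
end_bp = 3823514

# Assumption: coordinates are 1-based and inclusive, so both ends count.
gene_length_bp = end_bp - start_bp + 1

# Assumption: the last codon is a stop codon and is not counted
# in the protein length.
codon_count = gene_length_bp // 3
protein_length_aa = codon_count - 1

print(gene_length_bp, "bp")      # → 1251 bp
print(protein_length_aa, "aa")   # → 416 aa
```

Under these assumptions the computed values agree with the Gene Length and Protein Length fields.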


Plasmid Coverage information

Num covering plasmid clones: 27
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 59
Fosmid unclonability p-value:
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGAAACACT GTTGTAAAAA TGTGGTGATC CTCATGCCCG AACCCGTAGC CGAACCCGCG 
CTAAACGGAT TGCGCCTGAA TTTGCGCATT GTCTCCATTG TCATGTTTAA CTTCGCCAGC
TACCTCACCA TCGGGTTGCC GCTCGCTGTA TTACCGGGCT ATGTCCATGA TGTGATGGGA
TTTAGTGCCT TCTGGGCAGG GTTGGTTATC AGCCTGCAAT ATTTCGCCAC CTTGCTGAGC
CGTCCTCATG CCGGACGTTA CGCCGATTTG CTGGGACCCA AAAAGATTGT CGTCTTCGGT
TTATGCGGCT GCTTTTTGAG CGGTCTGGGA TATCTGACGG CGGGATTAAC CGCCAGTCTG
CCCGTCATCA GCCTGTTATT ACTGTGCCTG GGGCGCGTAA TCCTTGGGAT TGGGCAAAGT
TTTGCCGGAA CGGGATCGAC CCTGTGGGGT GTTGGCGTGG TTGGTTCGCT ACATATCGGG
CGGGTGATTT CGTGGAACGG TATTGTCACT TACGGGGCGA TGGCGATGGG TGCGCCGTTA
GGTGTCGTGT TTTATCACTG GGGCGGATTG CAGGCGTTAG CGTTAATCAT TATGGGCGTG
GCGCTGGTGG CCATTTTGTT GGCGATCCCG CGTCCGATGG TAAAAGCCAG TAAAGGCAAA
CCGCTGCCGT TTCGTGCGGT GCTTGGGCGC GTCTGGCTGT ACGGTATGGC GCTGGCACTG
GCTTCCGCCG GATTTGGCGT TATCGCCACC TTTATCACGC TGTTTTATGA CGCTAAAGGT
TGGGACGGTG CGGCTTTCGC GCTGACGCTG TTTAGCTGTG CGTTTGTCGG TACGCGTTTG
TTATTCCCTA ACGGCATTAA CCGTATCGGC GGCTTAAACG TGGCGATGAT TTGCTTTAGC
GTTGAGATAA TCGGCCTGCT ACTGGTTGGC GTGGCGACTA TGCCGTGGAT GGCGAAAATC
GGCGTCTTAC TCGCGGGGGC GGGGTTTTCG CTGGTGTTCC CGGCATTGGG CGTAGTGGCG
GTAAAAGCGG TTCCGCAGCA AAATCAGGGG GCGGCGCTGG CAACTTACAC CGTATTTATG
GATTTATCGC TTGGCGTGAC CGGACCACTG GCTGGGCTGG TGATGAGTTG GGCGGGCGTC
CCGGTGATTT ATCTGGCGGC GGCGGGACTG GTCGCAATCG CGTTATTACT GACGTGGCGA
TTAAAAAAAC GGCCTCCGGT GGAAGTACCT GAGGCCATCT CATCATCTTA A
 
Protein sequence
MKHCCKNVVI LMPEPVAEPA LNGLRLNLRI VSIVMFNFAS YLTIGLPLAV LPGYVHDVMG 
FSAFWAGLVI SLQYFATLLS RPHAGRYADL LGPKKIVVFG LCGCFLSGLG YLTAGLTASL
PVISLLLLCL GRVILGIGQS FAGTGSTLWG VGVVGSLHIG RVISWNGIVT YGAMAMGAPL
GVVFYHWGGL QALALIIMGV ALVAILLAIP RPMVKASKGK PLPFRAVLGR VWLYGMALAL
ASAGFGVIAT FITLFYDAKG WDGAAFALTL FSCAFVGTRL LFPNGINRIG GLNVAMICFS
VEIIGLLLVG VATMPWMAKI GVLLAGAGFS LVFPALGVVA VKAVPQQNQG AALATYTVFM
DLSLGVTGPL AGLVMSWAGV PVIYLAAAGL VAIALLLTWR LKKRPPVEVP EAISSS
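
The relationship between the gene sequence, the protein sequence, and the GC content field can be sketched in a few lines of Python. This is an illustrative example only: the helper names `clean`, `translate`, and `gc_content` are hypothetical, and the codon table is the standard amino acid assignment, which translation table 11 shares (table 11 differs in its permitted start codons, which this sketch does not model).

```python
# Codon table in the conventional T, C, A, G ordering.
# Assumption: amino acid assignments of the standard code, which
# translation table 11 also uses; alternative start codons are ignored.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))


def clean(seq):
    """Remove the spaces and line breaks used in the display format."""
    return "".join(seq.split()).upper()


def translate(seq):
    """Translate a DNA sequence codon by codon, stopping at the first stop."""
    dna = clean(seq)
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def gc_content(seq):
    """Return the percentage of G and C bases in the sequence."""
    dna = clean(seq)
    return 100.0 * sum(base in "GC" for base in dna) / len(dna)


# First 30 bases of the gene sequence listed above.
first_30 = "ATGAAACACT GTTGTAAAAA TGTGGTGATC"
print(translate(first_30))  # → MKHCCKNVVI
```

Passing the full gene sequence to `translate` and `gc_content` gives values that can be compared with the Protein sequence block and the GC content field.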