Gene EcSMS35_0839 details

Gene Information

Locus tag: EcSMS35_0839
Symbol:
ID: 6145737
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Escherichia coli SMS-3-5
Kingdom: Bacteria
Replicon accession: NC_010498
Strand:
Start bp: 842693
End bp: 844276
Gene Length: 1584 bp
Protein Length: 527 aa
Translation table: 11
GC content: 49%
IMG OID: 641615727
Product: sulfatase family protein
Protein accession: YP_001742919
Protein GI: 170680896
COG category: [R] General function prediction only
COG ID: [COG2194] Predicted membrane-associated, metal-dependent hydrolase
TIGRFAM ID:
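
The Start bp, End bp, Gene Length, and Protein Length fields above can be cross-checked with simple arithmetic. The following is a minimal sketch, assuming the coordinates are 1-based and inclusive and that the gene is an unspliced CDS ending in a single stop codon; the function names are hypothetical and not part of any database API.

```python
def gene_length_bp(start_bp: int, end_bp: int) -> int:
    """Length of a feature given 1-based, inclusive coordinates (assumed)."""
    return end_bp - start_bp + 1


def protein_length_aa(gene_len_bp: int) -> int:
    """Protein length for an unspliced CDS, assuming one terminal stop codon."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    # One codon per residue, minus the stop codon.
    return gene_len_bp // 3 - 1


length = gene_length_bp(842693, 844276)
print(length)                     # 1584
print(protein_length_aa(length))  # 527
```

Under these assumptions the computed values agree with the Gene Length (1584 bp) and Protein Length (527 aa) listed in the record.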


Plasmid Coverage information

Num covering plasmid clones: 13
Plasmid unclonability p-value: 0.195944
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 54
Fosmid unclonability p-value:
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGAATTTAA CCCTCAAAGA CTCGCTTGTT ACCCGTAGCC GGGTATTTAG CCCGTGGACT 
GCGTTCTACT TTTTACAGTC GCTATTAATT AACCTCGGCT TAGGTTATCC CTTCAGTTTG
CTCTACACCG CTGCGTTTAC GGCTATTTTG CTTTTGCTAT GGCGAACATT GCCTCGCGTA
CAAAAAGTTC TGGTCGGTGT CAGTTCGCTG GTGGCGGCTT GTTATTTCCC TTTTGCTCAG
GCCTACGGCG CGCCTAATTT CAATACATTG CTGGCATTGC ACTCCACCAA TATGGAAGAG
TCGACCGAAA TCCTGACGAT TTTTCCGTGG TACAGCTACC TGGTCGGCTT ATTTATTTTT
GCGCTCGGCG TAATAGCAAT CAGGCGAAAA AAAGAGAATG AAAAAGCGCG CTGGAATACC
TTCGACAGCC TGTGTCTGGT ATTCAGTGTG GCGACATTTT TTGTTGCTCC CGTGCAAAAC
CTGGCCTGGG GCGGCGTATT TAAACTGAAA GATACTGGCT ATCCGGTATT TCGTTTTGCT
AAGGATGTCA TCGTCAATAA TAACGAGGTG ATCGAAGAGC AAGAACGGAT GGCAAAACTT
TCCGGAATGA AAGATACCTG GACGGTCACT GCCGTTAAGC CGAAGTATCA GACCTATGTG
GTGGTGATCG GTGAAAGCGC GCGTCGCGAT GCCCTCGGTG CCTTTGGCGG TCACTGGGAC
AATACCCCGT TTGCCAGCAG CGTTAACGGT TTGATATTTG CTGACTACAT TGCCGCCAGT
GGCTCCACGC AGAAATCGCT TGGCTTAACG CTCAATCGCG TTGTCGATGG CAAACCACAG
TTTCAGGATA ACTTTGTCAC CCTGGCAAAT CGCGCGGGCT TCCAGACCTG GTGGTTTTCC
AACCAGGGCC AAATCGGCGA ATACGATACC GCCATCGCCA GCATCGCCAA ACGTGCAGAT
GAAGTGTACT TCCTGAAAGA AGGTAATTTT GAAGCAGATA AAAACACGAA AGACGAAGCG
TTACTGGATA TGACCGCTCA AGTGCTGGCG CAAGAGCACT CGCAACCGCA GCTGATTGTT
CTACATCTGA TGGGCTCGCA TCCGCAGGCC TGCGACAGGA CACAAGGCAA ATACGAAACC
TTTGTGCAAT CAAAAGAAAC GTCGTGCTAT CTCTATACCA TGACGCAAAC GGACGATTTA
CTGCGCAAAC TGTACGATCA GTTACGCAAC AGCGGCAGCA GCTTCTCGCT GGTTTACTTT
TCTGACCACG GTCTGGCCTT TAAAGAGCGC GGCAAAGACG TGCAATACCT TGCCCATGAT
GATAAGTATC AGCAAAATTT CCAGGTGCCT TTTATGGTCA TTTCCAGCGA CGATAAAGCG
CATCGGGTGA TTAAAGCCCG CCGCTCAGCC AATGACTTCT TAGGTTTTTT CTCCCAGTGG
ACGGGAATTA AAGCGAAGGA AATTAACATC AAATACCCGT TTATATCTGA GAAGAAAGCC
GGGCCGATAT ACATCACCAA CTTCCAGTTA CAGAAGGTAG ATTACAACCA TCTCGGAACC
GATATTTTCG ACCCGAAACC TTAA
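
The gene sequence is displayed in space-separated groups of 10 bases, so it has to be joined before any analysis. The sketch below shows one way to do that and to compute a GC fraction. It assumes GC content is defined as the share of G and C bases in the sequence; how the record's own GC content value was calculated is not stated. The helper names are hypothetical.

```python
def clean_sequence(raw: str) -> str:
    """Remove spaces and line breaks from a grouped sequence and upper-case it."""
    return "".join(raw.split()).upper()


def gc_fraction(raw: str) -> float:
    """Fraction of G and C bases in the cleaned sequence (assumed definition)."""
    seq = clean_sequence(raw)
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


first_line = "ATGAATTTAA CCCTCAAAGA CTCGCTTGTT ACCCGTAGCC GGGTATTTAG CCCGTGGACT"
print(len(clean_sequence(first_line)))  # 60 bases in the first displayed line
print(gc_fraction("ATGC GGCC"))         # 0.75
```

Passing the full gene sequence to `clean_sequence` should yield a string whose length matches the Gene Length field, which is a quick check that no bases were lost when copying.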
 
Protein sequence
MNLTLKDSLV TRSRVFSPWT AFYFLQSLLI NLGLGYPFSL LYTAAFTAIL LLLWRTLPRV 
QKVLVGVSSL VAACYFPFAQ AYGAPNFNTL LALHSTNMEE STEILTIFPW YSYLVGLFIF
ALGVIAIRRK KENEKARWNT FDSLCLVFSV ATFFVAPVQN LAWGGVFKLK DTGYPVFRFA
KDVIVNNNEV IEEQERMAKL SGMKDTWTVT AVKPKYQTYV VVIGESARRD ALGAFGGHWD
NTPFASSVNG LIFADYIAAS GSTQKSLGLT LNRVVDGKPQ FQDNFVTLAN RAGFQTWWFS
NQGQIGEYDT AIASIAKRAD EVYFLKEGNF EADKNTKDEA LLDMTAQVLA QEHSQPQLIV
LHLMGSHPQA CDRTQGKYET FVQSKETSCY LYTMTQTDDL LRKLYDQLRN SGSSFSLVYF
SDHGLAFKER GKDVQYLAHD DKYQQNFQVP FMVISSDDKA HRVIKARRSA NDFLGFFSQW
TGIKAKEINI KYPFISEKKA GPIYITNFQL QKVDYNHLGT DIFDPKP
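
The protein sequence can be reproduced from the gene sequence by codon translation. The sketch below is a minimal illustration, assuming the sense strand is given as shown and relying on the fact that translation table 11 uses the same codon-to-amino-acid assignments as the standard genetic code (the tables differ in which start codons are permitted, which this sketch does not model). The names `CODON_TABLE` and `translate` are hypothetical.

```python
from itertools import product

# Codon table in the conventional TCAG ordering; amino acid assignments are
# those of the standard code, which table 11 shares.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a grouped or plain DNA string, dropping one terminal stop."""
    seq = "".join(dna.split()).upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    protein = "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))
    # Remove the trailing stop symbol if the sequence ends in a stop codon.
    return protein[:-1] if protein.endswith("*") else protein


print(translate("ATGAATTTAA CCCTCAAAGA C"))     # MNLTLKD
print(translate("GATATTTTCG ACCCGAAACC TTAA"))  # DIFDPKP
```

The two calls use the first 21 bases and the last line of the gene sequence; their output matches the start (MNLTLKD) and end (DIFDPKP) of the protein sequence listed above.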