Gene EcSMS35_2338 details

Gene Information       Plasmid Coverage information       Fosmid Coverage information       Sequence       

Gene Information

Locus tagEcSMS35_2338 
Symbol 
ID6142763 
TypeCDS 
Is gene splicedNo 
Is pseudo geneNo 
Organism nameEscherichia coli SMS-3-5 
KingdomBacteria 
Replicon accessionNC_010498 
Strand
Start bp2370036 
End bp2371796 
Gene Length1761 bp 
Protein Length586 aa 
Translation table11 
GC content52% 
IMG OID641617212 
Productsulfatase family protein 
Protein accessionYP_001744385 
Protein GI170681591 
COG category[R] General function prediction only 
COG ID[COG3083] Predicted hydrolase of alkaline phosphatase superfamily 
TIGRFAM ID 


Plasmid Coverage information

Num covering plasmid clones11 
Plasmid unclonability p-value0.0980627 
Plasmid hitchhikingNo 
Plasmid clonabilitynormal 
 

Fosmid Coverage information

Num covering fosmid clones38 
Fosmid unclonability p-value0.0832613 
Fosmid HitchhikerNo 
Fosmid clonabilitynormal 
 

Sequence

Gene sequence
ATGGTAACTC ATCGTCAGCG CTACCGTGAA AAAGTCTCCC AGATGGTCAG TTGGGGGCAC 
TGGTTTGCAC TGTTCAATAT TCTGCTTTCG CTCGTCATTG GCAGCCGTTA CCTGTTTATC
GCCGACTGGC CGACAACGCT TGCTGGTCGC ATTTATTCCT ACGTAAGCAT TATCGGTCAT
TTCAGCTTCC TGGTGTTCGC CACCTACTTG CTGATCCTCT TCCCGCTGAC CTTTATCGTC
GGCTCCCAGA GGCTGATGAG GTTTTTGTCC GTCATTCTGG CAACGGCGGG AATGACGCTA
TTACTGATCG ATAGCGAAGT CTTTACTCGT TTCCATCTCC ATCTTAATCC CATCGTCTGG
CAACTGGTTA TCAACCCAGA CGAAAATGAG ATGGCGCGCG ACTGGCAGCT GATGTTCATC
AGCGTGCCGG TTATTTTATT GCTTGAACTG GTGTTTGCGA CGTGGAGCTG GCAAAAGCTG
CGCAGCCTGA CGCGTCGTCG ACGCTTCGCG CGCCCGCTGG CCGCATTCTT ATTTATCGCC
TTTATCGCCT CGCATGTGGT GTATATCTGG GCCGATGCCA ACTTCTATCG CCCGATCACC
ATGCAACGCG CTAACCTGCC GCTTTCGTAC CCGATGACGG CGCGACGTTT TCTTGAGAAG
CATGGTCTGC TTGATGCGCA GGAGTATCAA CGCCGTCTCA TTGAGCAAGG TAATCCTGAC
GCCGTTTCCG TTCAGTATCC GTTAAGCGAA CTGCGCTATC GCGATATGGG CACCGGGCAG
AATGTGCTGT TGATTACTGT CGATGGCCTG AACTACTCAC GCTTCGAGAA GCAGATGCCT
GCGCTGGCAG GTTTTGCTGA GCAAAATATT TCGTTCACGC GCCATATGAG CTCCGGCAAC
ACTACAGACA ACGGCATCTT TGGCCTGTTC TATGGCATCT CGCCGAGCTA TATGGATGGC
ATTCTGTCGA CCCGTACGCC TGCGGCATTA ATTACTGCGC TTAATCAACA AGGCTATCAG
CTGGGATTAT TCTCATCAGA TGGCTTTACC AGCCCGCTGT ATCGCCAGGC ATTGTTGTCA
GATTTCTCGA TGCCGAGCGT ACGCACCCAA TCCGACGAGC AGACCGCCAC GCAGTGGATC
AACTGGCTGG GCCGCTACGC ACAAGAAGAT AACCGCTGGT TCTCGTGGGT TTCTTTCAAT
GGCACTAACA TTGACGACAG CAATCAGCAG GCATTTGCAC GGAAATATAG CCGGGCGGCA
GGCAATGTCG ACGACCAGAT CAACCGCGTG CTCAATGCAC TGCGTGATTC TGGCAAACTG
GACAATACGG TGGTGATTAT CACTGCCGGT CGGGGTATTC CGCTGAGCGA AGAGGAAGAA
ACCTTTGACT GGTCCCACGG TCATCTGCAG GTGCCATTAG TGATTCACTG GCCAGGCACG
CCGGCGCAGC GTATTAATGC TCTGACTGAT CATACCGATC TGATGACGAC GCTGATGCAA
CGCCTGCTAC ATGTCAGCAC ACCTGCCAGC GAATATTCGC AAGGTCAGGA TTTGTTCAAC
CCTCAACGCC GTCATTACTG GGTTACCGCC GCGGATAACG ATACGCTGGC AATTACCACC
CCGAAAAAGA CGCTGGTGCT GAACAATAAC GGTAAATACC GTACTTACAA CTTACGTGGT
GAAAGAGTGA AAGATGAAAA ACCACAGTTA AGTTTGTTAT TGCAAGTACT GACAGACGAG
AAGCGTTTTA TCGCTAACTG A
 
Protein sequence
MVTHRQRYRE KVSQMVSWGH WFALFNILLS LVIGSRYLFI ADWPTTLAGR IYSYVSIIGH 
FSFLVFATYL LILFPLTFIV GSQRLMRFLS VILATAGMTL LLIDSEVFTR FHLHLNPIVW
QLVINPDENE MARDWQLMFI SVPVILLLEL VFATWSWQKL RSLTRRRRFA RPLAAFLFIA
FIASHVVYIW ADANFYRPIT MQRANLPLSY PMTARRFLEK HGLLDAQEYQ RRLIEQGNPD
AVSVQYPLSE LRYRDMGTGQ NVLLITVDGL NYSRFEKQMP ALAGFAEQNI SFTRHMSSGN
TTDNGIFGLF YGISPSYMDG ILSTRTPAAL ITALNQQGYQ LGLFSSDGFT SPLYRQALLS
DFSMPSVRTQ SDEQTATQWI NWLGRYAQED NRWFSWVSFN GTNIDDSNQQ AFARKYSRAA
GNVDDQINRV LNALRDSGKL DNTVVIITAG RGIPLSEEEE TFDWSHGHLQ VPLVIHWPGT
PAQRINALTD HTDLMTTLMQ RLLHVSTPAS EYSQGQDLFN PQRRHYWVTA ADNDTLAITT
PKKTLVLNNN GKYRTYNLRG ERVKDEKPQL SLLLQVLTDE KRFIAN