Gene ECH74115_5237 details


Gene Information

Locus tag: ECH74115_5237
Symbol: aslA
ID: 6970829
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Escherichia coli O157:H7 str. EC4115
Kingdom: Bacteria
Replicon accession: NC_011353
Strand:
Start bp: 4881089
End bp: 4882744
Gene Length: 1656 bp
Protein Length: 551 aa
Translation table: 11
GC content: 53%
IMG OID: 643388902
Product: arylsulfatase
Protein accession: YP_002273316
Protein GI: 209399246
COG category: [P] Inorganic ion transport and metabolism
COG ID: [COG3119] Arylsulfatase A and related enzymes
TIGRFAM ID:
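
The coordinates and lengths listed above can be cross-checked with a little arithmetic. The following is a minimal sketch, assuming the start and end positions are 1-based and inclusive and that the gene length includes the stop codon; the function names are hypothetical and not part of any database API.

```python
def gene_length(start_bp: int, end_bp: int) -> int:
    """Length of a feature given 1-based, inclusive coordinates (assumed)."""
    return end_bp - start_bp + 1


def protein_length(gene_length_bp: int) -> int:
    """Residues encoded by an unspliced CDS whose length includes the stop codon (assumed)."""
    if gene_length_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    # One codon is the stop codon and encodes no residue.
    return gene_length_bp // 3 - 1


length_bp = gene_length(4881089, 4882744)
print(length_bp, protein_length(length_bp))
```

With the values from this record, the sketch gives 1656 bp and 551 aa, matching the Gene Length and Protein Length fields.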


Plasmid Coverage information

Num covering plasmid clones: 11
Plasmid unclonability p-value: 0.0995003
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 61
Fosmid unclonability p-value:
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGGAATTTT CGTTTTCACC CAAACTTCTT GTTGTTGCTG TCGCCGCCGC TCTTCCTCTC 
ATGGCCAGCG CAGCAGATAC CCCGTCAACT GCCACCGCGC GCAAAGGCTT TGCCGGATAC
GATCACCCAA ACCAGTATCT GGTTAAACCG GCGACCACTA TTGCCGACAA CATGATGCCA
GTAATGCAGC ATCCGGCGCA GGATAAAGAA CCCCAGCAGA AGCTGGCAGA ACTTGAGAAA
AAAACCGGTA AGAAACCGAA TGTGGTTGTT TTCTTGCTGG ACGATGTGGG CTGGATGGAC
GTCGGTTTTA ACGGTGGCGG CGTGGCGGTG GGTAACCCTA CACCAGATAT CGACGCCGTT
GCCAGCCAGG GGCTGATTTT AACTTCGGCG TATTCTCAAC CAAGCTCTTC CCCAACCCGC
GCCACCATTC TCACCGGACA ATACTCCATC CACCACGGCA TTCTGATGCC GCCAATGTAC
GGGCAACCGG GCGGGCTGCA AGGGTTAACC ACGCTGCCGC AGTTGCTGCA CGATCAGGGC
TACGTCACTC AGGCCATCGG GAAATGGCAT ATGGGGGAAA ACAAAGAGTC GCAGCCGCAG
AACGTTGGCT TTGATGATTT CCGTGGCTTT AACTCGGTGT CTGATATGTA CACCGAATGG
CGCGACGTTC ACGTCAATCC GGAAGTGGCC CTGAGTCCGG ACCGTTCTGA ATACATCAAG
CAATTACCAT TTAGCAAAGA TGATGTTCAT GCGGTGCGCG GCGGCGAACA ACAGGCCATT
GCCGACATTA CGCCGAAATA TATGGAAGAT CTGGATCAAC GCTGGATGGA ATATGGCGTT
AAGTTCCTCG ACAAGATGGC GAAGAGTGAT AAGCCTTTCT TCCTCTATTA CGGCACTCGC
GGCTGTCACT TCGATAACTA CCCGAACGCC AAATATGCGG GTAGCTCTCC GGCACGCACC
TCGTATGGCG ACTGCATGGT GGAAATGAAC GATGTGTTCG CTAATCTGTA TAAAGCACTG
GAGAAAAACG GTCAGCTTGA TAACACGCTG ATTGTCTTTA CCTCCGATAA CGGACCGGAA
GCCGAAGTAC CGCCGCACGG ACGCACACCG TTCCGTGGTG CGAAAGGATC TACCTGGGAA
GGCGGCGTTC GCGTACCGAC TTTCGTTTAC TGGAAAGGCA TGATCCAACC GCGTAAATCT
GACGGTATTG TTGATCTGGC AGATCTCTTC CCAACAGCAC TGGATCTGGC GGGGCATCCT
GGTGCGAAAG TGGCGAATTT AGTGCCGAAA ACCACCTTTA TCGATGGTGT GGACCAGACA
TCTTTCTTCC TGGGAACAAA TGGTCAGTCT AACCGTAAGG CCGAGCACTA CTTCCTCAAC
GGTAAACTCG CTGCTGTGCG TATGGATGAA TTCAAGTATC ACGTCCTGAT TCAGCAACCT
TACGCTTATA CCCAGAGCGG ATATCAGGGT GGATTCACCG GCACAGTAAT GCAAACAGCC
GGATCGTCGG TGTTTAACCT CTACACCGAT CCGCAGGAGA GCGACTCTAT CGGCGTGCGC
CATATTCCGA TGGGTGTACC GCTACAGACC GAAATGCACG CGTATATGGA GATCCTGAAA
AAGTATCCAC CACGCGCGCA GATTAAATCT GATTAA
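
The gene sequence is displayed in blocks of 10 bases, 60 per line, so the spacing has to be removed before any analysis. The following is a minimal sketch of that clean-up plus a GC-content calculation; the helper names are hypothetical, and the rounding the database applies to its reported GC content is not known.

```python
def clean_sequence(raw: str) -> str:
    """Remove spaces and line breaks from a block-formatted sequence."""
    return "".join(raw.split()).upper()


def gc_content(seq: str) -> float:
    """Percentage of G and C bases in an already-cleaned sequence."""
    if not seq:
        raise ValueError("empty sequence")
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


# Last line of the gene sequence above, used as a small example input.
tail = clean_sequence("AAGTATCCAC CACGCGCGCA GATTAAATCT GATTAA")
print(len(tail), gc_content(tail))
```

Applying the same two functions to the full 1656-base sequence allows a comparison with the GC content field reported for this gene.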
 
Protein sequence
MEFSFSPKLL VVAVAAALPL MASAADTPST ATARKGFAGY DHPNQYLVKP ATTIADNMMP 
VMQHPAQDKE PQQKLAELEK KTGKKPNVVV FLLDDVGWMD VGFNGGGVAV GNPTPDIDAV
ASQGLILTSA YSQPSSSPTR ATILTGQYSI HHGILMPPMY GQPGGLQGLT TLPQLLHDQG
YVTQAIGKWH MGENKESQPQ NVGFDDFRGF NSVSDMYTEW RDVHVNPEVA LSPDRSEYIK
QLPFSKDDVH AVRGGEQQAI ADITPKYMED LDQRWMEYGV KFLDKMAKSD KPFFLYYGTR
GCHFDNYPNA KYAGSSPART SYGDCMVEMN DVFANLYKAL EKNGQLDNTL IVFTSDNGPE
AEVPPHGRTP FRGAKGSTWE GGVRVPTFVY WKGMIQPRKS DGIVDLADLF PTALDLAGHP
GAKVANLVPK TTFIDGVDQT SFFLGTNGQS NRKAEHYFLN GKLAAVRMDE FKYHVLIQQP
YAYTQSGYQG GFTGTVMQTA GSSVFNLYTD PQESDSIGVR HIPMGVPLQT EMHAYMEILK
KYPPRAQIKS D
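
The protein sequence corresponds to the gene sequence translated with translation table 11. The following is a minimal sketch of that translation, assuming the standard NCBI codon ordering (T, C, A, G), whose amino acid assignments table 11 shares with the standard code. Table 11 also permits alternative start codons that are read as M at initiation; this sketch does not handle that case, which does not arise here because the gene starts with ATG. The names are hypothetical.

```python
BASES = "TCAG"
# Amino acids for all 64 codons in TTT, TTC, TTA, TTG, TCT, ... order; '*' marks a stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a coding sequence from its first base, stopping at the first stop codon."""
    seq = "".join(dna.split()).upper()  # drop the 10-base block spacing
    protein = []
    for pos in range(0, len(seq) - len(seq) % 3, 3):
        residue = CODON_TABLE[seq[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# First 30 bases of the gene sequence listed in this record.
print(translate("ATGGAATTTT CGTTTTCACC CAAACTTCTT"))
```

For the first 30 bases this prints MEFSFSPKLL, the first ten residues of the protein sequence above.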