Gene ECH74115_1018 details

Gene Information

Locus tag: ECH74115_1018
Symbol:
ID: 6970024
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Escherichia coli O157:H7 str. EC4115
Kingdom: Bacteria
Replicon accession: NC_011353
Strand:
Start bp: 1030272
End bp: 1031753
Gene Length: 1482 bp
Protein Length: 493 aa
Translation table: 11
GC content: 52%
IMG OID: 643385031
Product: sulfatase
Protein accession: YP_002269531
Protein GI: 209399668
COG category: [P] Inorganic ion transport and metabolism
COG ID: [COG3119] Arylsulfatase A and related enzymes
TIGRFAM ID:
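
The coordinate and length fields in this record are related by simple arithmetic. The sketch below checks that relationship under two assumptions that the record does not state explicitly: the start and end coordinates are 1-based and inclusive, and the final codon is a stop codon that is not counted in the protein length. The variable names are illustrative only.

```python
# Values copied from the record above.
start_bp = 1030272
end_bp = 1031753

# Assumption: coordinates are 1-based and inclusive, so both ends count.
gene_length = end_bp - start_bp + 1

# Assumption: the last codon is a stop codon and is not part of the protein.
protein_length = gene_length // 3 - 1

print(gene_length, protein_length)  # 1482 493
```

Both results agree with the Gene Length and Protein Length fields listed in the record.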


Plasmid Coverage information

Num covering plasmid clones: 28
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 45
Fosmid unclonability p-value: 0.176876
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGAGAGCGA TTATTCTGCT GTTCGATAGT CTGAATAAAC GTTATCTGCC ACCTTATGGC 
GATGCGTTAA CCAAAGCGCC TAATTTCCAA CGTCTGGCGG CTCATGCCGC CACCTTTGAA
AACAGTTACG TCGGCAGTAT GCCGTGTATG CCTGCCCGGC GGGAGTTGCA TACCGGGCGA
TGTAACTTCC TGCATCGGGA GTGGGGGCCG TTAGAACCGT TTGACGATTC CATGCCGGAG
CTGCTTAAAA AGGCGGGGAT CTATACCCAC CTGATAAGCG ATCATTTGCA TTACTGGGAA
GATGGCGGCG GGAATTATCA TAATCGCTAC AGTTCGTGGG AGATTGTGCG CGGACAGGAA
GGCGATCACT GGCATGCAAG CGTTGCGCAA CCGCCTATTC CCGAGGTGCT ACGGGTGCCG
CAAAAGCAGA CCGGTGGCGG TGTTTCTGGT CTGTGGCGCC ATGACTGGGC AAACCGGGAA
TATATTCAGC AGGAAGCGGA TTTTCCACAA ACTAAAGTGT TTGATGCAGG CTGTGCCTTT
ATCCACAAAA ATCACGCGGA AGATAACTGG TTATTGCAGA TCGAAACGTT TGATCCGCAC
GAGCCGTTTC ACACAACGGA AGAGTATCTT TCCTTGTATG AAGATAACTG GGATGGACCG
CATTATGACT GGCCGCGTGG CCGGGTGCAG GAAAGCGACG AGGCCGTGGA GCACATCCGT
TGCCGGTATC GTTCGCTGGT GTCGATGTGC GATCGCAACC TGGGGCGCAT TCTCGACCTG
ATGGATGAGC ACGATCTGTG GCGAGATACC ATGCTGATTG TCGGCACTGA CCACGGCTTT
TTACTCGGTG AGCATGGCTG GTGGGCGAAA AACCAGATGC CTTATTACAA CGAAGTTGCC
AATAATCCGT TGTTTATCTG GAACCCGCGC AGTGGTGTAA AAGGAGAGCG GCGACAGGCA
CTGGTACAAA TGATCGACTG GGCGCCAACG TTGTATGACT TTTTTCAACA GCCAGTGCCG
CCCGATGTGC AGGGGCAACC GCTGGCGAAA ACGGTCAGTC ACGATGAACC AGTACGCAGC
TCGGCGATGT TTGGTGTTTT CAGTGGTCAT GCTAACGTAA CTGATGGGCG TTATGTGTAT
ATGCGTGCAG CGCTGCCGGG GCGTGAGGAT GATATTGCCA ACTACACGTT GATGTCCTGC
AAAATGAACA GCCGCTATCC GGTGGATGAG ATGCGGGCTT TATCGCTGGC CCCACCGTTT
CGTTTTACCA AAGGGTTACA GGTATTACGC ATCCCGGCAC AGGAAAAATA TAAGGGGTTG
AATCAGTTTG GTCATTTGCT GTTTGATCTG CAAAACGATC CGCAGCAGCT ACATCCGATT
CATGATGATG TGATCGAGTC CCGGATGATC GCGTTGCTGA TTCAGTTGAT GAAAGATAAT
GATGCGCCGC CAGAGCAGTT TCAGCGCCTG GGATTAGCGT AG
 
Protein sequence
MRAIILLFDS LNKRYLPPYG DALTKAPNFQ RLAAHAATFE NSYVGSMPCM PARRELHTGR 
CNFLHREWGP LEPFDDSMPE LLKKAGIYTH LISDHLHYWE DGGGNYHNRY SSWEIVRGQE
GDHWHASVAQ PPIPEVLRVP QKQTGGGVSG LWRHDWANRE YIQQEADFPQ TKVFDAGCAF
IHKNHAEDNW LLQIETFDPH EPFHTTEEYL SLYEDNWDGP HYDWPRGRVQ ESDEAVEHIR
CRYRSLVSMC DRNLGRILDL MDEHDLWRDT MLIVGTDHGF LLGEHGWWAK NQMPYYNEVA
NNPLFIWNPR SGVKGERRQA LVQMIDWAPT LYDFFQQPVP PDVQGQPLAK TVSHDEPVRS
SAMFGVFSGH ANVTDGRYVY MRAALPGRED DIANYTLMSC KMNSRYPVDE MRALSLAPPF
RFTKGLQVLR IPAQEKYKGL NQFGHLLFDL QNDPQQLHPI HDDVIESRMI ALLIQLMKDN
DAPPEQFQRL GLA
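
The gene and protein sequences above can be cross-checked against each other. The following is a minimal sketch, not the method used to produce this record: it strips the display spacing, translates codon by codon, and computes a GC fraction. It assumes that translation table 11 uses the same codon-to-amino-acid assignments as the standard code (the tables differ in which codons may act as start codons, which does not matter here because the gene starts with ATG). The function names `clean`, `translate`, and `gc_fraction` are hypothetical helpers introduced for illustration.

```python
from itertools import product

BASES = "TCAG"
# Amino acids listed in TCAG codon order (TTT, TTC, TTA, TTG, TCT, ...).
# Assumption: table 11 shares these assignments with the standard code.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def clean(seq):
    """Drop the spaces and line breaks used for display in the record."""
    return "".join(seq.split()).upper()


def translate(dna):
    """Translate codon by codon, stopping at the first stop codon."""
    dna = clean(dna)
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def gc_fraction(dna):
    """Fraction of bases that are G or C."""
    dna = clean(dna)
    return (dna.count("G") + dna.count("C")) / len(dna)


# First and last lines of the gene sequence, copied from the record.
first_line = "ATGAGAGCGA TTATTCTGCT GTTCGATAGT CTGAATAAAC GTTATCTGCC ACCTTATGGC"
last_line = "GATGCGCCGC CAGAGCAGTT TCAGCGCCTG GGATTAGCGT AG"

print(translate(first_line))  # MRAIILLFDSLNKRYLPPYG
print(translate(last_line))   # DAPPEQFQRLGLA
```

The two printed strings match the start and end of the protein sequence listed above. Passing the full gene sequence to `translate` and `gc_fraction` would allow the same comparison against the complete protein and the GC content field.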