Gene ECH74115_5110 details


Gene Information

Locus tag: ECH74115_5110
Symbol:
ID: 6970733
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Escherichia coli O157:H7 str. EC4115
Kingdom: Bacteria
Replicon accession: NC_011353
Strand:
Start bp: 4750491
End bp: 4751984
Gene Length: 1494 bp
Protein Length: 497 aa
Translation table: 11
GC content: 52%
IMG OID: 643388782
Product: sulfatase
Protein accession: YP_002273208
Protein GI: 209399130
COG category: [P] Inorganic ion transport and metabolism
COG ID: [COG3119] Arylsulfatase A and related enzymes
TIGRFAM ID:
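
The coordinates and lengths listed above can be cross-checked with simple arithmetic. The sketch below assumes the start and end positions are 1-based and inclusive, and that the CDS ends in a single stop codon that is not counted in the protein length. The record does not state either convention, so both are assumptions, and the function names are illustrative only.

```python
def gene_length(start_bp: int, end_bp: int) -> int:
    """Length of a feature, assuming 1-based inclusive coordinates."""
    return end_bp - start_bp + 1


def protein_length(gene_len_bp: int) -> int:
    """Residue count for a CDS, assuming one trailing stop codon."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_len_bp // 3 - 1


length = gene_length(4750491, 4751984)
print(length)                  # 1494
print(protein_length(length))  # 497
```

Under these assumptions, both values agree with the Gene Length and Protein Length fields of the record.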


Plasmid Coverage information

Num covering plasmid clones: 31
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 49
Fosmid unclonability p-value: 0.36096
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGAAACGCC CCAATTTTCT GTTCATCATG ACCGATACCC AGGCCCCCAA TATGGTCGGT 
TGCTATAGCG GTAAACCGCT GAATACGCAA AATATTGATA GTCTGGCGGC GGAAGGTATT
CGCTTTAATT CCGCCTACAC CTGTTCACCG GTTTGTACAC CGGCGCGCGC CGGGCTGTTT
ACCGGTATCT ACGCTAACCA GTCCGGGCCG TGGACCAACA ACGTCGCGCC GGGTAAAAAC
ATCTCCACTA TGGGGCGCTA CTTTAAGGAT GCCGGCTATC ACACCTGTTA CATCGGCAAA
TGGCATCTCG ACGGTCATGA CTATTTCGGC ACTGGCGAGT GTCCGCCGGA GTGGGACGCT
GATTACTGGT TCGATGGGGC GAACTATCTT AGCGAACTGA CGGAAAAAGA GATTAGCCTG
TGGCGCAATG GCCTAAACAG CGTCGAAGAT TTACAGGCGA ACCATATTGA CGAAACCTTC
ACCTGGGCGC ACCGCATCAG CAATCGAGCG GTGGATTTTC TGCAACAGCC CGCGCGCGCC
GAGGAACCCT TCCTGATGGT GGTTTCGTAT GATGAGCCGC ATCACCCGTT CACCTGTCCG
GTGGAGTATT TAGAGAAATA CGCTGATTTT TACTACGAGC TGGGCGAGAA AGCACAGGAT
GACCTGGCGA ACAAACCAGA ACATCACCGC TTATGGGCGC AGGCGATGCC ATCGCCAGTC
GGTGATGACG GGCTTTATCA CCATCCGCTC TATTTTGCCT GTAATGACTT TGTTGATGAC
CAAATCGGAC GGGTCATCAA TGCCTTAAAG CCAGAGCAAC GTGAAAATAC GTGGGTTATT
TATACCTCCG ATCACGGCGA AATGATGGGC GCACATAAGC TGATCAGTAA AGGGGCGGCG
ATGTATGACG ACATCACCCG CATTCCGCTG ATCATCCGTT CGCCGCAAGG GGAGCGGCGA
CAGGTCGATA CGCCAGTCAG TCATATCGAT TTACTGCCGA CAATGATGGC GCTGGCAGAT
ATTGAAAAAC CAGAGATTCT GCCGGGGGAA AATATCCTTG CAGTGAAAGA GCCACGCGGC
GTGATGGTGG AATTTAACCG CTACGAGATT GAGCATGACA GCTTTGGCGG TTTTATTCCG
GTGCGTTGCT GGGTGACGGA TGACTTTAAA CTGGTGCTCA ACCTCTTCAC CAGTGATGAA
CTTTACGATC GCCGTAATGA TCCAAATGAA ATGCATAACC TGATCGATGA TGCCCATTTT
GCCGACGTTC GCAGCAAAAT GCATGATGCC TTATTGGATT ATATGGACAA AATTCGCGAT
CCGTTCCGCA GTTACCAATG GAGCCTGCGT CCGTGGCGTA AAGATGCACA GCCGCGCTGG
ATGGGGGCAT TTCGTCCACG CCCACAAGAT GGTTATTCGC CAGTGGTACG CGACTATGAC
ACCGGCCTGC CGACACAAGG GGTGAAGGTG GAAGAGAAAA AACAGAAGTT CTGA
 
Protein sequence
MKRPNFLFIM TDTQAPNMVG CYSGKPLNTQ NIDSLAAEGI RFNSAYTCSP VCTPARAGLF 
TGIYANQSGP WTNNVAPGKN ISTMGRYFKD AGYHTCYIGK WHLDGHDYFG TGECPPEWDA
DYWFDGANYL SELTEKEISL WRNGLNSVED LQANHIDETF TWAHRISNRA VDFLQQPARA
EEPFLMVVSY DEPHHPFTCP VEYLEKYADF YYELGEKAQD DLANKPEHHR LWAQAMPSPV
GDDGLYHHPL YFACNDFVDD QIGRVINALK PEQRENTWVI YTSDHGEMMG AHKLISKGAA
MYDDITRIPL IIRSPQGERR QVDTPVSHID LLPTMMALAD IEKPEILPGE NILAVKEPRG
VMVEFNRYEI EHDSFGGFIP VRCWVTDDFK LVLNLFTSDE LYDRRNDPNE MHNLIDDAHF
ADVRSKMHDA LLDYMDKIRD PFRSYQWSLR PWRKDAQPRW MGAFRPRPQD GYSPVVRDYD
TGLPTQGVKV EEKKQKF
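
The relationship between the gene sequence and the protein sequence can be illustrated with a minimal translation sketch. It builds the codon table from the standard NCBI amino-acid string. Translation table 11 uses the same codon-to-amino-acid assignments as the standard code and differs only in its permitted start codons, which does not matter here because the gene begins with ATG. The names `CODON_TABLE` and `translate` are illustrative and not part of any database tool.

```python
from itertools import product

# Codon order follows the NCBI convention: bases T, C, A, G,
# with the first position varying slowest.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1, stopping at the first stop codon.

    Whitespace (such as the 10-base grouping used in the record) is ignored.
    """
    seq = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":  # stop codon
            break
        protein.append(aa)
    return "".join(protein)


# First line of the gene sequence from the record.
first_line = "ATGAAACGCC CCAATTTTCT GTTCATCATG ACCGATACCC AGGCCCCCAA TATGGTCGGT"
print(translate(first_line))  # MKRPNFLFIMTDTQAPNMVG
```

The output matches the first 20 residues of the protein sequence listed above. Passing the full gene sequence to the same function should reproduce all 497 residues.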