Gene Information |
Locus tag | ECH74115_5110 |
Symbol | |
ID | 6970733 |
Type | CDS |
Is gene spliced | No |
Is pseudo gene | No |
Organism name | Escherichia coli O157:H7 str. EC4115 |
Kingdom | Bacteria |
Replicon accession | NC_011353 |
Strand | - |
Start bp | 4750491 |
End bp | 4751984 |
Gene Length | 1494 bp |
Protein Length | 497 aa |
Translation table | 11 |
GC content | 52% |
IMG OID | 643388782 |
Product | sulfatase |
Protein accession | YP_002273208 |
Protein GI | 209399130 |
COG category | [P] Inorganic ion transport and metabolism |
COG ID | [COG3119] Arylsulfatase A and related enzymes |
TIGRFAM ID | |
|
|
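The coordinate and length fields above are related by simple arithmetic, which the sketch below checks. It assumes that Start bp and End bp are 1-based inclusive positions and that the gene length includes the stop codon; the record does not state either convention. The function names are hypothetical.

```python
def gene_length(start_bp, end_bp):
    """Length of a feature given 1-based inclusive coordinates (assumed)."""
    return end_bp - start_bp + 1


def protein_length(gene_len_bp):
    """Residue count for a CDS whose length includes one stop codon (assumed)."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_len_bp // 3 - 1


# Values taken from the Gene Information table.
length_bp = gene_length(4750491, 4751984)
length_aa = protein_length(length_bp)
print(length_bp, length_aa)  # 1494 497
```

Under these assumptions the results agree with the listed gene length of 1494 bp and protein length of 497 aa.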
Plasmid Coverage information |
Num covering plasmid clones | 31 |
Plasmid unclonability p-value | 1 |
Plasmid hitchhiking | No |
Plasmid clonability | normal |
| |
Fosmid Coverage information |
Num covering fosmid clones | 49 |
Fosmid unclonability p-value | 0.36096 |
Fosmid Hitchhiker | No |
Fosmid clonability | normal |
| |
Sequence |
Gene sequence | ATGAAACGCC CCAATTTTCT GTTCATCATG ACCGATACCC AGGCCCCCAA TATGGTCGGT TGCTATAGCG GTAAACCGCT GAATACGCAA AATATTGATA GTCTGGCGGC GGAAGGTATT CGCTTTAATT CCGCCTACAC CTGTTCACCG GTTTGTACAC CGGCGCGCGC CGGGCTGTTT ACCGGTATCT ACGCTAACCA GTCCGGGCCG TGGACCAACA ACGTCGCGCC GGGTAAAAAC ATCTCCACTA TGGGGCGCTA CTTTAAGGAT GCCGGCTATC ACACCTGTTA CATCGGCAAA TGGCATCTCG ACGGTCATGA CTATTTCGGC ACTGGCGAGT GTCCGCCGGA GTGGGACGCT GATTACTGGT TCGATGGGGC GAACTATCTT AGCGAACTGA CGGAAAAAGA GATTAGCCTG TGGCGCAATG GCCTAAACAG CGTCGAAGAT TTACAGGCGA ACCATATTGA CGAAACCTTC ACCTGGGCGC ACCGCATCAG CAATCGAGCG GTGGATTTTC TGCAACAGCC CGCGCGCGCC GAGGAACCCT TCCTGATGGT GGTTTCGTAT GATGAGCCGC ATCACCCGTT CACCTGTCCG GTGGAGTATT TAGAGAAATA CGCTGATTTT TACTACGAGC TGGGCGAGAA AGCACAGGAT GACCTGGCGA ACAAACCAGA ACATCACCGC TTATGGGCGC AGGCGATGCC ATCGCCAGTC GGTGATGACG GGCTTTATCA CCATCCGCTC TATTTTGCCT GTAATGACTT TGTTGATGAC CAAATCGGAC GGGTCATCAA TGCCTTAAAG CCAGAGCAAC GTGAAAATAC GTGGGTTATT TATACCTCCG ATCACGGCGA AATGATGGGC GCACATAAGC TGATCAGTAA AGGGGCGGCG ATGTATGACG ACATCACCCG CATTCCGCTG ATCATCCGTT CGCCGCAAGG GGAGCGGCGA CAGGTCGATA CGCCAGTCAG TCATATCGAT TTACTGCCGA CAATGATGGC GCTGGCAGAT ATTGAAAAAC CAGAGATTCT GCCGGGGGAA AATATCCTTG CAGTGAAAGA GCCACGCGGC GTGATGGTGG AATTTAACCG CTACGAGATT GAGCATGACA GCTTTGGCGG TTTTATTCCG GTGCGTTGCT GGGTGACGGA TGACTTTAAA CTGGTGCTCA ACCTCTTCAC CAGTGATGAA CTTTACGATC GCCGTAATGA TCCAAATGAA ATGCATAACC TGATCGATGA TGCCCATTTT GCCGACGTTC GCAGCAAAAT GCATGATGCC TTATTGGATT ATATGGACAA AATTCGCGAT CCGTTCCGCA GTTACCAATG GAGCCTGCGT CCGTGGCGTA AAGATGCACA GCCGCGCTGG ATGGGGGCAT TTCGTCCACG CCCACAAGAT GGTTATTCGC CAGTGGTACG CGACTATGAC ACCGGCCTGC CGACACAAGG GGTGAAGGTG GAAGAGAAAA AACAGAAGTT CTGA
|
Protein sequence | MKRPNFLFIM TDTQAPNMVG CYSGKPLNTQ NIDSLAAEGI RFNSAYTCSP VCTPARAGLF TGIYANQSGP WTNNVAPGKN ISTMGRYFKD AGYHTCYIGK WHLDGHDYFG TGECPPEWDA DYWFDGANYL SELTEKEISL WRNGLNSVED LQANHIDETF TWAHRISNRA VDFLQQPARA EEPFLMVVSY DEPHHPFTCP VEYLEKYADF YYELGEKAQD DLANKPEHHR LWAQAMPSPV GDDGLYHHPL YFACNDFVDD QIGRVINALK PEQRENTWVI YTSDHGEMMG AHKLISKGAA MYDDITRIPL IIRSPQGERR QVDTPVSHID LLPTMMALAD IEKPEILPGE NILAVKEPRG VMVEFNRYEI EHDSFGGFIP VRCWVTDDFK LVLNLFTSDE LYDRRNDPNE MHNLIDDAHF ADVRSKMHDA LLDYMDKIRD PFRSYQWSLR PWRKDAQPRW MGAFRPRPQD GYSPVVRDYD TGLPTQGVKV EEKKQKF
|
| |
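The gene and protein sequences above can be cross-checked by translating the DNA. The sketch below is a minimal illustration, not the database's own method. It assumes the gene sequence is shown in coding orientation (it begins with ATG even though the strand is "-"), and that translation table 11 assigns codons to amino acids exactly as the standard code does, differing only in its permitted start codons. The helper names are hypothetical, and only the first 30 and last 24 nucleotides are used to keep the example short.

```python
# Standard codon assignments in NCBI order (bases T, C, A, G).
# Assumption: table 11 shares these assignments with the standard code.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def clean(seq):
    """Remove the spaces that split the record's sequences into blocks of 10."""
    return "".join(seq.split()).upper()


def translate(dna):
    """Translate coding-strand DNA, stopping at the first stop codon."""
    dna = clean(dna)
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# Fragments copied from the Gene sequence field above.
first_30_nt = "ATGAAACGCC CCAATTTTCT GTTCATCATG"
last_24_nt = "GAAGAGAAAA AACAGAAGTT CTGA"

print(translate(first_30_nt))  # MKRPNFLFIM
print(translate(last_24_nt))   # EEKKQKF
```

Both fragments match the corresponding ends of the listed protein sequence. Passing the full gene sequence to the same function should give a string that can be compared against the cleaned protein sequence.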