Gene Information |
Locus tag | EcSMS35_4043 |
Symbol | |
ID | 6145093 |
Type | CDS |
Is gene spliced | No |
Is pseudo gene | No |
Organism name | Escherichia coli SMS-3-5 |
Kingdom | Bacteria |
Replicon accession | NC_010498 |
Strand | - |
Start bp | 4133168 |
End bp | 4134661 |
Gene Length | 1494 bp |
Protein Length | 497 aa |
Translation table | 11 |
GC content | 52% |
IMG OID | 641618868 |
Product | sulfatase |
Protein accession | YP_001746006 |
Protein GI | 170681583 |
COG category | [P] Inorganic ion transport and metabolism |
COG ID | [COG3119] Arylsulfatase A and related enzymes |
TIGRFAM ID | |
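The coordinate and length fields above can be cross-checked with simple arithmetic. The sketch below assumes the Start bp and End bp values are 1-based and inclusive, and that the CDS is unspliced with a single terminal stop codon (the record lists "Is gene spliced | No"). The variable names are illustrative only.

```python
# Values copied from the record above.
start_bp = 4133168
end_bp = 4134661

# Assumption: coordinates are 1-based and inclusive, so the span is end - start + 1.
gene_length = end_bp - start_bp + 1

# Assumption: one codon per residue, minus one terminal stop codon.
protein_length = gene_length // 3 - 1

print(gene_length, protein_length)  # 1494 497
```

Both results agree with the Gene Length (1494 bp) and Protein Length (497 aa) fields.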
Plasmid Coverage information |
Num covering plasmid clones | 29 |
Plasmid unclonability p-value | 1 |
Plasmid hitchhiking | No |
Plasmid clonability | normal |
| |
Fosmid Coverage information |
Num covering fosmid clones | 49 |
Fosmid unclonability p-value | 1 |
Fosmid Hitchhiker | No |
Fosmid clonability | normal |
| |
Sequence |
Gene sequence | ATGAAACGCC CCAATTTTCT GTTCATCATG ACCGATACCC AGGCCACCAA TATGGTCGGT TGCTATAGCG GTAAGCCGCT GAATACGCAA AATATTGATA GTCTGGCGGC GGAAAGTATT CGCTTTAATT CCGCCTATAC CTGTTCACCG GTTTGTACAC CAGCGCGCGC CGGGCTGTTT ACCGGTATCT ACGCTAACCA GTCCGGCCCG TGGACCAACA ACGTCGCGCC GGGCAAAAAC ATCTCCACTA TGGGACGCTA CTTTAAGGAT GCGGGCTATC ACACCTGTTA CATCGGCAAA TGGCATCTCG ACGGTCATGA CTATTTCGGC ACTGGCGAGT GTCCGCCGGA GTGGGACGCT GATTACTGGT TCGATGGGGC GAACTACCTT AGCGAACTGA CGGAAAAAGA GATCAGCCTG TGGCGCAATG GCCTAAACAG CGTTGAGGAT TTACAGGCGA ACCATATCGA CGAAACCTTC ACCTGGGCGC ATCGCATCAG CAATCGGGCG GTAGATTTTC TGCAACAGCC CGCGCGCGCC GAGGAACCCT TCCTGATGGT GGTTTCGTAT GATGAGCCGC ATCACCCGTT CACCTGTCCG GTGGAGTATT TAGAGAAATA CGCTGATTTT TACTACGAAC TTGGCGAGAA ATCACAGGAT GACCTGGCGA ACAAACCGGA ACATCACCGC TTATGGGCGC AGGCGATGCC ATCGCCAGTC GGTGATGACG GGCTTTATCA CCATCCGCTC TATTTTGCCT GCAATGACTT TGTTGATGAC CAAATCGGAC GGGTCATCAA TGCCTTAACG CCAGAGCAAC GTGAAAATAC GTGGGTCATT TATACCTCCG ATCACGGTGA AATGATGGGC GCACATAAGT TGATCAGTAA AGGAGCGGCG ATGTATGACG ACATCACCCG TATTCCGCTG ATCATCCGTT CGCCGCAAGG GGAGCGGCGG CAGGTCGATA CGCCAGTCAG TCATATCGAT TTACTGCCGA CAATGATGGC GCTGGCAGAT ATTGAAAAAC CAGAGATTCT GCCGGGGGAA AATATCCTTG CCGTGAAAGA GCCACGCGGC GTAATGGTGG AATTTAACCG CTACGAGATT GAGCATGACA GCTTTGGCGG TTTTATTCCG GTGCGTTGCT GGGTGACGGA TGACTTTAAA CTGGTACTCA ACCTCTTCAC CAGTGATGAA CTTTACGATC GCCGTAATGA CCTAAATGAA ATGCATAATC TGATCGATGA TATCCGTTTT GCCGACGTTC GCCGCAAAAT GCATGACGCC TTATTGGATT ACATGGATAA AATTCGCGAT CCGTTCCGCA GTTACCAATG GAGTCTGCGT CCGTGGCGTA AAGATGCACG ACCGCGCTGG ATGGGGGCGT TTCGTCCACG CCCACAAGAT GGCTATTCGC CAGTGGTACG CGACTATGAC ACCGGCCTAC CGACACAAGG GGTGAAGGTG GAAGAGAAAA AACAGAAGTT CTGA
|
Protein sequence | MKRPNFLFIM TDTQATNMVG CYSGKPLNTQ NIDSLAAESI RFNSAYTCSP VCTPARAGLF TGIYANQSGP WTNNVAPGKN ISTMGRYFKD AGYHTCYIGK WHLDGHDYFG TGECPPEWDA DYWFDGANYL SELTEKEISL WRNGLNSVED LQANHIDETF TWAHRISNRA VDFLQQPARA EEPFLMVVSY DEPHHPFTCP VEYLEKYADF YYELGEKSQD DLANKPEHHR LWAQAMPSPV GDDGLYHHPL YFACNDFVDD QIGRVINALT PEQRENTWVI YTSDHGEMMG AHKLISKGAA MYDDITRIPL IIRSPQGERR QVDTPVSHID LLPTMMALAD IEKPEILPGE NILAVKEPRG VMVEFNRYEI EHDSFGGFIP VRCWVTDDFK LVLNLFTSDE LYDRRNDLNE MHNLIDDIRF ADVRRKMHDA LLDYMDKIRD PFRSYQWSLR PWRKDARPRW MGAFRPRPQD GYSPVVRDYD TGLPTQGVKV EEKKQKF
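The gene sequence is displayed in space-separated blocks of ten bases and, although the record lists the minus strand, it is shown in coding orientation (it begins with ATG and ends with TGA). As a minimal sketch of how the two sequence fields relate, the code below strips the block spacing and translates codons with the amino acid assignments of translation table 11, which are the same as those of the standard code. The helper names `clean`, `translate`, and `gc_fraction` are hypothetical and not part of any database API.

```python
from itertools import product

# Codon table in NCBI ordering (bases T, C, A, G). Translation table 11
# uses these same amino acid assignments; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def clean(seq: str) -> str:
    """Remove the block spacing used in the record and upper-case the bases."""
    return "".join(seq.split()).upper()


def translate(dna: str) -> str:
    """Translate an in-frame coding sequence, stopping at the first stop codon."""
    dna = clean(dna)
    residues = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        residues.append(aa)
    return "".join(residues)


def gc_fraction(dna: str) -> float:
    """Fraction of G and C bases in the sequence."""
    dna = clean(dna)
    return (dna.count("G") + dna.count("C")) / len(dna)


# First three and last three blocks of the gene sequence in the record.
head = "ATGAAACGCC CCAATTTTCT GTTCATCATG"
tail = "GAAGAGAAAA AACAGAAGTT CTGA"

print(translate(head))  # MKRPNFLFIM
print(translate(tail))  # EEKKQKF
```

The two printed fragments match the start and end of the protein sequence above. Passing the full gene sequence to `translate` and `gc_fraction` would allow the Protein Length and GC content fields to be checked in the same way.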
|
| |