Gene EcSMS35_3333 details


Gene Information

Locus tag: EcSMS35_3333
Symbol:
ID: 6145661
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Escherichia coli SMS-3-5
Kingdom: Bacteria
Replicon accession: NC_010498
Strand:
Start bp: 3410588
End bp: 3412384
Gene Length: 1797 bp
Protein Length: 598 aa
Translation table: 11
GC content: 49%
IMG OID: 641618162
Product: arylsulfate sulfotransferase
Protein accession: YP_001745312
Protein GI: 170680035
COG category:
COG ID:
TIGRFAM ID:
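
The coordinate and length fields above can be cross-checked with simple arithmetic. The sketch below is illustrative only: it assumes the start and end positions are 1-based and inclusive, and that the unspliced CDS ends in a single stop codon. The variable names are hypothetical and are not part of the record.

```python
# Values copied from the Gene Information fields above.
start_bp = 3410588
end_bp = 3412384
gene_length_bp = 1797
protein_length_aa = 598

# Assumption: coordinates are 1-based and inclusive, so the span is end - start + 1.
span_bp = end_bp - start_bp + 1

# Assumption: the last codon is a stop codon and is not counted in the protein length.
codons, remainder = divmod(gene_length_bp, 3)
expected_protein_length = codons - 1

print(span_bp == gene_length_bp)                      # coordinates agree with the gene length
print(remainder == 0)                                 # the CDS is a whole number of codons
print(expected_protein_length == protein_length_aa)   # protein length agrees with the gene length
```

Under these assumptions all three checks print True for this record.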


Plasmid Coverage information

Num covering plasmid clones: 27
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 53
Fosmid unclonability p-value:
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGTTTGATA AATATAGAAA AACACTCGTA GCCGGAACTG TGGCGATAAC CCTGGGTTTG 
TCAGCATCGG GGGTGATGGC TGCGGGTTTT AAACCAGCGC CGCCTGCCGG GCAACTGGGT
GCGGTCATTG TCGATCCCTA CGGCAATGCA CCACTGACCG CTTTGGTTGA CTTAGATAGC
CATGTTATTT CTGACGTCAG AGTTACCGTC CATGGGAAGG GCGAAAAAGG CGTAGAAATC
AGCTATCCCG TGGGTCAGGA ATCACTAAAA ACTTACGATG GTGTACCGAT TTTTGGTCTT
TATCAGAAAT TTGCTAACAA AGTGACCGTT GAGTGGAAAG AAAACGGCAA GGTCATGAAA
GATGATTATG TGGTGCACAC TTCGGCCATC GTCAATAATT ACATGGATAA CCGCTCTATC
TCCGATTTAC AACAGACCAA AGTTATTAAA GTCGCACCGG GTTTTGAAGA TCGCCTCTAT
CTGGTTAATA CCCACACCTT TACCGCCCAA GGTTCCGATC TCCACTGGCA TGGTGAGAAA
GATAAAAATG CCGGTATCCT TGATGCGGGT CCGGCAACTG GCGCACTCCC TTTTGATATC
GCGCCATTCA CCTTTATCGT CGATACGGAA GGCGAATACC GCTGGTGGTT GGATCAAAAC
ACCTTCTACG ATGGTCGTGA CCGCAACATT AACAAACGTG GTTATCTGAT GGGTATCCGC
GAAACGCCAC GCGGCACCTT TACCGCTGTA CAAGGTCAGC ACTGGTACGA GTTCGACATG
ATGGGGCAGG TGCTCGAAGA TCACAAACTA CCGCGCGGAT TTGCTGACGC TACTCATGAA
TCCATTGAGA CGCCAAATGG CACGGTACTG TTGCGCGTAG GTAAGAGTAA CTATCGTCGC
GATGACGGCG TACACGTCAC CACCATTCGT GACCATATCC TCGAAGTCGA TAAATCTGGT
CGCGTTGTAG ATGTATGGGA TCTGACGAAG ATCCTCGATC CGAAACGCGA TGCACTGCTC
GGCGCGCTGG ATGCAGGTGC AGTTTGCGTT AACGTTGACC TTGCCCATGC AGGACAACAG
GCAAAACTGG AACCAGATAC ACCGTTCGGC GACGCTCTGG GTGTAGGGCC AGGCCGTAAC
TGGGCGCACG TTAATTCCAT CGCTTATGAC GCAAAAGATG ACTCAATTAT TCTCTCTTCT
CGTCACCAGG GTGTTGTGAA GATTGGTCGT GATAAGCAAG TGAAATGGAT CCTTGCACCC
TCTAAAGGTT GGGAAAAACC GCTGGCCAGC AAGCTGCTGA AACCGGTTGA TGCTAACGGT
AAGCCAATTA CCTGTAACGA AAATGGCCTG TGCGAAAACT CAGACTTCGA CTTTACCTAC
ACCCAGCATA CCGCCTGGAT TTCCAGCAAA GGAACGCTCA CCATTTTTGA TAATGGCGAT
GGTCGTCATC TGGAACAACC TGCCTTACCA ACCATGAAAT ATTCCCGCTT TGTGGAATAT
AAGATTGATG AGAAGAAAGG CACCGTTCAG CAAGTGTGGG AATACGGTAA AGAACGTGGC
TACGATTTCT ATAGCCCAAT CACCTCCATC ATTGAATATC AAGCCGACCG TAACACCATG
TTTGGCTTCG GTGGTTCTAT TCATTTGTTC GATGTCGGGC AGCCAACCGT CGGTAAGTTG
AACGAAATCG ATTACAAAAC CAAAGAAGTG AAAGTGGAAA TCGACGTGCT GTCAGATAAA
CCCAATCAGA CTCACTATCG TGCATTGTTA GTCCGTCCAC AACAGATGTT CAAATAA
 
Protein sequence
MFDKYRKTLV AGTVAITLGL SASGVMAAGF KPAPPAGQLG AVIVDPYGNA PLTALVDLDS 
HVISDVRVTV HGKGEKGVEI SYPVGQESLK TYDGVPIFGL YQKFANKVTV EWKENGKVMK
DDYVVHTSAI VNNYMDNRSI SDLQQTKVIK VAPGFEDRLY LVNTHTFTAQ GSDLHWHGEK
DKNAGILDAG PATGALPFDI APFTFIVDTE GEYRWWLDQN TFYDGRDRNI NKRGYLMGIR
ETPRGTFTAV QGQHWYEFDM MGQVLEDHKL PRGFADATHE SIETPNGTVL LRVGKSNYRR
DDGVHVTTIR DHILEVDKSG RVVDVWDLTK ILDPKRDALL GALDAGAVCV NVDLAHAGQQ
AKLEPDTPFG DALGVGPGRN WAHVNSIAYD AKDDSIILSS RHQGVVKIGR DKQVKWILAP
SKGWEKPLAS KLLKPVDANG KPITCNENGL CENSDFDFTY TQHTAWISSK GTLTIFDNGD
GRHLEQPALP TMKYSRFVEY KIDEKKGTVQ QVWEYGKERG YDFYSPITSI IEYQADRNTM
FGFGGSIHLF DVGQPTVGKL NEIDYKTKEV KVEIDVLSDK PNQTHYRALL VRPQQMFK
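
The sequences above are printed in blocks of ten characters, so the spaces and line breaks have to be removed before any computation. The sketch below shows one way to clean a block, compute its GC percentage, and translate it. It is a minimal illustration, not the database's own method: the helper names (`clean`, `gc_percent`, `translate`) are hypothetical, and the codon table is built from the amino acid assignments of NCBI translation table 11, which are the same as those of the standard code.

```python
from itertools import product

# Amino acid assignments for NCBI translation table 11, in TCAG codon order.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product("TCAG", repeat=3), AMINO_ACIDS)
}

def clean(seq: str) -> str:
    """Remove the spaces and line breaks used for display and upper-case the result."""
    return "".join(seq.split()).upper()

def gc_percent(seq: str) -> float:
    """Return the percentage of G and C bases in a cleaned sequence."""
    seq = clean(seq)
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)

def translate(seq: str) -> str:
    """Translate complete codons in frame 1, stopping at the first stop codon."""
    seq = clean(seq)
    protein = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

# First line of the gene sequence above, as displayed.
first_line = "ATGTTTGATA AATATAGAAA AACACTCGTA GCCGGAACTG TGGCGATAAC CCTGGGTTTG"
print(translate(first_line))
```

Applied to the first line of the gene sequence, `translate` returns the first twenty residues of the protein sequence, MFDKYRKTLVAGTVAITLGL. Passing the full gene sequence to `gc_percent` gives a value that can be compared with the GC content field.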