Gene TM1040_3133 details


Gene Information

Locus tag: TM1040_3133
Symbol:
ID: 4075005
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Ruegeria sp. TM1040
Kingdom: Bacteria
Replicon accession: NC_008043
Strand:
Start bp: 108763
End bp: 110277
Gene Length: 1515 bp
Protein Length: 504 aa
Translation table: 11
GC content: 58%
IMG OID: 638004636
Product: sulfatase
Protein accession: YP_611369
Protein GI: 99078111
COG category: [P] Inorganic ion transport and metabolism
COG ID: [COG3119] Arylsulfatase A and related enzymes
TIGRFAM ID: [TIGR03417] choline-sulfatase
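
The start, end, gene length, and protein length fields above are related by simple arithmetic. The sketch below checks that relationship. It assumes the coordinates are 1-based and inclusive and that the annotated span includes the stop codon; the record does not state either convention. The function name `feature_lengths` is hypothetical.

```python
def feature_lengths(start_bp: int, end_bp: int) -> tuple[int, int]:
    """Return (gene length in bp, protein length in aa) for a CDS.

    Assumes 1-based inclusive coordinates and that the final codon
    of the span is a stop codon (an assumption, not stated in the record).
    """
    gene_len = end_bp - start_bp + 1
    if gene_len % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    # One codon per residue, minus the terminal stop codon.
    protein_len = gene_len // 3 - 1
    return gene_len, protein_len


gene_len, protein_len = feature_lengths(108763, 110277)
print(gene_len, protein_len)  # 1515 504
```

Under those assumptions the result agrees with the Gene Length and Protein Length fields of the record.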


Plasmid Coverage information

Num covering plasmid clones: 25
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 18
Fosmid unclonability p-value:
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGACATTGC CAAACATCCT CATTTTTATG GTCGATCAGT TGAACGGGAC CCTGTTCCCG 
GATGGCCCCG CAGAATGGCT GCACGCACCA AACATGAAGA AACTGGCCGC GCGGTCTACC
CGGTTTCGCA ATTGCTATAC CGCCAGCCCG CTCTGTGCGC CGGGTCGGGC CAGTTTCATG
TCCGGGCAGC TGCCGTCTGC CACGGGCGTC TACGACAACG CGGCGGAATT CGCCTCTTCA
ATCCCGACGT ATGCCCATCA TCTGCGCCGC GCAGGCTATT ACACCTGCCT ATCGGGCAAG
ATGCATTTTG TCGGCCCAGA TCAGCTTCAT GGCTTTGAAG AACGTCTGAC AACCGATATC
TACCCACCCG ATTTCGGTTG GACCCCGGAC TATCGCAAAC CCGGCGAGCG CATCGACTGG
TGGTATCACA ACATGGGGTC GGTCACCGGC GCCGGGGTGG CGGAGATTTC GAACCAGATG
GAGTTTGATG ACGAGGTCGC CTTTCACGCG ACCCAAAAGA TCTACGACCT GGCGCGCGGC
AAGGACGCCC GGCCGTGGTG CCTCACCGTC AGCTTTACGC ACCCCCATGA TCCCTATGTG
ACTCGTAAAA AATACTGGGA TCTATACGAG GATTGCCCGC ATCTTATGCC GGAGGTCGCG
GATCTCGGCT ATGAGAACCA GGATCCGCAC TCGAAACGGA TCTTTGACGC AAATGACTGG
CGCAACTTTG ACATCACCGA AGAAGACATC CGCAGGTCGC GTCGCGCGTA TTTCGGCAAT
ATCTCCTATC TCGACGACAA GATCGGCGAG GTCATGGAAG CGCTGGAAGG AACGCGTCAG
GACAAGGATA CGATCATTCT CTTTGTCTCG GATCACGGCG ACATGCTGGG AGAGCGCGGC
CTGTGGTTCA AGATGAGCTT TTATGAGGGG TCCTCACGCG TTCCGATGAT GATTTCAGCG
CCCAATATGA CCCCTGGCCT GGTTTGCGAT CCGGTCTCCA ACATCGATGT CTGTCCAACG
CTTTGCGATC TGGCAGGTGT GAGCATGTCC GAGGTAATGC CTTGGACCGC TGGGGAAAGC
CTGGTCCCGC TTGGCCAAGG TGGCACGCGC AGCACGCCGG TGGCGATGGA ATATGCAGCC
GAAGCCTCTT ATGCCCCGAT GGTCTCCTTG CGGTCGGGGC GCTACAAGCT CAATCTTTGT
GCGCTTGATC CGGACCAGCT GTTTGATCTG GACGCCGACC CACATGAACG GGTGAATCTC
GCCAAAGATC CCACCCACCA CGAGGCTTAT CAGGCGCTCA AGGCGATTGC GGCCGAGCGC
TGGGATCTGG ATCGATTTGA CGCCGATGTG CGCGCCAGCC AGGCGCGGCG CTGGGTGGTA
TATGAGGCGC TCCGCCAGGG CGGCTATTTC CCGTGGGATT ATCAACCCCT GCAAAAAGCG
TCCGAACGCT ACATGCGCAA CCATATGGAT TTGAATGTGG TCGAAGACCA AGCCCGCTAC
CCGCGCGGAG AATAA
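
The gene sequence is printed in space-separated blocks of ten bases, so the spacing has to be removed before computing anything such as the GC content field. This is a minimal sketch of that cleanup, applied only to the first row of the listing rather than the full gene. The helper names `clean_sequence` and `gc_fraction` are hypothetical.

```python
def clean_sequence(raw: str) -> str:
    """Strip spaces and line breaks from a blocked sequence listing."""
    return "".join(raw.split()).upper()


def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C."""
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


# First row of the gene sequence listing, copied verbatim.
first_row = "ATGACATTGC CAAACATCCT CATTTTTATG GTCGATCAGT TGAACGGGAC CCTGTTCCCG"
seq = clean_sequence(first_row)
print(len(seq), round(gc_fraction(seq), 3))
```

Passing the whole listing through the same two functions gives the value to compare with the GC content field. A single row will generally differ from the gene-wide figure.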
 
Protein sequence
MTLPNILIFM VDQLNGTLFP DGPAEWLHAP NMKKLAARST RFRNCYTASP LCAPGRASFM 
SGQLPSATGV YDNAAEFASS IPTYAHHLRR AGYYTCLSGK MHFVGPDQLH GFEERLTTDI
YPPDFGWTPD YRKPGERIDW WYHNMGSVTG AGVAEISNQM EFDDEVAFHA TQKIYDLARG
KDARPWCLTV SFTHPHDPYV TRKKYWDLYE DCPHLMPEVA DLGYENQDPH SKRIFDANDW
RNFDITEEDI RRSRRAYFGN ISYLDDKIGE VMEALEGTRQ DKDTIILFVS DHGDMLGERG
LWFKMSFYEG SSRVPMMISA PNMTPGLVCD PVSNIDVCPT LCDLAGVSMS EVMPWTAGES
LVPLGQGGTR STPVAMEYAA EASYAPMVSL RSGRYKLNLC ALDPDQLFDL DADPHERVNL
AKDPTHHEAY QALKAIAAER WDLDRFDADV RASQARRWVV YEALRQGGYF PWDYQPLQKA
SERYMRNHMD LNVVEDQARY PRGE