Gene TM1040_2113 details


Gene Information

Locus tag: TM1040_2113
Symbol:
ID: 4076427
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Ruegeria sp. TM1040
Kingdom: Bacteria
Replicon accession: NC_008044
Strand:
Start bp: 2217848
End bp: 2219566
Gene Length: 1719 bp
Protein Length: 572 aa
Translation table: 11
GC content: 61%
IMG OID: 638007432
Product: sulfatase
Protein accession: YP_614107
Protein GI: 99081953
COG category: [P] Inorganic ion transport and metabolism
COG ID: [COG3119] Arylsulfatase A and related enzymes
TIGRFAM ID:
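
The gene and protein lengths listed above follow from the start and end coordinates. The sketch below reproduces that arithmetic. It assumes the coordinates are 1-based and inclusive, and that the unspliced CDS ends in a stop codon that is not counted as a residue. The function names are illustrative and do not come from the database.

```python
def gene_length(start_bp: int, end_bp: int) -> int:
    """Length in bp of a feature with 1-based, inclusive coordinates (assumed)."""
    return end_bp - start_bp + 1


def protein_length(gene_len_bp: int) -> int:
    """Residue count for an unspliced CDS whose final codon is a stop codon (assumed)."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    # One codon per residue, minus the terminal stop codon.
    return gene_len_bp // 3 - 1


length = gene_length(2217848, 2219566)
print(length, protein_length(length))  # 1719 572
```

Both values match the Gene Length and Protein Length fields of this record.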


Plasmid Coverage information

Num covering plasmid clones: 15
Plasmid unclonability p-value: 0.367046
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 19
Fosmid unclonability p-value:
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGCCACCCA GCGGCATGTT TTGCCAAAGG GTTTTTGTGC CTCCCGACCT AACCTATCGT 
CGTTTTGGCA CTCGGAACAA GGCATGCGCT ATGAATATTC TCTTTATCAT GTTCGACCAG
CTCCGGTTCG ACTACCTGAG CTGCGCGGGC CACCCGCATC TCAAGACGCC TCATATCGAC
CGGCTGGCCG AGCGCGGCGT GCGGTTCACT AACGCTTATG TGCAATCCCC GATCTGCGGC
GCCGCACGGA TGAGCTGCTA TACCGGGCGC TATGTGTCCA GCCACGGGGC GCAATGGAAC
AACTCGCCCT TGCGGGTGGG CGAATGGACC ATGGGGGATC ACCTGCGCAT GGCGGGCATG
GGCTGCTGGC TCATTGGCAA GACCCATATG AATGCCGACA GCGAGGGCAT GGCCCGCCTC
GGCCTCAGCC CCGACAGCGT GATTGGCGCA CGCCAGGCCG AATGCGGCTT TGACGTCTGG
ATCCGCGACG ATGGCCTCTG GGCCGAGGGG CCGGACGGGT TTTATGACCA AAAACGCAGC
CCCTATAATG AATACCTTAA ATCAAAAGGT TACGCGGGCC ACAACCCCTG GCACGACTTC
GCCAATGCGG GTCTTGAGGG CGAAGAAATG GCGTCCGGCT GGTTCATGGC CAATGCGCAG
AAGGCCGCGA ATATTGCCGA GGAAGACAGC GAAACCCCGT GGCTCACCAC CAAGACCATC
GAGTTCATTG AACAGGCCGA GGGGCCATGG TGCGCGCATG TGAGCTATAT CAAGCCGCAT
TGGCCCTATA TCGTGCCCGC GCCCTATCAC GACATGTACG GACCGGAGCA TGTGCTGCCC
GCCGTCAAAG ACCCCGCGGA GCGCGAAGAC CCGCACCCCG TTTACGGTGC CTTCATGGGC
AATGCCATCG GTCAGGCCTT CTCGCGCGAG GAAGTCCGAC AGGCCGCCAT CCCCGCCTAT
ATGGGCCTCA TCAAGCAATG CGACGACCAG ATGGGGCGGC TGTTTGAGTA CCTTGAGGAC
ACCGGCCGGA TGGATGACAC GATGATCGTG ATCACCTCTG ACCACGGCGA CTATCTGGGC
GATCACTGGT TGGGCGAGAA GGATCTCTTT CACGAACCCT CCGTCAAAGT GCCGATGATC
ATCTATGATC CCCGCCCCGA CGCCGACGCC ACCCGAGGCA CCACCTGCGA CGCGCTGGTG
GAAAACATCG ACCTGCTGCC CACTTTCGTG GAGGCCGCAG GCGGCGAGGT CGCAGATCAC
ATTCTGGAGG GGCGCGCGCT TACACCATGG CTGCATGGTC AGACACCCGA GGTGTGGCGG
GACTACGCAA TCAGCGAATA CGACTATTCC GGCACGCCGA TGAGTGTGAA GCTTGGCAGC
GCCCCCCGCG ATGCGCGGCT GTTTATGGTG ACGGACACAC GCTGGAAATT CATGCACGCC
GAGGGCGGCC TGCCGCCAAT GCTATTTGAT CTGGAAAACG ACCCGCAGGA ATTTCACGAC
CTTGGCCGCA GCCCAGACCA CACCGAGGTG ATCGATATGA TGTATGCGCG CCTCGGTCAG
TGGGGGCGGC GCATGTCGCA ACGCATCACC CGCTCGGACG CGCAGATCAT TGCGGGGCGC
GGCGCTTCAC GCGGCAAAGG CATTTTGCTT GGGGTCTATG AACCTGAGGA CGTCCCCGCT
GAGTTAACCG TAAAATATCG CGGCAAACCG CCGACCTGA
 
Protein sequence
MPPSGMFCQR VFVPPDLTYR RFGTRNKACA MNILFIMFDQ LRFDYLSCAG HPHLKTPHID 
RLAERGVRFT NAYVQSPICG AARMSCYTGR YVSSHGAQWN NSPLRVGEWT MGDHLRMAGM
GCWLIGKTHM NADSEGMARL GLSPDSVIGA RQAECGFDVW IRDDGLWAEG PDGFYDQKRS
PYNEYLKSKG YAGHNPWHDF ANAGLEGEEM ASGWFMANAQ KAANIAEEDS ETPWLTTKTI
EFIEQAEGPW CAHVSYIKPH WPYIVPAPYH DMYGPEHVLP AVKDPAERED PHPVYGAFMG
NAIGQAFSRE EVRQAAIPAY MGLIKQCDDQ MGRLFEYLED TGRMDDTMIV ITSDHGDYLG
DHWLGEKDLF HEPSVKVPMI IYDPRPDADA TRGTTCDALV ENIDLLPTFV EAAGGEVADH
ILEGRALTPW LHGQTPEVWR DYAISEYDYS GTPMSVKLGS APRDARLFMV TDTRWKFMHA
EGGLPPMLFD LENDPQEFHD LGRSPDHTEV IDMMYARLGQ WGRRMSQRIT RSDAQIIAGR
GASRGKGILL GVYEPEDVPA ELTVKYRGKP PT
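
The sequences above are printed in space-separated blocks of ten characters, so they need to be joined before any length or composition check. The following is a minimal sketch of that clean-up and of a GC-fraction calculation. The helper names are hypothetical, and the example input is only the first two blocks of the gene sequence, not the whole gene.

```python
def clean_sequence(text: str) -> str:
    """Join whitespace-separated sequence blocks into one uppercase string."""
    return "".join(text.split()).upper()


def gc_fraction(text: str) -> float:
    """Fraction of G and C bases in a (possibly block-formatted) nucleotide sequence."""
    seq = clean_sequence(text)
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in seq) / len(seq)


# First two blocks of the gene sequence, used only as a small example.
example = "ATGCCACCCA GCGGCATGTT"
print(len(clean_sequence(example)))  # 20
print(gc_fraction(example))
```

Running `clean_sequence` and `gc_fraction` on the full gene sequence should give values comparable to the Gene Length and GC content fields, assuming the listing is complete.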