Gene Rleg2_5790 details


Gene Information

Locus tag: Rleg2_5790
Symbol:
ID: 6977179
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Rhizobium leguminosarum bv. trifolii WSM2304
Kingdom: Bacteria
Replicon accession: NC_011366
Strand:
Start bp: 199592
End bp: 201106
Gene Length: 1515 bp
Protein Length: 504 aa
Translation table: 11
GC content: 64%
IMG OID: 643393245
Product: choline-sulfatase
Protein accession: YP_002278063
Protein GI: 209546173
COG category: [P] Inorganic ion transport and metabolism
COG ID: [COG3119] Arylsulfatase A and related enzymes
TIGRFAM ID: [TIGR03417] choline-sulfatase
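
The Start bp, End bp, Gene Length and Protein Length fields are mutually consistent, which a few lines of arithmetic can confirm. The sketch below assumes 1-based inclusive coordinates and that the last codon of the CDS is a stop codon; neither convention is stated in the record, and the function names are hypothetical.

```python
def gene_length(start_bp: int, end_bp: int) -> int:
    """Length in bp of a feature, assuming 1-based inclusive coordinates."""
    return end_bp - start_bp + 1


def protein_length(gene_len_bp: int) -> int:
    """Residue count of a CDS, assuming the final codon is a stop codon."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_len_bp // 3 - 1


length = gene_length(199592, 201106)
print(length)                  # 1515
print(protein_length(length))  # 504
```

Both results match the Gene Length (1515 bp) and Protein Length (504 aa) fields of the record.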


Plasmid Coverage information

Num covering plasmid clones: 22
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 13
Fosmid unclonability p-value: 0.0271234
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGGCGCGCC CGAATATTCT CATCCTGATG GTGGATCAGC TGAACGGGAC CTTGTTTCCC 
GACGGGCCTG CCGACTTCCT GCATGCGCCG CATCTGAAAT CATTGGCCGA GCGTTCCGTG
CGCTTCGCCA ACACCTATAC GGCGAGCCCA CTCTGCGCAC CGGCGCGGGC GTCCTTCATG
TCCGGGCAAT TGCCGAGCCG GACGCGCGTC TACGACAATG CCGCCGAATT CGCCTCCGAT
ATTCCGACCT ATGCGCATCA TCTGCGCGCC GCCGGATATC AGACGGCGCT GTCGGGCAAG
ATGCATTTCG TCGGCCCCGA CCAACTTCAT GGCTTCGAGG AGCGGCTGAC GACGGATATC
TACCCCGCCG ATTTCGGCTG GACGCCCGAT TATAGCAAGC CCGGCGAGCG CATAGACTGG
TGGTATCACA ATTTGGGTTC AGTGACCGGC GCCGGCGTTG CCGAGATCAC CAACCAGATG
GAATATGACG ACGAGGTCGC CTACAACGCC ACCCGCAAGC TGTTCGATCT CTCACGCGCC
CACGACGGGC GCCCCTGGTG CCTGACCGTC AGCTTCACCC ATCCGCACGA CCCCTATGTC
GCACGCCGCA AATTCTGGGA CCTCTATGAA GACTGCCCGG CGCTCGACCC GGCGGTGGCG
CCGATTGCCT TCGAGCGACA GGACCCACAT TCTCAGCGGC TGATGAAAGC CTGCGACCAC
GAGGCTTTCG CCATCAGCCC GGAAGGAATC CGGCGGGCAA GGCGGGGCTA CTTCGCCAAT
ATCTCCTATG TCGACGAGAA GATCGGCGAC ATTCTCGGCG TGCTGGAACG GACCCGCATG
GCCGAGAACA CGATCATCCT CTTCGTTTCC GATCATGGCG ACATGCTCGG CGACCGCGGC
CTCTGGTTCA AGATGAACTT CTTCGAAGGG TCGGCGCGCG TCCCGCTGAT GATTGCGGCA
CCCGGCTGGA AGCCCGGGCG CATCGATCAT CCGGTCTCCA CCCTCGACGT CACCCCGACG
CTCGCCGGTC TTGCCGGGAT CGACATCACC GCGTTGAAAC CCTGGACCGA GGGCGAGGAT
CTCGCGGGCC TCGCCGAGGG CACCGGCAGC CGCGGTCCCG TGCCGATGGA GTATGCCGCG
GAAGGTTCCG AAGCGCCGCT CGTCTGCCTC AGGGACGGGC GCTACAAGCT TTCCCTCTGC
GACAAGGACC CGCCGATGCT GTTCGACCTC GAAGCCGATC CGCAGGAGCT CGACAATCTG
GCAGGAAATC CGGCGCATGC CGGCACTCTG GCGAGGCTCG CCGACCAGGC CGGCCGGCGC
TGGAACCTGG CCGATTTCGA TGCAGCCGTG CGCGAAAGCC AGGCGCGCCG CTGGGTGGTC
TATGCGGCGC TGCGCAACGG CGCCTATTAT CCCTGGGACT ATCAGCCTCT GCAGAAGGCC
TCGGAACGCT ACATGCGCAA CCACATGGAC CTGAACGTGC TCGAGGAAAA CCAGAGGTTT
CCGCGCCAGG AATGA
 
Protein sequence
MARPNILILM VDQLNGTLFP DGPADFLHAP HLKSLAERSV RFANTYTASP LCAPARASFM 
SGQLPSRTRV YDNAAEFASD IPTYAHHLRA AGYQTALSGK MHFVGPDQLH GFEERLTTDI
YPADFGWTPD YSKPGERIDW WYHNLGSVTG AGVAEITNQM EYDDEVAYNA TRKLFDLSRA
HDGRPWCLTV SFTHPHDPYV ARRKFWDLYE DCPALDPAVA PIAFERQDPH SQRLMKACDH
EAFAISPEGI RRARRGYFAN ISYVDEKIGD ILGVLERTRM AENTIILFVS DHGDMLGDRG
LWFKMNFFEG SARVPLMIAA PGWKPGRIDH PVSTLDVTPT LAGLAGIDIT ALKPWTEGED
LAGLAEGTGS RGPVPMEYAA EGSEAPLVCL RDGRYKLSLC DKDPPMLFDL EADPQELDNL
AGNPAHAGTL ARLADQAGRR WNLADFDAAV RESQARRWVV YAALRNGAYY PWDYQPLQKA
SERYMRNHMD LNVLEENQRF PRQE
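
The gene and protein sequences above are printed in blocks of ten characters, so the whitespace has to be stripped before they can be processed. The following is a minimal sketch of how the record could be cross-checked: it removes the block spacing, computes GC content, and translates the nucleotide sequence. It assumes the standard amino acid assignments, which translation table 11 shares with table 1, and does not handle the alternative start codons that table 11 allows. All function and variable names are hypothetical.

```python
BASES = "TCAG"
# Amino acids in TCAG codon order (TTT, TTC, TTA, TTG, TCT, ...); '*' marks a stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def clean(seq: str) -> str:
    """Remove the block spacing and line breaks, and upper-case the sequence."""
    return "".join(seq.split()).upper()


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a cleaned nucleotide sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def translate(seq: str) -> str:
    """Translate a cleaned CDS codon by codon, stopping at the first stop codon."""
    residues = []
    for pos in range(0, len(seq) - len(seq) % 3, 3):
        aa = CODON_TABLE[seq[pos:pos + 3]]
        if aa == "*":
            break
        residues.append(aa)
    return "".join(residues)


# First 30 bases and last 15 bases of the gene sequence, as printed in the record.
first_block = clean("ATGGCGCGCC CGAATATTCT CATCCTGATG")
last_block = clean("CCGCGCCAGG AATGA")
print(translate(first_block))  # MARPNILILM
print(translate(last_block))   # PRQE
```

The two fragments translate to the first ten and last four residues of the listed protein sequence. Passing the full cleaned gene sequence to `gc_content` and `translate` gives values that can be compared with the GC content and Protein Length fields of the record.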