Gene Rleg2_6149 details

Gene Information

Locus tag: Rleg2_6149
Symbol:
ID: 6983222
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Rhizobium leguminosarum bv. trifolii WSM2304
Kingdom: Bacteria
Replicon accession: NC_011370
Strand:
Start bp: 84394
End bp: 85905
Gene Length: 1512 bp
Protein Length: 503 aa
Translation table: 11
GC content: 57%
IMG OID: 643399167
Product: choline-sulfatase
Protein accession: YP_002283923
Protein GI: 209552007
COG category: [P] Inorganic ion transport and metabolism
COG ID: [COG3119] Arylsulfatase A and related enzymes
TIGRFAM ID: [TIGR03417] choline-sulfatase
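
The coordinate and length fields above can be cross-checked with a little arithmetic. The sketch below is illustrative only: it assumes Start bp and End bp are 1-based, inclusive positions on the replicon, and that the CDS ends in a stop codon that is not counted in the protein length. The function names are hypothetical and not part of any database tool.

```python
# Values copied from the Gene Information fields of this record.
START_BP = 84394
END_BP = 85905
REPORTED_GENE_LENGTH_BP = 1512
REPORTED_PROTEIN_LENGTH_AA = 503


def span_length(start: int, end: int) -> int:
    """Length of a coordinate span, assuming 1-based inclusive positions."""
    return end - start + 1


def protein_length_from_cds(cds_length_bp: int) -> int:
    """Residues encoded by a CDS, assuming its final codon is a stop codon."""
    if cds_length_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return cds_length_bp // 3 - 1


gene_length = span_length(START_BP, END_BP)
protein_length = protein_length_from_cds(gene_length)

print(gene_length, protein_length)  # 1512 503
```

Under those assumptions the computed values agree with the reported Gene Length and Protein Length.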


Plasmid Coverage information

Num covering plasmid clones: 23
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 29
Fosmid unclonability p-value: 0.989807
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGACAGTGC CCGAAAACCC CAACATCCTG TTCATCCAGG TCGATCAGCT CACCGCCGCG 
TCGCTAAGCG CCTACGGGGA CACGGTGTGC CGTGCCCCGA ACCTTGAACG CATCGCCGAT
ACGGGCGTGG TCTTCGAAAC CGCCTACTGC AATTTCCCCC TGTGCGCGCC ATCGCGCTTC
TCAATGGCGG CTGGGCAGCT CTGCTCGACG ATCGGTGCCT ATGACAACGC GGCCGAGATG
CCCGCATCGA TACCGACCTA TGCCCACTAC CTGCGCGCGG CCGGCTATCA GACGGCGCTT
TCCGGGAAGA TGCACTTCAT CGGGCCTGAC CAGTTTCATG GTTTCGAGAA GCGCCTAACA
CCGGACCTTT ATCCCGCCGA CTTCAGCTGG GTTCCCAATT GGGGCAATGA AGGCAAGCGC
GACACCAACG ACACGCGCGC TGTACTCATC TCGGGAATTT GCGAGCGCAG CGTCCAGATC
GATTTCGATG AGAACGTGAC GTTTCAGGCG ATCCAGCATC TCTACAACAT TGCGCGATCT
GACGACAAAC GCCCGTTCTT CCTGCAGGTA TCCTATACCC ATCCGCACGA GCCCTATCTT
TGCCGGAAAG AATTCTGGGA CCTTTATGAA GGTGTCGACG TACCGATGCC TGCGGTCGAC
GCCTTGTCCG AACAGGAGCA TGACCCACAT TCGGTCCGGC TCCTCAAAGA CTTCGCCATG
CTCGACGTCC GGTTCGCAGA TGGAGATATC CAACGGGCGC GGAGGGCATA TTACGGCTCG
ATAAGCTATA TCGACAGCAT GATCGGACAG ATTCTCGATA CACTCGAGGC TATCGGGGCG
AGGGAAAACA CCGCCATCGT CTTTGCATCC GATCATGGCG AGATGCTTGG CGAACGAGGC
ATGTGGTTCA AAAAGCATTT CTTCGAGGCA GCACTTCGCG TTCCCTTGCT GCTGAACGCA
CCGTGGATCA AGCCTCAGCG TGTCTCGGAA ACTGTTTCAC TCGTGGACTT GCTGCCCACC
TTAATGGGCT TGGCGACTGG ACGTGTGTGG CGTTCGGAGA CAGAAGAACT CGAGGGCCAG
GATTTGACCG GCTTCCTTGA CAGGGAAGAT CATGAGCCGA GCCGAGCGGT GTTCGCGGAA
TATCTGGCCG AGGCGACCCC GGTGCCGATC TTCATGGTCA GAAAGGGACG ATACAAACTG
ATCTCTTCGT CGCATGATGG AAACCTCCTC TTCGACTTGA TGGCCGATCC AAAGGAACTT
CAAAATCTCG CGGGGCACAC AGATTACGCG GAGATCGAAG CCAGGCTGCT CAAGATCGTG
GCCGACAAGT GGGACGAGGG CAAACTGACG GAAGATATCC TGCTCAGCCA GGCGCGCCGG
CTTTTTGTTC GCGAGGCGGC GAAACTGGGC ACGCCGACTA GATGGAACCA TGATGAACAG
CCAGGCCAAG AAGTGCTCTG GTACCGAGGG CAGGGAAGCT ACAACGAGTG GGCGTTCAAA
TATCTTCCAT GA
 
Protein sequence
MTVPENPNIL FIQVDQLTAA SLSAYGDTVC RAPNLERIAD TGVVFETAYC NFPLCAPSRF 
SMAAGQLCST IGAYDNAAEM PASIPTYAHY LRAAGYQTAL SGKMHFIGPD QFHGFEKRLT
PDLYPADFSW VPNWGNEGKR DTNDTRAVLI SGICERSVQI DFDENVTFQA IQHLYNIARS
DDKRPFFLQV SYTHPHEPYL CRKEFWDLYE GVDVPMPAVD ALSEQEHDPH SVRLLKDFAM
LDVRFADGDI QRARRAYYGS ISYIDSMIGQ ILDTLEAIGA RENTAIVFAS DHGEMLGERG
MWFKKHFFEA ALRVPLLLNA PWIKPQRVSE TVSLVDLLPT LMGLATGRVW RSETEELEGQ
DLTGFLDRED HEPSRAVFAE YLAEATPVPI FMVRKGRYKL ISSSHDGNLL FDLMADPKEL
QNLAGHTDYA EIEARLLKIV ADKWDEGKLT EDILLSQARR LFVREAAKLG TPTRWNHDEQ
PGQEVLWYRG QGSYNEWAFK YLP
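
To relate the gene sequence to the protein sequence, the following minimal sketch strips the 10-base block spacing, computes a GC fraction, and translates codons up to the first stop. It relies on translation table 11 using the same codon-to-amino-acid assignments as the standard code (the tables differ in which start codons are permitted, which does not matter here because the gene starts with ATG). All function names are hypothetical, and only the first line of the gene sequence is used as input.

```python
# Codon table built from the conventional TCAG ordering of the genetic code.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def clean(seq_text: str) -> str:
    """Remove spaces and line breaks from a block-formatted sequence."""
    return "".join(seq_text.split()).upper()


def gc_fraction(dna: str) -> float:
    """Fraction of bases that are G or C."""
    return sum(base in "GC" for base in dna) / len(dna)


def translate(dna: str) -> str:
    """Translate complete codons from position 0, stopping at the first stop codon."""
    protein = []
    for pos in range(0, len(dna) - len(dna) % 3, 3):
        amino_acid = CODON_TABLE[dna[pos:pos + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# First line of the gene sequence shown above (60 bases).
first_line = clean(
    "ATGACAGTGC CCGAAAACCC CAACATCCTG TTCATCCAGG TCGATCAGCT CACCGCCGCG"
)
print(translate(first_line))  # MTVPENPNILFIQVDQLTAA
```

The 20 residues printed match the start of the protein sequence (MTVPENPNIL FIQVDQLTAA). Passing the full gene sequence through `clean`, `gc_fraction`, and `translate` would allow the same comparison against the reported GC content and the complete protein.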