Gene Msil_3501 details

Gene Information

Locus tag: Msil_3501
Symbol:
ID: 7092525
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Methylocella silvestris BL2
Kingdom: Bacteria
Replicon accession: NC_011666
Strand:
Start bp: 3846403
End bp: 3848199
Gene Length: 1797 bp
Protein Length: 598 aa
Translation table: 11
GC content: 60%
IMG OID: 643466792
Product: sulfatase
Protein accession: YP_002363752
Protein GI: 217979605
COG category: [P] Inorganic ion transport and metabolism
COG ID: [COG3119] Arylsulfatase A and related enzymes
TIGRFAM ID:
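
The coordinate and length fields above can be cross-checked with simple arithmetic. The following is a minimal Python sketch, assuming the start and end positions are 1-based and inclusive (which is consistent with the listed gene length) and that the CDS ends in a single stop codon. The function names `gene_length` and `protein_length` are hypothetical helpers introduced here for illustration; they are not part of any database API.

```python
def gene_length(start_bp: int, end_bp: int) -> int:
    """Feature length in bp, assuming 1-based inclusive coordinates."""
    return end_bp - start_bp + 1


def protein_length(gene_len_bp: int) -> int:
    """Residue count for a CDS whose final codon is a stop codon (assumption)."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    # One codon per residue, minus the terminal stop codon.
    return gene_len_bp // 3 - 1


length = gene_length(3846403, 3848199)
print(length)                  # 1797
print(protein_length(length))  # 598
```

Both values agree with the Gene Length and Protein Length fields of this record.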


Plasmid Coverage information

Num covering plasmid clones: n/a
Plasmid unclonability p-value: n/a
Plasmid hitchhiking: n/a
Plasmid clonability: n/a
 

Fosmid Coverage information

Num covering fosmid clones: 26
Fosmid unclonability p-value: 0.0049893
Fosmid Hitchhiker: Yes
Fosmid clonability: hitchhiker
 

Sequence

Gene sequence
GTGACAGAGA AGCCCGACCG CGGCCGCAAT CTGTCGCGGC GCGCTGTCGT CGCTTCCGGG 
GCCGGCCTTC TCGGCGCCAA TCTTCTGCCC GGCGCCGCCA TGGCTGACGC GTCGACGGGC
GAGGCGGCTG GCTCCTCTCC CATCGCGCCG GATTCGCCCC CGGGCGGCTA TAATATCCTG
TTCATATTGG TCGATCAGGA GCATTTCTTC GAGGATTGGC CGATGCCCGT CCCCGGCCGC
GAGTGGATCA AGACGAACGG GGTCACCTTC GTCAATCATC AGGCGGCGTC CTGCGTCTGC
TCGCCGGCGA GGTCGACGAT CTATACCGGG CTGCACATTC AGCACACGGG CATTTTCGAC
AACGCAAATT CCCTGTGGCA GGCCGACATG TCGATGGCGG TGAAGACCAT CGGCCACCGC
ATGACCGAAC TTGGATATTA CGCTGCTTAT CAGGGCAAAT GGCATCTCAG CGTCAACCTC
GATCAGGCCA AGCACGCAAT CGATGCGCCC TTCAGCAAAT ACAGGCAGAT CATCGAGAGC
TACGGTTTTA AGGATTTCTT TGGCGTCGGA GATATCAACG ACACGACGCT CGGCGGCTAT
AATTTCGACG ACACCACCTC GGCTTTCGTC ACGCGCTGGC TGCGCACCAA GGGCGAAGAG
CTGAGGGAGG CGGGCAAGCC CTGGTATCTT GCGGTCAATT TCGTCAATCC GCATGACGTC
ATGTATGTCA ATTCGGATCT CCCCGGCGAA ACCGTTCAGG GCAAGGACAC GGCCATGGCG
ATCGCCCGGC CGCCCGCCAG CGCAATCTAT CAGGCCGAGT GGGATACGCC CTTGCCAGCC
ACGAGAAGCC AGGCGTTTGA TGCGCCGGGG CGCCCGAGCG CGCAGAAAAT CTATCAGGAC
GTCCAGGACG TTCTGGTCGG CGCATGGCCG GACGAAGACC GCCGCTGGAG GCTGCTGCGA
AACTATTATT ACAACTGCAT CCGCGACTGC GACCAGCAAG TAGTCCGCGT GCTGGATTCG
CTCAAAGCCA ATGGCATGGA CAAGAACACA ATCATCGTGT TCACCGCGGA TCATGGCGAA
CTCGGCGGCA ATCATCAAAT GCGCGGCAAG GGCAATTCCG CCTACAGGCA ACAGAACCAT
TTGCCATTGA TGATCGTCCA CCCCGCTTAT CCGGGCGGAC GAATCTGCAA GGCGGTGACG
TCGCAGATCG ATTTGACGCC GACGCTGATG GCTTTGACCG GCGCCGGCGC GCCGAGCCTA
AAGGCAGCCG GGGCGGATCT GGTCGGCCGC GATTTCTCGA GGTTGTTGGC TGCTCCGGAG
AAGGCGAGTT TTGACTCCCT GCGGCCGGGT TCATTGTACA ATTACAACAT GCTGTCGTTT
CAGGATGCGA AATGGGCCAA GAGGATGGAC GAGTTTTTGA AGCACTCGGA CATGCCGCTC
GCACAGAAAA TCGCGATTCT GCTGAAAGAG GAGCCGGATT TCCACAATCG CTGCGCGATC
CGCAGCGTCT TCGACGGGCG CTACCGCTTC AGCCGCTATT TCTCGCCATT GGCGTTCAAC
ACGCCGGCGA GTTTTGAGGA GCTTTTGGCC CAGAACGACC TCGAACTCTA CGATCTTCAG
GAGGACGAAG ACGAGGTTAC TAATCTGGCG GCGAAGCCGA AGGCCAATGC GGAGTTGATC
ATGGCGATGA ATGAAAAGCT CAACGCCCGA ATCGCGCAAG AAGTGGGCGC GGACGACGGC
GCCTTCTTGC CCTTGCGGGA CAGCAAGTGG CGCTTTCCGA GCGCCAGCGA GCGGTAG
 
Protein sequence
MTEKPDRGRN LSRRAVVASG AGLLGANLLP GAAMADASTG EAAGSSPIAP DSPPGGYNIL 
FILVDQEHFF EDWPMPVPGR EWIKTNGVTF VNHQAASCVC SPARSTIYTG LHIQHTGIFD
NANSLWQADM SMAVKTIGHR MTELGYYAAY QGKWHLSVNL DQAKHAIDAP FSKYRQIIES
YGFKDFFGVG DINDTTLGGY NFDDTTSAFV TRWLRTKGEE LREAGKPWYL AVNFVNPHDV
MYVNSDLPGE TVQGKDTAMA IARPPASAIY QAEWDTPLPA TRSQAFDAPG RPSAQKIYQD
VQDVLVGAWP DEDRRWRLLR NYYYNCIRDC DQQVVRVLDS LKANGMDKNT IIVFTADHGE
LGGNHQMRGK GNSAYRQQNH LPLMIVHPAY PGGRICKAVT SQIDLTPTLM ALTGAGAPSL
KAAGADLVGR DFSRLLAAPE KASFDSLRPG SLYNYNMLSF QDAKWAKRMD EFLKHSDMPL
AQKIAILLKE EPDFHNRCAI RSVFDGRYRF SRYFSPLAFN TPASFEELLA QNDLELYDLQ
EDEDEVTNLA AKPKANAELI MAMNEKLNAR IAQEVGADDG AFLPLRDSKW RFPSASER
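
The sequences above are printed in space-separated blocks of ten characters, so they need to be joined before any analysis. The following is a minimal Python sketch of that cleanup plus a GC-fraction calculation of the kind summarized in the GC content field. The helper names `clean_sequence` and `gc_content` are hypothetical, and the demo uses only the first line of the gene sequence, so its GC value is not expected to match the whole-gene figure.

```python
def clean_sequence(block_text: str) -> str:
    """Join a block-formatted sequence by removing all whitespace."""
    return "".join(block_text.split()).upper()


def gc_content(seq: str) -> float:
    """Fraction of bases in seq that are G or C."""
    if not seq:
        raise ValueError("empty sequence")
    gc_count = sum(1 for base in seq if base in "GC")
    return gc_count / len(seq)


# First line of the gene sequence, copied from this record.
first_line = "GTGACAGAGA AGCCCGACCG CGGCCGCAAT CTGTCGCGGC GCGCTGTCGT CGCTTCCGGG"
seq = clean_sequence(first_line)
print(len(seq))  # 60
print(f"GC fraction of first line: {gc_content(seq):.2f}")
```

Passing the full gene sequence block to `clean_sequence` should give a string whose length equals the Gene Length field, which is a quick check that nothing was lost when copying.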