Gene Plav_2014 details


Gene Information

Locus tag: Plav_2014
Symbol:
ID: 5455871
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Parvibaculum lavamentivorans DS-1
Kingdom: Bacteria
Replicon accession: NC_009719
Strand:
Start bp: 2201083
End bp: 2202873
Gene Length: 1791 bp
Protein Length: 596 aa
Translation table: 11
GC content: 61%
IMG OID: 640877591
Product: sulfatase
Protein accession: YP_001413285
Protein GI: 154252461
COG category: [P] Inorganic ion transport and metabolism
COG ID: [COG3119] Arylsulfatase A and related enzymes
TIGRFAM ID:
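
The start, end, gene length and protein length listed above can be cross-checked with a short sketch. It assumes the coordinates are 1-based and inclusive and that the gene length includes the stop codon; the function names are illustrative and not part of any database API.

```python
def gene_length(start_bp, end_bp):
    """Length in bp of a feature with 1-based, inclusive coordinates (assumed)."""
    return end_bp - start_bp + 1


def protein_length(gene_length_bp):
    """Residue count for a CDS whose length includes one stop codon (assumed)."""
    if gene_length_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_length_bp // 3 - 1


length_bp = gene_length(2201083, 2202873)
print(length_bp)                  # 1791
print(protein_length(length_bp))  # 596
```

Under these assumptions both values agree with the Gene Length and Protein Length fields of the record.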


Plasmid Coverage information

Num covering plasmid clones: 19
Plasmid unclonability p-value: 0.124362
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 69
Fosmid unclonability p-value:
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGGGGATGA ACATCTTTAA AAGCTCGAAG CTGGCGGCGG TCATGCTGCC TTTCATCGCT 
CTCTCCGCGC AGGCGCAGGA AGCGCGGCCC AACATCGTGC TGATCCTTGC GGACGATGTG
GGCTTCAGCG ATTTCGGCGT TTACGGCAGC GAGATCGAGA CGCCGAATAT CGATGCACTG
GCGGCGAGGG GTACGCTGTT CTCGAACTTC CACGCCTCAC CCATCTGCGC GCCCTCGCGC
GCCATGCTGA TGACGGGTGT CGACAGCCAT CTCGCCGGTA TCGGCAATCT GCCGGAATCG
GCGCCGCTAG AACATCGCGG CCAGCCCGGC TATCTCGGCC GCCTGGCCGA TGATGTGGTG
ACGGTGGCGA GCCGGCTTAG CGCGGCCGGT TACCGCACTA TGATGACCGG CAAGTGGCAT
CTCGGCCATG GCGAAGGGGC GCATCCCGCA GCACGCGGCT TTGATCGCTC CTTCGCGCTC
GAAGCGTCAG GCGCCGACAA CTGGGAAAAG AAATCCTATC TGCCGATCTA TGACGACGCG
CCTTGGTACG AGGACGGCAA ACCTACCGAT CTGCCGGAAG ATTTTTACTC CTCCGAGTTT
CTCATCGACA AGATGATCGA ATATCTGGAG GAGGACAAAG ACGAAGGGCG TCCGTTCTTC
TCCTATGTGG CCTTTCAGGC GATCCACATT CCGCTGCAGG CACCGCGCGA ATTCGTCGAA
AAGTATGAAG GTGTCTATGA TGAAGGCTGG GACGCGCTGC GTGAAGCGCG CTTCCGTGCC
GTCGTCGAGC GCGGGCTGCT GCCGGGCGGC ATGGGACTCG GACCAATGCC CGAGGGACTT
CGCGACTGGA GTGCGCGGAG TGAGGACGAA AAACGAATGC TCGCAAAGAG CATGTCCGTG
AATGCGGCGA TGCTCGAGGC GATGGATTTT CATGTCGGGC GGCTGGTCGA ATATCTGAAG
GAGACCGGCC AGTACGAAAA CACGATCTTC ATTGTGCTTT CCGACAATGG ACCCGAGCCT
GGCGATCCGT TGGCAACGCC CGGCTTTCGG CAATGGCTGT GGTGGACCGG CTACAACCGC
GACATCGAAA CGCTGGGCGA GAAGGGCTCA TATGTCTTCA TAGGACCGGA GTTTGCCAGC
GCGGCGGCTT CGCCCGGCGC CTTCTTCAAG ATGCAGGCGG GCGAGGGCGG CTTGCGGGTG
CCGTTGATCT TCGCAGGCGA CGGCATCTCC CCTGCACGGA TGACGGATGC CTTCAGCTAC
ATCACGGATA TCGCGCCAAC CATTCTCGAA CTTGCGGGTA TAGGGGCGTC CCGGGATTTC
GAGGGCACTA CGGTGCAGGC AATGACCGGC AAATCGCTGG CGCCGTTGTT GCGCGGTGAG
GCTGCACGCG TCTATGGCGA AGAAGATGTC GTCGGTATCG AGGCGGCAGG TAATGCCGCG
ATTTTCCGAG GCAGTTACAA ACTGGTGCGC AACGCGCGGC CCTACGGCGA TATGCAGTGG
TATCTCTACA ATCTCGCGAG CGATCCGGGT GAGACGAATG ATATATCGCA GGACGAGCCC
GAACTTTTCG CCGAGATGAT GGCGGAGTAC GAGGCATACG CGGCGCGTGT CGGCGTGCGC
GAAGTGCCGG AGGGATACGA CCAGACGGCG CAGCTCACCA CCAACCGCAT CCACGATCAG
GTCCGGGAGA ACTGGCCGGG CCTCGCTGTC CTCGCGCTAT TGGTGCTTGG CGGGCTTGTC
TGGTCCGGTA TCCGCTTCGG CCCAAACCTC TGGCGAAAGC TCCGCGTCTA G
 
Protein sequence
MGMNIFKSSK LAAVMLPFIA LSAQAQEARP NIVLILADDV GFSDFGVYGS EIETPNIDAL 
AARGTLFSNF HASPICAPSR AMLMTGVDSH LAGIGNLPES APLEHRGQPG YLGRLADDVV
TVASRLSAAG YRTMMTGKWH LGHGEGAHPA ARGFDRSFAL EASGADNWEK KSYLPIYDDA
PWYEDGKPTD LPEDFYSSEF LIDKMIEYLE EDKDEGRPFF SYVAFQAIHI PLQAPREFVE
KYEGVYDEGW DALREARFRA VVERGLLPGG MGLGPMPEGL RDWSARSEDE KRMLAKSMSV
NAAMLEAMDF HVGRLVEYLK ETGQYENTIF IVLSDNGPEP GDPLATPGFR QWLWWTGYNR
DIETLGEKGS YVFIGPEFAS AAASPGAFFK MQAGEGGLRV PLIFAGDGIS PARMTDAFSY
ITDIAPTILE LAGIGASRDF EGTTVQAMTG KSLAPLLRGE AARVYGEEDV VGIEAAGNAA
IFRGSYKLVR NARPYGDMQW YLYNLASDPG ETNDISQDEP ELFAEMMAEY EAYAARVGVR
EVPEGYDQTA QLTTNRIHDQ VRENWPGLAV LALLVLGGLV WSGIRFGPNL WRKLRV
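
The relationship between the gene sequence and the protein sequence can be illustrated with a minimal translation sketch. It uses the codon assignments of NCBI translation table 11, which are the same as those of the standard code, and strips the 10-base spacing used in the listing. The helper names `clean` and `translate` are hypothetical, and only the first row of the gene sequence is translated here as an example.

```python
# Amino acids for NCBI translation table 11, listed in TCAG codon order.
# Table 11 shares its codon-to-amino-acid assignments with the standard code.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def clean(seq):
    """Remove the spaces and line breaks used to format the listing."""
    return "".join(seq.split()).upper()


def translate(dna):
    """Translate a coding sequence from its first base, stopping at a stop codon."""
    dna = clean(dna)
    residues = []
    for pos in range(0, len(dna) - len(dna) % 3, 3):
        amino_acid = CODON_TABLE[dna[pos:pos + 3]]
        if amino_acid == "*":  # stop codon: end of the protein
            break
        residues.append(amino_acid)
    return "".join(residues)


first_row = "ATGGGGATGA ACATCTTTAA AAGCTCGAAG CTGGCGGCGG TCATGCTGCC TTTCATCGCT"
print(translate(first_row))  # MGMNIFKSSKLAAVMLPFIA
```

The output matches the first 20 residues of the protein sequence above. Passing the full gene sequence to `translate` in the same way would allow the whole listing to be compared.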