Gene Hlac_1597 details


Gene Information

Locus tag: Hlac_1597
Symbol:
ID: 7399546
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Halorubrum lacusprofundi ATCC 49239
Kingdom: Archaea
Replicon accession: NC_012029
Strand:
Start bp: 1614554
End bp: 1616122
Gene Length: 1569 bp
Protein Length: 522 aa
Translation table: 11
GC content: 68%
IMG OID: 643708664
Product: sulfatase
Protein accession: YP_002566253
Protein GI: 222480016
COG category: [P] Inorganic ion transport and metabolism
COG ID: [COG3119] Arylsulfatase A and related enzymes
TIGRFAM ID:
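
The coordinate and length fields above can be cross-checked with simple arithmetic. The following is a minimal sketch, assuming the start and end positions are 1-based and inclusive and that the final codon of the CDS is a stop codon; the function names are hypothetical and not part of the database.

```python
# Values copied from the record above.
start_bp = 1614554
end_bp = 1616122
gene_length_bp = 1569
protein_length_aa = 522


def span_length(start, end):
    """Length of a coordinate span, assuming 1-based inclusive positions."""
    return end - start + 1


def expected_protein_length(cds_length_bp):
    """Residues encoded by an unspliced CDS whose last codon is a stop."""
    if cds_length_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return cds_length_bp // 3 - 1


print(span_length(start_bp, end_bp))            # 1569
print(expected_protein_length(gene_length_bp))  # 522
```

Under those assumptions both values agree with the Gene Length and Protein Length fields.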


Plasmid Coverage information

Num covering plasmid clones: 16
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 12
Fosmid unclonability p-value: 0.00665248
Fosmid Hitchhiker: Yes
Fosmid clonability: hitchhiker
 

Sequence

Gene sequence
ATGGTATCGG ACACGTCGAC CTCGGACACA GCGGTATCGA ACCCAACGTC GTCGGATTCA 
ACCGATTCAG ATTCAACCGA TTCAGATTCA ACCGGTTCGG ATTCGACGGA CTCGACCTCA
ACATCGGACG GAGCGGTATC AAACGTCCTC CTCGTCACGA TCGATTCGCT CCGGGCGGAC
GCGATCGGTC CCTACGACAA CGATCGATAT TCCCCGGTGC TCTCGGATCT CGCCGCCGAC
GGAACCGTCT TCGATCGGTC GTTCGCGACC GGCAACTGGA CGCCCTTCTC GTTCCCCTCG
ATCCTCGCCT CCGAGCCCGT CTTCGCCCGA AACGGCGACA TCGGTGTGAC GGGCGCTCGC
ACGCTCGCGT CGGTGCTCTC CGAGGCCGGA ATCGCGACCG GCGGCTTCAA CGCCGCCAAC
GGCTTCCTCA CCTCTCACTG GGGGTATCCC GAGGGGTTCG ACGAGTTCGA GCCGTTCGTC
ACGAGCGTGG GATCGAGCCG GTACAGTCGG TACCTCGCGG CCCACCCGAC GGTCGAGGCG
TGGATCCAAC TCGCCACGTC GCCGTTCCGC CGTCTCGGCT CCAAGCTCCG GGGCGAGAGC
GACGATCGTC CCTTCCTCGA CGCCTCGCGG ATGTTCGACG TCGAGGACTC CGCGACCGAG
TTTGTCGACG ACACCGACGA GCCGTTCTTC CTGTGGGTCC ACTACATGGA CACCCACACC
CCGTACGTCC CCGCCCCGCG GTACATCCGC GAGGTCTCCG ACGGGCTGAT CGGCACCCAC
CGGATGCTCC ACGCCCACAC GCGCACGAGC CTCGGCTGGG AGGTCGGCGA GCGGACCCTC
GGCGACCTCC GAACCCTCTA CCAGGCCACG GTGCGACAGG TCGACGCCAG CGTCGGGCGC
CTGCTCGACA CGCTTGAGGC GGCCGGGATC GCCGACGAGA CCGCGATCGT CGTCGCCGGC
GACCACGGCG AGGAGTTCCA GGAACACGGC CACCTCGCGC ACTACCCGAA GCTGTACGAC
GAGCTGATCC ACGTGCCGCT CATCGTGAAC GTCCCCGGCG AGGACGGCGG TCGCCGCGTG
TCCGAACACG TCGGGCTCGA CGCGATTCCG CCGACCGTCG CCGACCTGCT CGACGTCGAA
TCGCCGCCGG AGTGGCGCGG CGAATCCCTC GAACCGGCGG TCAGTGGCGG CGAGTCGCCG
GATCAGGAGC CCGTCGTCTC GGTCACCGTT CGGGGAGAGG AGGTGACCGA ACAGCCGATC
CCGCGATCGC TTTCCGACGG CGACCTCCTC GTGAGCGTCC GCGACGCCGA GTGGACGTAC
ATCGAGAACG CGGACACGGC GGAGACGGAG CTGTACCACC GACCCTCGGA CCCGACTCAG
CAGGAGGATC TGTCGGCGGA CCCGTCCGAC GAGGCGCTCG CGGTCGTCGA GCGGTTCGCG
CCGATCGTCG CGGACCACGT CGCCGAACTT CGCGACAGAC AGACGGACGC GGAGGCGGCC
GACGACGGCG AGGACGAGGA GGTCGACGAG CACCTCGAGG CCCGCCTCGA AGCGCTCGGC
TATCGGTGA
 
Protein sequence
MVSDTSTSDT AVSNPTSSDS TDSDSTDSDS TGSDSTDSTS TSDGAVSNVL LVTIDSLRAD 
AIGPYDNDRY SPVLSDLAAD GTVFDRSFAT GNWTPFSFPS ILASEPVFAR NGDIGVTGAR
TLASVLSEAG IATGGFNAAN GFLTSHWGYP EGFDEFEPFV TSVGSSRYSR YLAAHPTVEA
WIQLATSPFR RLGSKLRGES DDRPFLDASR MFDVEDSATE FVDDTDEPFF LWVHYMDTHT
PYVPAPRYIR EVSDGLIGTH RMLHAHTRTS LGWEVGERTL GDLRTLYQAT VRQVDASVGR
LLDTLEAAGI ADETAIVVAG DHGEEFQEHG HLAHYPKLYD ELIHVPLIVN VPGEDGGRRV
SEHVGLDAIP PTVADLLDVE SPPEWRGESL EPAVSGGESP DQEPVVSVTV RGEEVTEQPI
PRSLSDGDLL VSVRDAEWTY IENADTAETE LYHRPSDPTQ QEDLSADPSD EALAVVERFA
PIVADHVAEL RDRQTDAEAA DDGEDEEVDE HLEARLEALG YR