Gene Hlac_1066 details

Gene Information

Locus tag: Hlac_1066
Symbol:
ID: 7400138
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Halorubrum lacusprofundi ATCC 49239
Kingdom: Archaea
Replicon accession: NC_012029
Strand:
Start bp: 1064885
End bp: 1066240
Gene Length: 1356 bp
Protein Length: 451 aa
Translation table: 11
GC content: 53%
IMG OID: 643708133
Product: sulfatase
Protein accession: YP_002565732
Protein GI: 222479495
COG category: [P] Inorganic ion transport and metabolism
COG ID: [COG3119] Arylsulfatase A and related enzymes
TIGRFAM ID:
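
The start, end, gene length, and protein length fields above are related by simple arithmetic. The following is a minimal sketch of that consistency check, assuming the coordinates are inclusive and that the final codon of the CDS is a stop codon; the variable names are illustrative and not part of the record.

```python
# Values copied from the Gene Information fields.
start_bp = 1064885
end_bp = 1066240
gene_length_bp = 1356
protein_length_aa = 451

# Assumption: coordinates are inclusive, so the span is end - start + 1.
span_bp = end_bp - start_bp + 1

# A CDS is read in codons of three bases. Assumption: the last codon is
# a stop codon and contributes no amino acid to the protein.
codons = gene_length_bp // 3
expected_protein_aa = codons - 1

print(span_bp == gene_length_bp)                 # True
print(gene_length_bp % 3 == 0)                   # True
print(expected_protein_aa == protein_length_aa)  # True
```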


Plasmid Coverage information

Num covering plasmid clones: 21
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 24
Fosmid unclonability p-value: 0.884442
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGCGCGATA TCGTTTTAGT AACAGTCGAC TCGTTACGCG CCGATCACGT CGGTTGGCAC 
GGCTACGATC GGAATACAAC GCCAAATCTC GACCAGCGCG CGGCATCGGC CCAGACGTTC
ACGTCCGCCT TTGCCCACGC ATGTTCAACA CGACCTTCGT TTCCGTCTAT TATGACTTCG
TCGTACGCTC TTGAGTACGG AGGATTCGAA CGACTCTCCT CGAAACGAAC CACAATTGCC
GAACTTTTAG AAGAGGCTGA GTACGAGACT GCTGGCTTCC ACTCGAACCT CTATCTCTCT
GCTGATTTTG GCTACGATAG AGGATTCAGT CGGTTCTTTG ATTCGAAATC GGACCCAGGG
ACACTCGCTA AACTTCGACA GGAGGTCAAA ACACACCTTG ACTCCGATGG CCATCTCTAC
GGTTTTCTTC AGCAGGCGTT CAACGCAACG GAGAAACGAG CAGGTATTGA ACTCGGTTCT
GCCTACATCG ACGCTGAGGA AATCACCGAT CGTGCGCTCT CTTGGGCGTC TTCAACGAGT
AGCAATCCCC GCTTCCTTTG GGTACACTAC ATGGATGTCC ACCATCCGTA CGTCCCACCA
GCGGAGCATC AGCGGCGATT CCGCGATGAA CCTGTCACCG ACCGTGACGC CGTTCAACTT
CGGAGGAAGA TGTTGGAATC ACCGGAGGAG ATAACTGATC AGGAGTTCAA CACTCTCATT
GACCTCTATG ACTCCGAAAT CGCCTATGTC GACGCACAAG TGGATCGGCT GATCGAAACA
CTTCAGACGG AGTGGGAAAA TGATCCCGTA ATCGCGTTCA CTGCTGATCA CGGTGAGGAG
TTCCTTGACC ACGGTGGATT CAGTCACAGT GCTACCTTCC ACGACGAAGT AATTCATGTG
CCGCTGTTAG TTGGGACTGG AGAAGAAGAA ACAGGAGAAA GCGACAATCT CGTCGGCTTG
ATGGATCTGG CACCTACTCT CGCTGATAAA GCAGATGTCG ATCGACCAGG GACCTATCGG
GGTCAACCGC TGAGTCAAGT TGAGGACCAG TGGAACCGGT CAGAAGTCAT CGCCGAATGG
ACCGACACCG ACACAGATGA TCGTCGGTTT GCCGTTCGGA CCACGAACTG GAAGTATATC
CGCGAGGAAA ACGGAGATGA GCAACTTTAC GACCTTACCG CTGATCCGGA TGAGATGAAC
GATCTTGCTA CTGAGACTCT CGATGTATTA TCGGACCTCC GCGAAACGCT TGAGGACCAT
CTGGTGACGT TAGATGAAAG CCGCGAGGAC CTCGGTGACG TCGAGATGGA TGAGGAGGTG
CGCCAGCGAC TTCGCGACCT CGGATATCAG GAGTAG
 
Protein sequence
MRDIVLVTVD SLRADHVGWH GYDRNTTPNL DQRAASAQTF TSAFAHACST RPSFPSIMTS 
SYALEYGGFE RLSSKRTTIA ELLEEAEYET AGFHSNLYLS ADFGYDRGFS RFFDSKSDPG
TLAKLRQEVK THLDSDGHLY GFLQQAFNAT EKRAGIELGS AYIDAEEITD RALSWASSTS
SNPRFLWVHY MDVHHPYVPP AEHQRRFRDE PVTDRDAVQL RRKMLESPEE ITDQEFNTLI
DLYDSEIAYV DAQVDRLIET LQTEWENDPV IAFTADHGEE FLDHGGFSHS ATFHDEVIHV
PLLVGTGEEE TGESDNLVGL MDLAPTLADK ADVDRPGTYR GQPLSQVEDQ WNRSEVIAEW
TDTDTDDRRF AVRTTNWKYI REENGDEQLY DLTADPDEMN DLATETLDVL SDLRETLEDH
LVTLDESRED LGDVEMDEEV RQRLRDLGYQ E
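
The protein sequence can be spot-checked against the gene sequence by translating codon by codon. The sketch below is one way to do that in plain Python, not the method the database itself uses. It relies on the fact that translation table 11 assigns the same amino acids to codons as the standard genetic code (the two tables differ in permitted start codons). The names `CODON_TABLE` and `translate` are hypothetical helpers introduced for illustration, and only the first and last lines of the gene sequence are translated here.

```python
from itertools import product

# Codon table in the conventional TCAG ordering: the third base varies
# fastest, then the second, then the first.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 0, stopping at the first stop codon."""
    # Remove the spaces that separate the 10-base groups in the listing.
    dna = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 30 bases and the final line of the gene sequence. Every full
# line holds 60 bases, so the final line starts in frame.
first_block = "ATGCGCGATA TCGTTTTAGT AACAGTCGAC"
last_block = "CGCCAGCGAC TTCGCGACCT CGGATATCAG GAGTAG"

print(translate(first_block))  # MRDIVLVTVD
print(translate(last_block))   # RQRLRDLGYQE
```

The two outputs match the first ten and last eleven residues of the protein sequence. Passing the full gene sequence to `translate` would extend the same check to all 451 residues.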