Gene Hoch_3685 details

Gene Information

Locus tag: Hoch_3685
Symbol:
ID: 8546075
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Haliangium ochraceum DSM 14365
Kingdom: Bacteria
Replicon accession: NC_013440
Strand:
Start bp: 5068654
End bp: 5071683
Gene Length: 3030 bp
Protein Length: 1009 aa
Translation table: 11
GC content: 72%
IMG OID: 646388353
Product: sulfatase
Protein accession: YP_003268079
Protein GI: 262196870
COG category: [P] Inorganic ion transport and metabolism
COG ID: [COG3119] Arylsulfatase A and related enzymes
TIGRFAM ID:
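
The start and end coordinates, gene length, and protein length in this record are mutually consistent, which a short sketch can check. It assumes the coordinates are 1-based and inclusive and that the CDS ends with a single stop codon; the function names are illustrative and not part of any database API.

```python
def gene_length(start_bp: int, end_bp: int) -> int:
    """Length in bp of a feature, assuming 1-based inclusive coordinates."""
    return end_bp - start_bp + 1


def protein_length(cds_length_bp: int) -> int:
    """Number of residues encoded by a CDS, assuming one terminal stop codon."""
    if cds_length_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return cds_length_bp // 3 - 1


length = gene_length(5068654, 5071683)
print(length, protein_length(length))  # prints: 3030 1009
```

Both values match the Gene Length and Protein Length fields of the record.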


Plasmid Coverage information

Num covering plasmid clones:
Plasmid unclonability p-value: 0.0329595
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones:
Fosmid unclonability p-value: 0.00309505
Fosmid Hitchhiker: Yes
Fosmid clonability: hitchhiker
 

Sequence

Gene sequence
GTGGAATCCG CATCTCCAGC CCCTGCCCCG CGCCTCTCGG CGCTCCTCGC CGGCGACCTG 
GGCCGCGTCG CCCGCGTGCT CGTCCGCGCC GTCCTGGTGC TCGCCGGCTG CGAGCTCGCG
CTCGCGCTCA TCACCACCCC GGCGCCCGAG GTCAGCTTCG GCGCCGGCCT GCGCCTGGCG
CTGATCTCGC TGTCGCTGCT GCTGCTCTTG TGGCTGGCGC TGGTGCCGCT GGTGTTCGCC
GGCGGCGCCT GCGTCCGCCT GGCGCTGTTT TTGTACGATC GTCCGCGCTG CTATCGCTGG
CCCGGACCGC TGGCCGCGGG CTCGGCCGCC GCGCAGCCGC TGGCGCCGTG GCTGTGGGCC
GCGTTCGCGG CCGCGCTGCC CTACCTCGCG GCCTCGTCCT ACGCCACCTT CCGCGCCACC
ACCTACTTCA AAGAGCCGCA GCTCACCGGC CTGCTGCTGG CCATCGGCCA GCTCGTGCTG
TTCGCGCTCG TCGGCCTGCT GGCGGCCGCC TTGGCGCCGG CGCTGGGCCG TCTCGGACGC
GCCATCGACG CCCGCCTGGG CCGGCGCGAG GCGCCCTGGG CCCAGCGCCT GCGGCGGCTC
AATCCCTTCG CCCGCGTGGG CCCGGCCCTG GTCTGCTTGC TGGTGCTGTG CCTGCCCGCG
ATCCTGCTGG CCGTGGTGAT CATGCCGCAG CTCGCGCCCA ACGTGCCGTG GCGGCTGCTG
GTGTCGCTGC TGCTGCTGCT GGCCGCGTGC GTGGCCTTTG CCGAGCCCGC GCCGGGCGCC
GATGACGATA CCGATGTTGA TGCCGACGCC GATGAACGCG CCGGCACGTC GCGACGCGCG
CGGCGCCTGC GCACCCTGGC CCTGAGCGCG AGCGCCGCCG CGCTGGTGGC CGCCACCCTG
CTGTGGATCG GCGCCGACCA CGAGGCCCGC TCGGTGGCCA CCTCGGGCTC GCCGCTGTTG
TCCTCGCTCA GCGACGGCGT GCGCCGCGGC ACCGACTTCG ACGGCGACGG CTTCGGCTTC
CTGCTGGGCG AGAACGACTG CGCGCCGCTG TCGGCCAACA TCCACCCGAT GGCGCGCGAC
ATCCCCGGCA ACGGCATCGA CGAGGACTGC AACGGCCGCG ACTTCCAGTT CCGCCCGCGG
GCGCCTCGTC GCGCCGGGGC CCGCGCCGAG GTGCCGGCCG AGTTTCTGCG CGACTGGAAC
ATCCTGCTGC TCACGGTCGA CACCGTGCGC TACGACCACA CCGGCTTCGG CGGCTACATC
GAGACCAGCG GGCGCGACAC CACGCCCAAC CTCGACCAGC TCGTCGAGCG CTCGCTGTCC
TTCGACTTCA CCCAGGCGCC CTCGGCCGGC ACCATGGCCT CGATCCCGGC CATCCTCACC
TCGAAGTTCT TCCACTCGGG CATCGCGCTC GACGAGGACG TGCCGCCGCG CTCGCCGCCG
CGGCTCAAGC CCGAAAATCT GCTCATCAGC GAGATCGCCA AGGCCAGCGG CAAGGCCACG
GGCGCCATCC TCACCCACCC GTACTTCAAC GACTGGGGCA TGGAGCAGGG CGTCGACAGC
TACGACAACG AGATCGGCGC CGAGTACCAG CCGCGCCGCG TGACCTCGCA CGAGCTCACC
GACAAGGCCA TCACCTGGAT CGCCCAGCAC GAGGACCAGC CCTGGTTTCT GTGGCTGCAC
TATCTCGACC CGCACGGCTA CTACGTCCCC CACCCCGGCG AGCCCAGCTT CGGCGAGACC
GAAGAGGACC TCTACGACGG CGAGCTGCGC TACACCGACA AGCACCTGGG CCGGCTCTTC
AAAGCCCTGG GCAACATGCC CGAGGTCGCC GACCGCACCA TCATCATCCT CACCAGCGAC
CACGGCGACG GCTTCGGCGA GCACGGCTTC ATCAACCACG GCCAGGCCCT GTACCGCGAG
CTCCTGCACG TGCCGCTGCT CATCCACGTG CCCGGCGTGC CCGCGCGGCG CGTCGGCGGC
GCGACCACGC CGCTCGACAT CGTGCCCACG GTCGCCGACC TGCTCGGCTA CGAGTACGAC
CCGCTGCAGT TCGAGGGCGA GAGCCTGGTG CCCCAGATCT TCTACGGCGA GACCGACGAG
GCGCGCGTGG TGTTCGCCGA GACCGACTGG CCGCAGCCGC TGCGCGCTGC GATCTCGGCG
CAGTACAAGC TGATCTTCAA GATCAAGCGC AACCTCTACG AGCTCTACGA CCTGGACAAG
GACCCCTGGG AGAAGAAGAA CATCGCCGTG CGCGACAAGG ACGCGCTCGA GCAGATGAAG
GGCCACCTCG ACGAGTGGCT CGAGCGCGTC TACTACGCCC GCGACGCCAC CAGCAACCAG
CAGATCTACA AGCTGCGCGA CGTCCTCCTG AGCGAGGCGC CCACGCCCGC CCATCCGCTC
ACCGGCGCGA GCTTCGGCGA AGGCGGGCAG ATCGCCGTCC TCGGCTCCGG GTTCCGCGAC
GCCCCGCTGC AGCCGGGTGA GAAGCTCGAG GTCTCGGTGT ACTTCCACGT GCCCGGCGAG
CGCCCGAGCG AGGACTACGA GTTTCAGGTC GAGATGTGGC CGGCGGCCCA AGGCGACAAG
GTTGGGAGCG ACACGGAGGG CGATGACGCC GCCGACGCCG CCGAGGCAGG CAAGGACAAG
GCCGGCAAAT CCGCGAAGAG CCGGCGGGGG ACGCGCTCGC GGCGCCAGTC CACGGCCCGC
AGCGGCGTGC GCAAGGGCGG CGGCGGCGTG TTGCCGACCT CGCGCTGGCG CAGCGGCGAG
TATCTACAGC AGCGCTTCGA GCTGGTCGTG CCCCAGACCT GGGAGCAGGG CGGCGAGGCC
ATGCTCGGGT TGCGCGTGCG CGCCTCGGGC TCGGGCGGCT GGCTCCACTT CGACGAGGCG
CTGAGCCGCA GCGCCGACCC GGAGATGCTG GTGTTGGCGC GCGCGCCCCT GGCCGCCCAG
GACGCGGGAG ACGCGGATAA CGCGGACGAC GAGGCCGCCG CCGATGACGC GGCCGCCGAA
GCTGCCGGCG CCGCCGAAGC CGCCGAGTAG
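
The gene sequence is printed in space-separated blocks of 10 bases, so it has to be cleaned before any calculation. The following is a minimal sketch of how the GC content could be recomputed from the listing; the helper names are hypothetical, and only the first two blocks of the sequence are used as sample input.

```python
def clean_sequence(raw: str) -> str:
    """Remove the spaces and line breaks that group the sequence into blocks."""
    return "".join(raw.split()).upper()


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a cleaned nucleotide sequence."""
    if not seq:
        raise ValueError("empty sequence")
    gc_count = sum(1 for base in seq if base in "GC")
    return 100.0 * gc_count / len(seq)


# Sample input: the first two blocks of the gene sequence listing.
sample = clean_sequence("GTGGAATCCG CATCTCCAGC")
print(len(sample), gc_percent(sample))
```

Pasting the full listing into `clean_sequence` would allow the same check against the GC content field of the record. The sequence begins with GTG, which translation table 11 accepts as an alternative start codon read as methionine, consistent with the protein sequence starting with M.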
 
Protein sequence
MESASPAPAP RLSALLAGDL GRVARVLVRA VLVLAGCELA LALITTPAPE VSFGAGLRLA 
LISLSLLLLL WLALVPLVFA GGACVRLALF LYDRPRCYRW PGPLAAGSAA AQPLAPWLWA
AFAAALPYLA ASSYATFRAT TYFKEPQLTG LLLAIGQLVL FALVGLLAAA LAPALGRLGR
AIDARLGRRE APWAQRLRRL NPFARVGPAL VCLLVLCLPA ILLAVVIMPQ LAPNVPWRLL
VSLLLLLAAC VAFAEPAPGA DDDTDVDADA DERAGTSRRA RRLRTLALSA SAAALVAATL
LWIGADHEAR SVATSGSPLL SSLSDGVRRG TDFDGDGFGF LLGENDCAPL SANIHPMARD
IPGNGIDEDC NGRDFQFRPR APRRAGARAE VPAEFLRDWN ILLLTVDTVR YDHTGFGGYI
ETSGRDTTPN LDQLVERSLS FDFTQAPSAG TMASIPAILT SKFFHSGIAL DEDVPPRSPP
RLKPENLLIS EIAKASGKAT GAILTHPYFN DWGMEQGVDS YDNEIGAEYQ PRRVTSHELT
DKAITWIAQH EDQPWFLWLH YLDPHGYYVP HPGEPSFGET EEDLYDGELR YTDKHLGRLF
KALGNMPEVA DRTIIILTSD HGDGFGEHGF INHGQALYRE LLHVPLLIHV PGVPARRVGG
ATTPLDIVPT VADLLGYEYD PLQFEGESLV PQIFYGETDE ARVVFAETDW PQPLRAAISA
QYKLIFKIKR NLYELYDLDK DPWEKKNIAV RDKDALEQMK GHLDEWLERV YYARDATSNQ
QIYKLRDVLL SEAPTPAHPL TGASFGEGGQ IAVLGSGFRD APLQPGEKLE VSVYFHVPGE
RPSEDYEFQV EMWPAAQGDK VGSDTEGDDA ADAAEAGKDK AGKSAKSRRG TRSRRQSTAR
SGVRKGGGGV LPTSRWRSGE YLQQRFELVV PQTWEQGGEA MLGLRVRASG SGGWLHFDEA
LSRSADPEML VLARAPLAAQ DAGDADNADD EAAADDAAAE AAGAAEAAE