Gene Information |
Locus tag | Hoch_3685 |
Symbol | |
ID | 8546075 |
Type | CDS |
Is gene spliced | No |
Is pseudo gene | No |
Organism name | Haliangium ochraceum DSM 14365 |
Kingdom | Bacteria |
Replicon accession | NC_013440 |
Strand | - |
Start bp | 5068654 |
End bp | 5071683 |
Gene Length | 3030 bp |
Protein Length | 1009 aa |
Translation table | 11 |
GC content | 72% |
IMG OID | 646388353 |
Product | sulfatase |
Protein accession | YP_003268079 |
Protein GI | 262196870 |
COG category | [P] Inorganic ion transport and metabolism |
COG ID | [COG3119] Arylsulfatase A and related enzymes |
TIGRFAM ID | |
|
|
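The coordinate and length fields in the table above agree with each other, and a short check makes the arithmetic explicit. This is a minimal sketch, not part of the original record. It assumes the start and end positions are 1-based and inclusive, and that the final codon of the CDS is a stop codon that is not counted in the protein length. The function names `gene_length` and `protein_length` are hypothetical.

```python
# Values copied from the Gene Information table.
START_BP = 5068654
END_BP = 5071683


def gene_length(start_bp: int, end_bp: int) -> int:
    """Feature length in bp, assuming 1-based inclusive coordinates."""
    return end_bp - start_bp + 1


def protein_length(length_bp: int) -> int:
    """Residue count for a CDS whose last codon is assumed to be a stop codon."""
    if length_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return length_bp // 3 - 1


length_bp = gene_length(START_BP, END_BP)
print(length_bp, protein_length(length_bp))  # prints: 3030 1009
```

Under these assumptions the result matches the reported Gene Length (3030 bp) and Protein Length (1009 aa).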
Plasmid Coverage information |
Num covering plasmid clones | 3 |
Plasmid unclonability p-value | 0.0329595 |
Plasmid hitchhiking | No |
Plasmid clonability | normal |
| |
Fosmid Coverage information |
Num covering fosmid clones | 8 |
Fosmid unclonability p-value | 0.00309505 |
Fosmid Hitchhiker | Yes |
Fosmid clonability | hitchhiker |
| |
Sequence |
Gene sequence | GTGGAATCCG CATCTCCAGC CCCTGCCCCG CGCCTCTCGG CGCTCCTCGC CGGCGACCTG GGCCGCGTCG CCCGCGTGCT CGTCCGCGCC GTCCTGGTGC TCGCCGGCTG CGAGCTCGCG CTCGCGCTCA TCACCACCCC GGCGCCCGAG GTCAGCTTCG GCGCCGGCCT GCGCCTGGCG CTGATCTCGC TGTCGCTGCT GCTGCTCTTG TGGCTGGCGC TGGTGCCGCT GGTGTTCGCC GGCGGCGCCT GCGTCCGCCT GGCGCTGTTT TTGTACGATC GTCCGCGCTG CTATCGCTGG CCCGGACCGC TGGCCGCGGG CTCGGCCGCC GCGCAGCCGC TGGCGCCGTG GCTGTGGGCC GCGTTCGCGG CCGCGCTGCC CTACCTCGCG GCCTCGTCCT ACGCCACCTT CCGCGCCACC ACCTACTTCA AAGAGCCGCA GCTCACCGGC CTGCTGCTGG CCATCGGCCA GCTCGTGCTG TTCGCGCTCG TCGGCCTGCT GGCGGCCGCC TTGGCGCCGG CGCTGGGCCG TCTCGGACGC GCCATCGACG CCCGCCTGGG CCGGCGCGAG GCGCCCTGGG CCCAGCGCCT GCGGCGGCTC AATCCCTTCG CCCGCGTGGG CCCGGCCCTG GTCTGCTTGC TGGTGCTGTG CCTGCCCGCG ATCCTGCTGG CCGTGGTGAT CATGCCGCAG CTCGCGCCCA ACGTGCCGTG GCGGCTGCTG GTGTCGCTGC TGCTGCTGCT GGCCGCGTGC GTGGCCTTTG CCGAGCCCGC GCCGGGCGCC GATGACGATA CCGATGTTGA TGCCGACGCC GATGAACGCG CCGGCACGTC GCGACGCGCG CGGCGCCTGC GCACCCTGGC CCTGAGCGCG AGCGCCGCCG CGCTGGTGGC CGCCACCCTG CTGTGGATCG GCGCCGACCA CGAGGCCCGC TCGGTGGCCA CCTCGGGCTC GCCGCTGTTG TCCTCGCTCA GCGACGGCGT GCGCCGCGGC ACCGACTTCG ACGGCGACGG CTTCGGCTTC CTGCTGGGCG AGAACGACTG CGCGCCGCTG TCGGCCAACA TCCACCCGAT GGCGCGCGAC ATCCCCGGCA ACGGCATCGA CGAGGACTGC AACGGCCGCG ACTTCCAGTT CCGCCCGCGG GCGCCTCGTC GCGCCGGGGC CCGCGCCGAG GTGCCGGCCG AGTTTCTGCG CGACTGGAAC ATCCTGCTGC TCACGGTCGA CACCGTGCGC TACGACCACA CCGGCTTCGG CGGCTACATC GAGACCAGCG GGCGCGACAC CACGCCCAAC CTCGACCAGC TCGTCGAGCG CTCGCTGTCC TTCGACTTCA CCCAGGCGCC CTCGGCCGGC ACCATGGCCT CGATCCCGGC CATCCTCACC TCGAAGTTCT TCCACTCGGG CATCGCGCTC GACGAGGACG TGCCGCCGCG CTCGCCGCCG CGGCTCAAGC CCGAAAATCT GCTCATCAGC GAGATCGCCA AGGCCAGCGG CAAGGCCACG GGCGCCATCC TCACCCACCC GTACTTCAAC GACTGGGGCA TGGAGCAGGG CGTCGACAGC TACGACAACG AGATCGGCGC CGAGTACCAG CCGCGCCGCG TGACCTCGCA CGAGCTCACC GACAAGGCCA TCACCTGGAT CGCCCAGCAC GAGGACCAGC CCTGGTTTCT GTGGCTGCAC TATCTCGACC CGCACGGCTA CTACGTCCCC CACCCCGGCG AGCCCAGCTT CGGCGAGACC GAAGAGGACC TCTACGACGG CGAGCTGCGC TACACCGACA AGCACCTGGG CCGGCTCTTC 
AAAGCCCTGG GCAACATGCC CGAGGTCGCC GACCGCACCA TCATCATCCT CACCAGCGAC CACGGCGACG GCTTCGGCGA GCACGGCTTC ATCAACCACG GCCAGGCCCT GTACCGCGAG CTCCTGCACG TGCCGCTGCT CATCCACGTG CCCGGCGTGC CCGCGCGGCG CGTCGGCGGC GCGACCACGC CGCTCGACAT CGTGCCCACG GTCGCCGACC TGCTCGGCTA CGAGTACGAC CCGCTGCAGT TCGAGGGCGA GAGCCTGGTG CCCCAGATCT TCTACGGCGA GACCGACGAG GCGCGCGTGG TGTTCGCCGA GACCGACTGG CCGCAGCCGC TGCGCGCTGC GATCTCGGCG CAGTACAAGC TGATCTTCAA GATCAAGCGC AACCTCTACG AGCTCTACGA CCTGGACAAG GACCCCTGGG AGAAGAAGAA CATCGCCGTG CGCGACAAGG ACGCGCTCGA GCAGATGAAG GGCCACCTCG ACGAGTGGCT CGAGCGCGTC TACTACGCCC GCGACGCCAC CAGCAACCAG CAGATCTACA AGCTGCGCGA CGTCCTCCTG AGCGAGGCGC CCACGCCCGC CCATCCGCTC ACCGGCGCGA GCTTCGGCGA AGGCGGGCAG ATCGCCGTCC TCGGCTCCGG GTTCCGCGAC GCCCCGCTGC AGCCGGGTGA GAAGCTCGAG GTCTCGGTGT ACTTCCACGT GCCCGGCGAG CGCCCGAGCG AGGACTACGA GTTTCAGGTC GAGATGTGGC CGGCGGCCCA AGGCGACAAG GTTGGGAGCG ACACGGAGGG CGATGACGCC GCCGACGCCG CCGAGGCAGG CAAGGACAAG GCCGGCAAAT CCGCGAAGAG CCGGCGGGGG ACGCGCTCGC GGCGCCAGTC CACGGCCCGC AGCGGCGTGC GCAAGGGCGG CGGCGGCGTG TTGCCGACCT CGCGCTGGCG CAGCGGCGAG TATCTACAGC AGCGCTTCGA GCTGGTCGTG CCCCAGACCT GGGAGCAGGG CGGCGAGGCC ATGCTCGGGT TGCGCGTGCG CGCCTCGGGC TCGGGCGGCT GGCTCCACTT CGACGAGGCG CTGAGCCGCA GCGCCGACCC GGAGATGCTG GTGTTGGCGC GCGCGCCCCT GGCCGCCCAG GACGCGGGAG ACGCGGATAA CGCGGACGAC GAGGCCGCCG CCGATGACGC GGCCGCCGAA GCTGCCGGCG CCGCCGAAGC CGCCGAGTAG
|
Protein sequence | MESASPAPAP RLSALLAGDL GRVARVLVRA VLVLAGCELA LALITTPAPE VSFGAGLRLA LISLSLLLLL WLALVPLVFA GGACVRLALF LYDRPRCYRW PGPLAAGSAA AQPLAPWLWA AFAAALPYLA ASSYATFRAT TYFKEPQLTG LLLAIGQLVL FALVGLLAAA LAPALGRLGR AIDARLGRRE APWAQRLRRL NPFARVGPAL VCLLVLCLPA ILLAVVIMPQ LAPNVPWRLL VSLLLLLAAC VAFAEPAPGA DDDTDVDADA DERAGTSRRA RRLRTLALSA SAAALVAATL LWIGADHEAR SVATSGSPLL SSLSDGVRRG TDFDGDGFGF LLGENDCAPL SANIHPMARD IPGNGIDEDC NGRDFQFRPR APRRAGARAE VPAEFLRDWN ILLLTVDTVR YDHTGFGGYI ETSGRDTTPN LDQLVERSLS FDFTQAPSAG TMASIPAILT SKFFHSGIAL DEDVPPRSPP RLKPENLLIS EIAKASGKAT GAILTHPYFN DWGMEQGVDS YDNEIGAEYQ PRRVTSHELT DKAITWIAQH EDQPWFLWLH YLDPHGYYVP HPGEPSFGET EEDLYDGELR YTDKHLGRLF KALGNMPEVA DRTIIILTSD HGDGFGEHGF INHGQALYRE LLHVPLLIHV PGVPARRVGG ATTPLDIVPT VADLLGYEYD PLQFEGESLV PQIFYGETDE ARVVFAETDW PQPLRAAISA QYKLIFKIKR NLYELYDLDK DPWEKKNIAV RDKDALEQMK GHLDEWLERV YYARDATSNQ QIYKLRDVLL SEAPTPAHPL TGASFGEGGQ IAVLGSGFRD APLQPGEKLE VSVYFHVPGE RPSEDYEFQV EMWPAAQGDK VGSDTEGDDA ADAAEAGKDK AGKSAKSRRG TRSRRQSTAR SGVRKGGGGV LPTSRWRSGE YLQQRFELVV PQTWEQGGEA MLGLRVRASG SGGWLHFDEA LSRSADPEML VLARAPLAAQ DAGDADNADD EAAADDAAAE AAGAAEAAE
|
| |
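The sequences above are printed in space-separated blocks of ten characters, so they need to be cleaned before any computation such as the GC content reported in the table. The following is a minimal sketch of that cleanup, assuming plain Python with no external libraries. The helper names are hypothetical, and the excerpt uses only the first two and the last block of the gene sequence rather than the full 3030 bp.

```python
# Stop codons in translation table 11 (bacterial, archaeal and plant plastid code).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def clean_sequence(raw: str) -> str:
    """Remove the spaces and line breaks used to print the sequence in blocks."""
    return "".join(raw.split()).upper()


def gc_fraction(seq: str) -> float:
    """Fraction of bases in seq that are G or C."""
    if not seq:
        raise ValueError("empty sequence")
    return sum(1 for base in seq if base in "GC") / len(seq)


def ends_with_stop(seq: str) -> bool:
    """True if the last three bases form a stop codon."""
    return seq[-3:] in STOP_CODONS


# Excerpts copied from the Gene sequence field (not the full sequence).
first_blocks = clean_sequence("GTGGAATCCG CATCTCCAGC")
last_block = clean_sequence("CGCCGAGTAG")

print(len(first_blocks), gc_fraction(first_blocks))  # prints: 20 0.6
print(ends_with_stop(last_block))                    # prints: True
```

Passing the full Gene sequence field through `clean_sequence` and `gc_fraction` would give a value to compare against the reported 72% GC content. The gene begins with GTG, which translation table 11 permits as a start codon and which is read as methionine at initiation, consistent with the protein sequence starting with M.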