Gene Information |
Locus tag | Haur_2512 |
Symbol | |
ID | 5734390 |
Type | CDS |
Is gene spliced | No |
Is pseudo gene | No |
Organism name | Herpetosiphon aurantiacus ATCC 23779 |
Kingdom | Bacteria |
Replicon accession | NC_009972 |
Strand | + |
Start bp | 3208455 |
End bp | 3210368 |
Gene Length | 1914 bp |
Protein Length | 637 aa |
Translation table | 11 |
GC content | 53% |
IMG OID | 641279652 |
Product | cellulose-binding family II protein |
Protein accession | YP_001545278 |
Protein GI | 159899031 |
COG category | [G] Carbohydrate transport and metabolism |
COG ID | [COG5297] Cellobiohydrolase A (1,4-beta-cellobiosidase A) |
TIGRFAM ID | |
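The coordinate and length fields above can be cross-checked with simple arithmetic. The sketch below assumes the start and end positions are 1-based and inclusive, and that the unspliced CDS ends with a single stop codon that is not counted in the protein length. The function names are illustrative and not part of the record.

```python
def gene_length_bp(start_bp: int, end_bp: int) -> int:
    """Length of a feature given 1-based, inclusive coordinates (assumed convention)."""
    return end_bp - start_bp + 1


def protein_length_aa(gene_len_bp: int) -> int:
    """Residue count for an unspliced CDS, assuming one terminal stop codon."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_len_bp // 3 - 1  # drop the stop codon


length = gene_length_bp(3208455, 3210368)
print(length)                     # 1914
print(protein_length_aa(length))  # 637
```

Under these assumptions both values agree with the Gene Length and Protein Length fields.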
|
|
Plasmid Coverage information |
Num covering plasmid clones | 11 |
Plasmid unclonability p-value | 1 |
Plasmid hitchhiking | No |
Plasmid clonability | normal |
| |
Fosmid Coverage information |
Num covering fosmid clones | n/a |
Fosmid unclonability p-value | n/a |
Fosmid Hitchhiker | n/a |
Fosmid clonability | n/a |
| |
Sequence |
Gene sequence | ATGAAAGTTT ATAAGCCAAT GCTCAAATTC ATTGCAGCTT TTTTGGTCTT TTTACCGCTC GTCACCAGTG CTGCCAGCAG CACCGAGCAC TTTAATCAAC CAATCAGTAG TTCCAGTATG TTTAGCTGGT ACACCAACTC CAGCACCGCC ACGGGTACAG CCAACGTCAG CGATAGCCAA GCCCAAGATG GCAAGATTCT GCGGCTTTCA ATTGCTGCTG GACAAACTGC TTACCCAGGT CAAGGCTCAA ACTTGGTTTC ACGCCAAATG TATCATTATG GCACGTATGA AGCCCGCATG AAAACCGCCA ACTGCGCCAG CGACGAAGGC GTAATCAATG GTTTCTTTAC CTATTTCAAC GATGGCAGCG ATAGTAACGC CAACGGCTTG CCCGATAACA GCGAAATTGA TTTTGAATGG CTCTGCGCCG AACCCCAATC AATCTTGATG ACGATCTGGA CCGATTACAA CGATCCAACT GCGATCAGTC GGCGGGTCTA TCGCAAGGTC AATTTGGCAA CTGGCACGAT CGAATATACC CGCTATGCCA CAACCTTCGG CGATAGCTAC ACCGATTTGT CGAATAGCCC AAGCGAAAAT CAACCAACGG CAGTACAGGC GATTTCAGGC TACAATTCAG CCACCAACTA CTATGAATAT GGCTTCAATT GGACTGCCAG CAATGTCACC TTGTGGGTGG TGAATCCTAG CAACGGCCAA AAAATTGTAC TCTGGGATTA TCGCGGGCCA AGCGCACGGA TTCCCAAAAA TCCAGCAGCC TTTATGGTCA ATTTGTGGCA TCACCCCAAT TGGACACCCG AATGCTGTGC GAACGCCACC AATCCACCAC GAGCCACTCG TTCGATGGAT GTTGATTGGC TGCGGTTCAC GCCACAAAGC GATATCCCCA GTGATACGAC CGCGCCAAGT GCTCCCAGCA ATTTGCAAGC ACCCAGCAAA ACCCACAACA GCGTGAATTT AAGTTGGAAC ACCGCCACCG ATAATGTTGG CGTAGTGGGC TACGATATTT ATCAGAATGG CGGGGCCAAT CCCGTTGCCA GCACCAGCAG CACCACCATC ACGATCAGCG GGCTGAATCC CAGCACTGCC TATAGTTTTG CGGTCAAAGC GCGTGATGCC GCTGACAATC GCTCGGCCAA CAGCAACAGC CTGAGCGTCA GCACCAACAA CCAGCCAACC ACAGGCAACG GTTTACAAGC AAGCTTCGTC AAAACCTCAG ATTGGGGCAA CGGCTACGTG GGAGTCTATC GGATTACCAA CAATGGCAGC AGCGCCGTCG ATGGCTGGAC GTTGGGCTTT GACTTACCGA GCAATGCCAC GATTTCAAGC TGGTGGGATG CGACTCAAGC CAAGAGCGGC AATCGCTACA CCGCTGGCAA TCTCGCATGG AACCGCCGAA TCGAGGTTGG CCAAACTCGC GAATTTGGCT TCAGCGGAAG CTACAGCGGC GCATGGGTCA ATCCTAGCAA CTGCACGATC AACGGCCAAG CCTGTAGTGG CGGCACAACC AGCACCAACG TGGCGCTTGG TCGGCCAGTT CAGGTTTCCT CGGTCGAAAC CAGCGCTTTG GGTGGAGCCA ATGCCACCGA TGGCAACAAC GCTACCCGTT GGGCCAGCAG CTACAGCGAT CCTCAATGGA TTCAGGTTGA CTTGGGCAGC AGCAAAAGCT TGACCAAAGT TGTGTTAAAT TGGGAAGCAG CCTATGCCCG AGCCTATCAA GTGCAAGTTT CTGATGATGC GAGTACCTGG CGCACCCTGA GCACGGTCAG CAACGGCGAT 
GGTGGCACTG ATACACTCAA TATCACTGGC AGCGGACGTT ATGTACGAAT TTACGCAACC CAACGCGGTA CCGAATGGGG TTACTCGTTG TGGGAAATCG CCGCCTATAA CTAG
|
Protein sequence | MKVYKPMLKF IAAFLVFLPL VTSAASSTEH FNQPISSSSM FSWYTNSSTA TGTANVSDSQ AQDGKILRLS IAAGQTAYPG QGSNLVSRQM YHYGTYEARM KTANCASDEG VINGFFTYFN DGSDSNANGL PDNSEIDFEW LCAEPQSILM TIWTDYNDPT AISRRVYRKV NLATGTIEYT RYATTFGDSY TDLSNSPSEN QPTAVQAISG YNSATNYYEY GFNWTASNVT LWVVNPSNGQ KIVLWDYRGP SARIPKNPAA FMVNLWHHPN WTPECCANAT NPPRATRSMD VDWLRFTPQS DIPSDTTAPS APSNLQAPSK THNSVNLSWN TATDNVGVVG YDIYQNGGAN PVASTSSTTI TISGLNPSTA YSFAVKARDA ADNRSANSNS LSVSTNNQPT TGNGLQASFV KTSDWGNGYV GVYRITNNGS SAVDGWTLGF DLPSNATISS WWDATQAKSG NRYTAGNLAW NRRIEVGQTR EFGFSGSYSG AWVNPSNCTI NGQACSGGTT STNVALGRPV QVSSVETSAL GGANATDGNN ATRWASSYSD PQWIQVDLGS SKSLTKVVLN WEAAYARAYQ VQVSDDASTW RTLSTVSNGD GGTDTLNITG SGRYVRIYAT QRGTEWGYSL WEIAAYN
|
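As a rough consistency check between the two sequences, the gene sequence can be translated codon by codon. This is a minimal sketch, not the tool used to produce the record: it relies on translation table 11 sharing its codon-to-amino-acid assignments with the standard code (the tables differ in permitted start codons, which does not matter here because the gene starts with ATG). The `translate` helper is hypothetical, and only short excerpts from the start and end of the gene sequence are used.

```python
from itertools import product

# Codon table in the conventional TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string, stopping at the first stop codon.

    Whitespace is removed first, so the 10-base blocks of the record
    can be pasted as they are.
    """
    seq = "".join(dna.split()).upper()
    residues = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":  # stop codon ends the protein
            break
        residues.append(aa)
    return "".join(residues)


# First 30 bases of the gene sequence.
print(translate("ATGAAAGTTT ATAAGCCAAT GCTCAAATTC"))  # MKVYKPMLKF
# Last 21 bases of the gene sequence, including the TAG stop codon.
print(translate("GAAATCG CCGCCTATAA CTAG"))           # EIAAYN
```

Both excerpts match the corresponding ends of the protein sequence. Passing the full gene sequence to the same function would allow the whole 637-residue translation to be compared.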