Gene Information |
Locus tag | Haur_1930 |
Symbol | |
ID | 5733819 |
Type | CDS |
Is gene spliced | No |
Is pseudo gene | No |
Organism name | Herpetosiphon aurantiacus ATCC 23779 |
Kingdom | Bacteria |
Replicon accession | NC_009972 |
Strand | + |
Start bp | 2336727 |
End bp | 2339702 |
Gene Length | 2976 bp |
Protein Length | 991 aa |
Translation table | 11 |
GC content | 53% |
IMG OID | 641279074 |
Product | transcriptional activator domain-containing protein |
Protein accession | YP_001544701 |
Protein GI | 159898454 |
COG category | [R] General function prediction only; [T] Signal transduction mechanisms |
COG ID | [COG3629] DNA-binding transcriptional activator of the SARP family; [COG3903] Predicted ATPase |
TIGRFAM ID | |
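The coordinate and length fields in this record can be cross-checked against each other. The sketch below is illustrative only: it assumes the Start bp and End bp values are 1-based inclusive coordinates and that the CDS ends in a single stop codon that is not counted in the protein length. The function names `gene_length` and `expected_protein_length` are hypothetical helpers, not part of any database API.

```python
# Values copied from the Gene Information fields of this record.
START_BP = 2336727
END_BP = 2339702
GENE_LENGTH_BP = 2976
PROTEIN_LENGTH_AA = 991


def gene_length(start: int, end: int) -> int:
    """Feature length, assuming 1-based inclusive coordinates."""
    return end - start + 1


def expected_protein_length(gene_len: int) -> int:
    """Number of codons minus one terminal stop codon.

    Assumes an unspliced CDS whose length is a multiple of three.
    """
    if gene_len % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_len // 3 - 1


length = gene_length(START_BP, END_BP)
print(length == GENE_LENGTH_BP)                                # True
print(expected_protein_length(length) == PROTEIN_LENGTH_AA)    # True
```

Under these assumptions the listed gene length (2976 bp) and protein length (991 aa) agree with the coordinates.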
Plasmid Coverage information |
Num covering plasmid clones | 6 |
Plasmid unclonability p-value | 0.816243 |
Plasmid hitchhiking | No |
Plasmid clonability | normal |
Fosmid Coverage information |
Num covering fosmid clones | n/a |
Fosmid unclonability p-value | n/a |
Fosmid hitchhiking | n/a |
Fosmid clonability | n/a |
Sequence |
Gene sequence | ATGCAGTTTT GCTTTTTTGG CCTTGTGCGT GCCACATTCA ACGATCAACC GCTGCAATTT CGCTCGAATA AAGTTCGGGC ACTTTTGGCG ATTATCTGGC TTGATCCGCG TCGCCAGTGG GCACGAGATG AGCTGGCAAG CTTGCTTTGG GAAGATTATT CGAGCAGCAA AGCTCGCACT AATTTGCGCG TAACCTTTTC GCAACTACAA CAAAGCCTGC ATCCGTTGTA TCAGGCCTTG CCTGCGCGTG CGCCGTTGTT GGTCGCTGAT CGGGCACGGA TTGCGCTGCA AGCCGAAGCA TGGCCCGAAT TGCAGGTTGA TGTCTGGCGG TTTGATCAAG CAATTGCCGC CTACCACACT CATAACCATG AGCATGGAGT GATATGTGCT GCTTGTCTTG AACAATTAAC GAGCGCAGTT GAGTTGTATC AGGGCGATTT GTTGACTCCC TTTGCGATTG AGGCGAGTAG TGGCTTTGAT GCGTGGCTGG AACGCCAGCG CGAGGCACGA CATCAACAAC TGCTGCTGGC CTACGATGTT TTAGCCGAGC ATGCCCTGCG AATGCGCGAT TTTCATGAAG TTCAGCGGCT TTCAACCGCC CAATTGCTGC ATTTACCTTG GCATGAGCAG GCTCATCGGC GCTTGATGAT TAGTTTTGCC GAGTTGGGCC ATCAGAGTAT GCTGCGCGAA CAGTATTTGC TCTGCCAACG AACGCTTGAG CGCGAACTGG GGATTGGCCC TGATGCGGAG ACCCAAGCCC TTTACCAGCG TTTAGTCACA ACTTCGACCA CGCTGCCCCT GCCTAATCCA ACTCTAAAAA CCTTGCCTGA ACTAAGCAAA CCATTGATTG GCCGAACCCA TGAATTAGCG CTCTTGCACA GCTTGATTGA GCAAAAACAG CAACGTTTGG TGACCTTGCT CGGCTTGGGT GGCATCGGCA AAACAAGCTT AGCCTTGGCC TATGCTCATG CTGCCCAAGC GGCTTTCGAT GCTGTGTGGT TTGTGAGTTT TGCGGGTAGT GCTGGCGAAT CACTCGCATC CGACCACCAT CGGCTTAGCG CAACCATCGC GACGACACTG GGGTTGTCGC AACAACTGCA CACGCCCCAA GCAGCCTTGC TACATTACCT TGGGCAGCGC TCGGTCTTGT TGGTTTTGGA TAATCTTGAG CATTTGGTGC ACGAGGCGCT GCATGTGCAA GCAATTCTGG ATGCTTGCCC GCATGTGGTG GTGCTGGTAA CTTCACGCGA ACCACTCAAT ATTCAAGCTG AGCAGCGGGT GCAACTGCAT GGTTTGGCTT TGGCAAATGC TGATCAAGCG TTCGCAGCCA GTGCTCAATT ATTTCTGGCA CATGGAACCA ACGCTACCAG CCAAACCTTG GCCGATCCTG CCAGCATGGA ATGGATCGAT CGGATTTGTC GCATGCTTGA TGGTAATCCA TTGGCAATTG AATTGGCCGC CCGTTGGGTG CACTATCTTG GGTTGGATGA AATTGCCACG GCGATTGAGC AGGATATGGA TTTTCTACAA ACTAGCGTGC GTGATTTGCC CGACCGTCAT CGCAGTATGC GAGCGGTGTT TGATGGCTCG TGGCGCTTGC TTTCGCGTCA TGAACAGCGC GTGCTCAGTC AGGCTAGTTT GTTTCGGGGC AGTTGGTCGC TCACGGCGAT GCGCAGTATT TGCACGGTGT CGCGGCTGAC CATTCGCGAG CTGATCGATA AAAGTTGGCT AACTCAGCAT GCAGGCCGCG CGATGATCCA TCGTTTGATT CAACGCTATG CCCATGAACA ATTGCAACGC 
ATGCCGACTA CCGCTGCAAC GACTGCAAAA CGCCATAGTA TTTACTACTT GGCCATGCTT CGGCGGCATA CTCCCGCACT CAGTGGCTCG CAACCACAAG CGAGCGTTCG ACTGTTACGC GACGATCTTG ATAATATTCG TCAAGCTTGG CTTTGGGCGG TAGATCATGG CGCAACCCAT TTGGTGCATG CAAGTCTGGC TGGTTTATCG CAACTCTACG ATCTTTTAGG CTTGTATCAC GAGGCGATTC GCGTGCTGCA AACCAGCATC ACCCAGATTC AACGCCAACC AACCAGCCTA CAGCAACAAC GCCTGCTGAT GCGCTTGACG ATTGCGCTCG CCAGCCATTT CAATGCAGCG GCAGATTATC GGACTGCCGC ACAGGTTGGG CAACAGGCCT TGGATTTAGC TCAGACCCTT GATGTTCCGC AGCAGATAGC CGCCTGTTTG TTGCAAATTG GTATTGCCCA ACGCAATTAT GGTTTGTTTG CTGACGCCGA ACAAAGCCTT CAGCGCTCGA TCGCGATTGC CCAAGCGTTG CCGTTGAAGC AGGTCTATGC CCATGCGCTG CGTAGTTTGG GCTTTTTGGC CTATCTCCAA GGCAATTATC CCGTGGCGTT GCGCTACCAC GAGCAGGCGT TGGCGTTTTA TCGGTTGCTT CACGATCAAC GGAGTATTAA TTTGGCCCAA AATACCTTGG GCTTAATTGC GCTTGCTCAG GGCGATCTTC AGCAGGCGTG GGACCAATTG GCGGTAGTAT TGACGCGTTG TCAAACGCTT GAAGATGGTT GGGGCGAGGC TTTGACCCTG AATAATCTGG GAGCTGTGGT GCAGGCTCAG GGTGATCCAG CTCGCGCGAT CGATTATTAT CAAAAGGCAT TAAGCATTCG CCAACGGATT GGCGATCGTT GGGGCGAGGG CATTAGTCTG AGCAATCTGG CGGCGGCATG GCACGAGCAG GCAGATTACC AAACTGCCTA TCAAAGCGCA ATGCAAGCCA TTCAGCATAC CACTGCAATT CATGATTTGC CGACCAAAGC CTATGCCTTG ACCACCCTTA GCCAGATTCT GCGGGTACTG GGCGATCAGG CTGGGTCAAT CGCCGCCCGA TCAGAGGCCA CAACATTGCG CACCCAGCTT GGACAAGATC ATTTAATTAG TCAAACCATT GATTAA
|
Protein sequence | MQFCFFGLVR ATFNDQPLQF RSNKVRALLA IIWLDPRRQW ARDELASLLW EDYSSSKART NLRVTFSQLQ QSLHPLYQAL PARAPLLVAD RARIALQAEA WPELQVDVWR FDQAIAAYHT HNHEHGVICA ACLEQLTSAV ELYQGDLLTP FAIEASSGFD AWLERQREAR HQQLLLAYDV LAEHALRMRD FHEVQRLSTA QLLHLPWHEQ AHRRLMISFA ELGHQSMLRE QYLLCQRTLE RELGIGPDAE TQALYQRLVT TSTTLPLPNP TLKTLPELSK PLIGRTHELA LLHSLIEQKQ QRLVTLLGLG GIGKTSLALA YAHAAQAAFD AVWFVSFAGS AGESLASDHH RLSATIATTL GLSQQLHTPQ AALLHYLGQR SVLLVLDNLE HLVHEALHVQ AILDACPHVV VLVTSREPLN IQAEQRVQLH GLALANADQA FAASAQLFLA HGTNATSQTL ADPASMEWID RICRMLDGNP LAIELAARWV HYLGLDEIAT AIEQDMDFLQ TSVRDLPDRH RSMRAVFDGS WRLLSRHEQR VLSQASLFRG SWSLTAMRSI CTVSRLTIRE LIDKSWLTQH AGRAMIHRLI QRYAHEQLQR MPTTAATTAK RHSIYYLAML RRHTPALSGS QPQASVRLLR DDLDNIRQAW LWAVDHGATH LVHASLAGLS QLYDLLGLYH EAIRVLQTSI TQIQRQPTSL QQQRLLMRLT IALASHFNAA ADYRTAAQVG QQALDLAQTL DVPQQIAACL LQIGIAQRNY GLFADAEQSL QRSIAIAQAL PLKQVYAHAL RSLGFLAYLQ GNYPVALRYH EQALAFYRLL HDQRSINLAQ NTLGLIALAQ GDLQQAWDQL AVVLTRCQTL EDGWGEALTL NNLGAVVQAQ GDPARAIDYY QKALSIRQRI GDRWGEGISL SNLAAAWHEQ ADYQTAYQSA MQAIQHTTAI HDLPTKAYAL TTLSQILRVL GDQAGSIAAR SEATTLRTQL GQDHLISQTI D