Gene Information |
Locus tag | Haur_2138 |
Symbol | |
ID | 5734040 |
Type | CDS |
Is gene spliced | No |
Is pseudo gene | No |
Organism name | Herpetosiphon aurantiacus ATCC 23779 |
Kingdom | Bacteria |
Replicon accession | NC_009972 |
Strand | - |
Start bp | 2690675 |
End bp | 2693503 |
Gene Length | 2829 bp |
Protein Length | 942 aa |
Translation table | 11 |
GC content | 53% |
IMG OID | 641279279 |
Product | cellulose-binding family II protein |
Protein accession | YP_001544906 |
Protein GI | 159898659 |
COG category | [R] General function prediction only |
COG ID | [COG3889] Predicted solute binding protein |
TIGRFAM ID | |
| |
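The coordinate fields above are mutually consistent, and the arithmetic can be checked with a short sketch. It assumes the start and end positions are 1-based and inclusive, and that the gene length includes the stop codon; the record does not state either convention, but both are implied by the reported values. The function names `gene_length` and `protein_length` are hypothetical helpers, not part of any database API.

```python
def gene_length(start_bp: int, end_bp: int) -> int:
    """Length of a feature given 1-based, inclusive coordinates (assumed convention)."""
    return end_bp - start_bp + 1


def protein_length(gene_len_bp: int, includes_stop: bool = True) -> int:
    """Number of amino acids encoded by an unspliced CDS of the given length.

    Assumes the length is a whole number of codons and, by default,
    that the final codon is a stop codon that is not translated.
    """
    codons, remainder = divmod(gene_len_bp, 3)
    if remainder:
        raise ValueError("CDS length is not a multiple of 3")
    return codons - 1 if includes_stop else codons


# Values taken from the record for Haur_2138.
length = gene_length(2690675, 2693503)
print(length)                   # 2829
print(protein_length(length))   # 942
```

Both results match the Gene Length (2829 bp) and Protein Length (942 aa) fields of the record.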
Plasmid Coverage information |
Num covering plasmid clones | 5 |
Plasmid unclonability p-value | 0.368421 |
Plasmid hitchhiking | No |
Plasmid clonability | normal |
| |
Fosmid Coverage information |
Num covering fosmid clones | n/a |
Fosmid unclonability p-value | n/a |
Fosmid Hitchhiker | n/a |
Fosmid clonability | n/a |
| |
Sequence |
Gene sequence | ATGCGAATGC TAAAGATGGC CGCAGCAGTA GTGGTGCTGT TTGCAGGCGT AATTCAAGCG ATGGTGGCCG TGCCGCAATC GCGGGCGGCA AGTACTCAGC CATACAATTG GTCGAATGTC GAGATTGTTG GCGGTGGTTT TGTCCCAGGC ATTATTTACA ATCCAACTGA ACGCGATTTG GTGTATGCCC GCACCGATAT TGGTGGCGCA TACCGCTGGA ACCCCACAAC CAAGCGCTGG ATTCCCCTAA CCGACTGGAT TGGCGCAACC GATTGGAACT GGACAGGGAT CGATAGCCTC GCGACCGACC CAGTTGATCC TAACCGCCTC TATTTGGCCG CTGGCACTTA TACCAACGAT TGGACTTCGG CCAACGGCGC GATTTTGCGC TCGACCAACA AAGGCAATTC GTGGGCACGC ACCGACCTAC CCTTTAAACT TGGCGGTAAT ATGCCAGGCC GCTCGATGGG CGAACGTTTG GCGATTGACC CCAACAAAAA TAGCGTGCTT TACCTTGGTA CACGCGGCCA AGGCCTCTGG CGCAGCACTG ATTATGGCGC AACCTGGGCG CAAGTTACCA GTTTTCCGGC GATTGGCAAC TATACCGATC CCTACTTCAA TGACCGAATT GGGGTAGTAT GGGTAACGTT TGATCCACGG ACTGGCACTA GTGGCAGCGC TTCGCAAACG ATTTATGTTG GGGTTGCTGA TACGACCACC AGTATTTATC GCAGCACTGA TGGTGGGGCA ACTTGGGCAG CATTACCCGG CCAACCAACT GCTGGCTATT TACCCCATCA TGCTACCCTC GCCTCAAATG GCATGCTCTA CATTACCTAC AGCGATACAC CTGGCCCCTA CGATGGTGGC AAGGGTGATG TTTGGAAATA TAACACCGCC ACCCAAGCTT GGACGAACAT TAGCCCAATT CCTTCGAGCA GCGCCGACGA TTATTTTGGC TATGGCGGCT TGGCTGTCGA TGCGCAAAAT CCCAATACGT TGTTGGTAAC TAGCCTCAAT TCATGGTGGC CTGATGCCTT GATTTATCGC AGCACCGATG GCGGCGCAAC ATGGAGCGGT ATTTGGGAAT GGGCCAACTA TCCCGAACGC ACCTTCCGCT ATACCCAAGA TATTAGTACA TCGCCATGGC TCGACTTTGG TGGTAATGCA ACCGCACCTG AAGTTGCACC CAAATTAGGC TGGATGATCG GCGACATCGA GATTGATCCA TTCAACTCAG ATCGCATGCT CTATGGCACT GGCGCAACGA TTTATGGCAC CGATAATTTG ACCGCTTGGG ATACTGGCGG CAAAGTTGCG CTTGATGTGC GTGCCCATGG CCTTGAAGAA ACCGCAGTTC TCGAATTGGT CAGCCCACCA GAAGGCGTGC ACTTGGTGAG TGCCTTGGGT GATATTGCTG GCTTCTATCA TACCGATCTG ACTAAGGCTC CACCATGGTC GGCTAGCCTC AAATTTGGCT CATCGACCAG TATCGATTTT GCTGGAACCC GCCCGAATGT GATGGCTCGC ACAGGCTATG TTGGCGAAGG CAGCACGGTC AAACGGGTGG CGTGTTCATG GGATGGCGGC CAAAATTGGT CGCACGCCAG CAGTGAACCA ACTGGCTCGG TTGGTGGCGG AAAAATCGCC ATGGCTTCCG ATGCTTCGAT GATTGTTTGG AGCGGTACGA CTGCTCCAGT CAGTTACTCG ATGGGCTGTG GTAATAGTTG GTCGGCGAGT GCTGGTATTC CGGCTGGGGC AGTCGTGATT GCTGATCGGG TGAAATCGAC GACTTTCTAT 
GGCATCGCCA ATGGTACGCT CTATCGCAGC ACCGATGCAG CCCGCAACTT TACCGCTGTT GCAACGGGCT TGCCCACCAA GAGCGCCAAA ATCGAGGCTG TTTTGGGCAT CGAAGGCGAT TTGTGGTTGG CTGGCGGCCA AGATGGCTTG TTCCATTCAA CCGATGGTGG CACAACATTC AGCAAAATTG CTGGAGTGCA GGTTGCCAAT GTGGTTGGTT TTGGTAAGGC AGCACCCAGC CAAAGCTATC AAGCAATCTA CATTACTGGA GCAATTGATG GCGTAGAAGG CTTCTTCCGC TCAGATGATG GCGGCACAAC ATGGGTGCGC ATCAACGACG ATCAACATCA ATATGGCTCG ACCAACGAAA CTATTACTGG CGACCCACGG ATTTATGGGC GGGTTTATAT TGGTACGAAT GGCCGTGGGA TTATTTATGG TGATATTGCT GGCTCAACTC CGACCAATAC ACCAGCCCCA GCCACGGCGA CCGCAACTCG CATACCAACC GCAACTTCAA CCAGCGTACC ACCGACGGCA ACGAGTGTTC CGCCAACCGC GACCTCGACT CGCGTGCCAA CGGCGACTCC GCTGACTTCA ACTGCGATTC CCTATACGCC AACGCCAACC TCAACCCGTG TACCAACCGC AACCCCAATT ACGCCAACCA GTTTGCCATA CACGCCAACG TCAACTCCAG CGCCTGGTGG CTGCCGGATT AACTATGTGG TTAATCAGTG GAATACCGCA TTTACTGGCG GTGTGACAAT TACTAATTTG GGAGCGCCGA TTAGTGGTGG TTGGACGTTG ACTTGGAATT TTGCCAATGG TCAAACGGTG ACGAGCAGTT GGAATACTGT GCTGACCCAA ACTGGTTCAG CTGTGACTGC CAAGCATAGC GCCGATTGGA ATGCCAATAT TCCAACCAAT GGCACACAAA GCTTTGGCTT TATTGCTAGT CATAATGGCA CAAATGCCGT GCCAACCAAC TTTGTTTTGA ATGGCGTTGC TTGTAGCGTT GCGCCATAG
|
Protein sequence | MRMLKMAAAV VVLFAGVIQA MVAVPQSRAA STQPYNWSNV EIVGGGFVPG IIYNPTERDL VYARTDIGGA YRWNPTTKRW IPLTDWIGAT DWNWTGIDSL ATDPVDPNRL YLAAGTYTND WTSANGAILR STNKGNSWAR TDLPFKLGGN MPGRSMGERL AIDPNKNSVL YLGTRGQGLW RSTDYGATWA QVTSFPAIGN YTDPYFNDRI GVVWVTFDPR TGTSGSASQT IYVGVADTTT SIYRSTDGGA TWAALPGQPT AGYLPHHATL ASNGMLYITY SDTPGPYDGG KGDVWKYNTA TQAWTNISPI PSSSADDYFG YGGLAVDAQN PNTLLVTSLN SWWPDALIYR STDGGATWSG IWEWANYPER TFRYTQDIST SPWLDFGGNA TAPEVAPKLG WMIGDIEIDP FNSDRMLYGT GATIYGTDNL TAWDTGGKVA LDVRAHGLEE TAVLELVSPP EGVHLVSALG DIAGFYHTDL TKAPPWSASL KFGSSTSIDF AGTRPNVMAR TGYVGEGSTV KRVACSWDGG QNWSHASSEP TGSVGGGKIA MASDASMIVW SGTTAPVSYS MGCGNSWSAS AGIPAGAVVI ADRVKSTTFY GIANGTLYRS TDAARNFTAV ATGLPTKSAK IEAVLGIEGD LWLAGGQDGL FHSTDGGTTF SKIAGVQVAN VVGFGKAAPS QSYQAIYITG AIDGVEGFFR SDDGGTTWVR INDDQHQYGS TNETITGDPR IYGRVYIGTN GRGIIYGDIA GSTPTNTPAP ATATATRIPT ATSTSVPPTA TSVPPTATST RVPTATPLTS TAIPYTPTPT STRVPTATPI TPTSLPYTPT STPAPGGCRI NYVVNQWNTA FTGGVTITNL GAPISGGWTL TWNFANGQTV TSSWNTVLTQ TGSAVTAKHS ADWNANIPTN GTQSFGFIAS HNGTNAVPTN FVLNGVACSV AP
|
| |
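The relationship between the gene sequence and the protein sequence can be illustrated with a minimal translation sketch. The gene sequence is listed in coding orientation (it begins with ATG and ends with TAG) even though the gene lies on the minus strand, so no reverse complement is needed here. Translation table 11 assigns the same amino acids to codons as the standard code, which is what the codon table below encodes. The helpers `clean`, `translate`, and `gc_percent` are hypothetical names introduced for illustration; the database's own method for computing GC content is not documented in the record, so `gc_percent` is only one plausible definition.

```python
# Codon table built from the NCBI-style amino acid string, base order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def clean(seq: str) -> str:
    """Remove the spaces and line breaks used to group bases in blocks of ten."""
    return "".join(seq.split()).upper()


def translate(seq: str) -> str:
    """Translate codon by codon from the first base, stopping at the first stop codon."""
    dna = clean(seq)
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[pos:pos + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases (assumed definition of GC content)."""
    dna = clean(seq)
    return 100.0 * sum(base in "GC" for base in dna) / len(dna)


# First three blocks and last nine bases of the gene sequence in the record.
print(translate("ATGCGAATGC TAAAGATGGC CGCAGCAGTA"))  # MRMLKMAAAV
print(translate("GCGCCATAG"))                          # AP
```

The two outputs agree with the first ten residues and the last two residues of the protein sequence above. Passing the full gene sequence to `translate` and `gc_percent` would be the natural next check against the Protein Length and GC content fields.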