Gene Information |
Locus tag | Haur_3926 |
Symbol | |
ID | 5735787 |
Type | CDS |
Is gene spliced | No |
Is pseudo gene | No |
Organism name | Herpetosiphon aurantiacus ATCC 23779 |
Kingdom | Bacteria |
Replicon accession | NC_009972 |
Strand | - |
Start bp | 4919015 |
End bp | 4921951 |
Gene Length | 2937 bp |
Protein Length | 978 aa |
Translation table | 11 |
GC content | 51% |
IMG OID | 641281077 |
Product | von Willebrand factor type A |
Protein accession | YP_001546688 |
Protein GI | 159900441 |
COG category | [S] Function unknown |
COG ID | [COG5426] Uncharacterized membrane protein |
TIGRFAM ID | |
|
|
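The start, end, gene length, and protein length fields above can be cross-checked with simple arithmetic. The sketch below assumes the coordinates are 1-based and inclusive, and that the final codon is a stop codon not counted in the protein length. The variable names are illustrative and not part of the record.

```python
# Values copied from the Gene Information table.
start_bp = 4919015
end_bp = 4921951

# Assumption: coordinates are 1-based and inclusive, so both endpoints count.
gene_length = end_bp - start_bp + 1

# A CDS should be a whole number of codons.
codons, remainder = divmod(gene_length, 3)

# Assumption: the last codon is a stop codon and is not part of the protein.
protein_length = codons - 1

print(gene_length, remainder, protein_length)
```

Under these assumptions the computed values agree with the listed Gene Length (2937 bp) and Protein Length (978 aa).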
Plasmid Coverage information |
Num covering plasmid clones | 5 |
Plasmid unclonability p-value | 0.501815 |
Plasmid hitchhiking | No |
Plasmid clonability | normal |
| |
Fosmid Coverage information |
Num covering fosmid clones | n/a |
Fosmid unclonability p-value | n/a |
Fosmid Hitchhiker | n/a |
Fosmid clonability | n/a |
| |
Sequence |
Gene sequence | ATGGGGATTT CGTTTGTTGC GCCAAGCTAT TTATGGTTTT TATTACTACT TATACCCGTT ATTGCCTTAG GCTGGATGAA CGGGCGGCGT TTCCAACGTA GCCGGTTAAT TGGCTCATTG CTGTTGCGCA GTTTGTTGTT AATCAGCCTG ATTGGCTCAC TGGCAGGCGC ACAAATCATC TCTCCCGTGC AGCAACTCAC CACGATTTTT TTGGTGGATA CCTCAGATTC GATCACTCCC AATCAGCGTA GTTTGCAAGA TCAATTTATT GCTGATGCGC TGCAAACTAT GCCCAAAGAT GATCAAGCAG CGATTATTGT GTTTGGCCAA AATGCCTTGA TCGAGCGCTT GCCCTCGGAA GTACGCACGC TCAGTCGCAT TCAATCGGTA CCAATCGCCG CACGAACAGA TTTAGAACAA GCTATGACCC TGAGCTTTGC GCTGTTTCCG GCTGATACTC AAAAACGGGT AGTTTTGCTC TCTGATGGTG GCGAGAATAG CGGCCAATCG CTCAAAGCCC TTGAGTTGGC CAACGATCGT CAGATTGTCG TCGATGTGGT GGAAATTGCC CAAATTGGCG GAGCTGAGGT GGCAATTACT GCTTTGCGCA TGCCCGGCCA AGCCCGCATC GGCCAAGAAT TACAAATTGT AGCCCAAATT GATAGTAACG AAGCTCAGGA TGCTACCATT CGGATTTTGG TTGATCGTCA ATTATATGCC GAATCAAGCC TTAGCTTGCC CAAAGGCACC TTGGAATATA CCTCAACTGT GGTTTTGAAC GATCAGGGCT TTCACAAAGT TTCAGCCCAG ATTATTCCAA CTAATGATAT TCGCAAACAA AATAATGAAG CCACGGCTTT GGTTAATTTG CAAGGCCCGC CCAAAGTGCT GATCGTTGCC AACGATCCTG CCGATGCTGA AAATATTGCG CCAGCACTAG AGGCTGCCAA ATTACAGGTA ACTGTGGTTG GTCCAACTGG GCTTCCGACA ACTCTCGCTG ATTTAGCCGA TTACGAAGCA GTTGTCTTGG CGAATGTGCC TGAACGCCTG ATTGCCGATG AAGCGCAGCA AGCACTCCAA ACGTTTGTGC GCGATCTTGG GCGTGGCTTT GTGATGCTTG GCGGTGAAAA TAGCTTTGGC ATTGGGGGCT ATACCAGTAC CCCCATCGAA GAATTGCTGC CCGTAGAAAT GCAATTGCGC AATCGCGAAA AATACCCGCC AGTGAGCGTC GCGGTGATCT TTGATATCTC GGGCAGTATG AGCGAAGTCG TTGGTGGCCG CCAAAAAGTA ACCTTGGCCT CTGAAGGTGC GGCACGAGTA GTACAATTGC TGCGCGATTT TGATGAAATT ACCGTGTTGC CGTTTGATAG CGCTGTGCAA AACCAATATG GGCCAGTTGC TGGCTCGGAG CGTGAAGTTG CCCAAGGCGA GATTATCGCC CGTGGCGTGA CTGGCGGCGG CGGGATTAAT GTTCATGATT CGTTGGTTGC GGCTGGCAAT GTACTTAAAG GCCGCAATGC CCCGATTCGC CACATTATTT TGCTGGCTGA TGGCTCGGAT TCGCAGCAAC AAGAAAATGC TGTGCGTCTA ACCGATGAAC ATCGGCGTTT AGGCATTACC ACCAGCACGA TTGCGATTGG CAATGGTGGC GATGTGGGCT TTTTGAATAA TGTGGCGGTG GCTGGTGGTG GACGGCATTT TTTAGTTGAA GATGCATTAT CGTTGCCCGA TATTGTGCTG CAAGATGCCC AACTTTCGCT AGCCCCCTAT ATTGTTGAAA AAGCCTTCTT GCCTTTGCTT 
GGTAGCGATA GTGTGATTAT GGCCGAGTTG AATACCGCCA ATTGGCCGCA ACTGCTGGGC TATAACGGCA CGATGCCCAA GCAAAATGCC AATATGGTGT TGTGGGCCGA TGAAGATGCG CCGCTCTTGG CTCAATGGCA ATATGGCTTG GGTCGTTCAG TCGCTTGGAT GAGTGATATG AAAGGCAAAT GGGGGGCAAA TCTGGTGCGC TGGGATCAAT TCGAGCGCTT AGCGGCTCAA ATTGTTGGCT GGACACTACC TGTCATCTCC AATGAAACAA TTAGCATTAA TACAACGTTT GTTGGCCCAG AGATGGAAAT AATTTTAAAT GCTCGCGATG CCAATGATGC TCCATTAACT GGGTTGATGA TCGATGGCAA CGTGGTCAAC GATGGCGGGG TGCAATCAGG CTTGACCTTG GTTGAAGTGA GTGCTGGCAT CTATCAAGGC CGAATTGCCA GCCCTGGCGC GGGAACCTAC TTTTTGCAAT TGGGCGGACG CAATCGCGAT GGCCAAGTGG TGTTTCAAGA AACGGCAGGG GTAATCGTGC CCTATTCGCC TGAATATCGC CAAGGCCAAG CCAACCCCAA TTTGCTGGCG ACGATCGCTC AACGCAGCCA AGGCCGCGTT TTAACCGATG CGACCAAAGT TTTTGAGCAT TCACTTGAAT TGGTTACGCG AGCTACGCCG ATTCACTTTA GTTTGCTATT GGCCGCGCTG GCTTTCTTGT TGCTCGATAT TGCGGCGCGA CGCTTGCGCT TTGGTCAATT AGGCCAAGCT TTGGCCGGAG TGCGTCGTCG TCAAGCGACA CCTTCGCCAA CGATGGGCGA TTTAGCCGCC GCCAAGCAAC GTGCCCGTAA TAAAATGGGC CAAACTCAAA CAGCGGCTGA GCCTGCCCCG ATTACCCCGA AAAATACTCC AGTCTATCAA CCTGGGGCCA ATTATACCCC GCCAGTAGCT CAGCCCAAGC CCGAAGCAGC GCCTTCAGCT TCTACAAAGG TTGACCCAGC GCCCAAGCCG AGTATGCCAG CACCGACCCC AACCCAAGGG CCACCACCCA AAGCCGTTAA TCTTGATGAG ATTACTGACC CCTTAGAACG ATTACGTGCA GCCAAAAACC GTGCCCGACG GCAATAG
|
Protein sequence | MGISFVAPSY LWFLLLLIPV IALGWMNGRR FQRSRLIGSL LLRSLLLISL IGSLAGAQII SPVQQLTTIF LVDTSDSITP NQRSLQDQFI ADALQTMPKD DQAAIIVFGQ NALIERLPSE VRTLSRIQSV PIAARTDLEQ AMTLSFALFP ADTQKRVVLL SDGGENSGQS LKALELANDR QIVVDVVEIA QIGGAEVAIT ALRMPGQARI GQELQIVAQI DSNEAQDATI RILVDRQLYA ESSLSLPKGT LEYTSTVVLN DQGFHKVSAQ IIPTNDIRKQ NNEATALVNL QGPPKVLIVA NDPADAENIA PALEAAKLQV TVVGPTGLPT TLADLADYEA VVLANVPERL IADEAQQALQ TFVRDLGRGF VMLGGENSFG IGGYTSTPIE ELLPVEMQLR NREKYPPVSV AVIFDISGSM SEVVGGRQKV TLASEGAARV VQLLRDFDEI TVLPFDSAVQ NQYGPVAGSE REVAQGEIIA RGVTGGGGIN VHDSLVAAGN VLKGRNAPIR HIILLADGSD SQQQENAVRL TDEHRRLGIT TSTIAIGNGG DVGFLNNVAV AGGGRHFLVE DALSLPDIVL QDAQLSLAPY IVEKAFLPLL GSDSVIMAEL NTANWPQLLG YNGTMPKQNA NMVLWADEDA PLLAQWQYGL GRSVAWMSDM KGKWGANLVR WDQFERLAAQ IVGWTLPVIS NETISINTTF VGPEMEIILN ARDANDAPLT GLMIDGNVVN DGGVQSGLTL VEVSAGIYQG RIASPGAGTY FLQLGGRNRD GQVVFQETAG VIVPYSPEYR QGQANPNLLA TIAQRSQGRV LTDATKVFEH SLELVTRATP IHFSLLLAAL AFLLLDIAAR RLRFGQLGQA LAGVRRRQAT PSPTMGDLAA AKQRARNKMG QTQTAAEPAP ITPKNTPVYQ PGANYTPPVA QPKPEAAPSA STKVDPAPKP SMPAPTPTQG PPPKAVNLDE ITDPLERLRA AKNRARRQ
|
| |
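The relationship between the gene sequence, the GC content field, and the protein sequence can be illustrated with a short Python sketch. It assumes the gene sequence is shown in coding orientation (it begins with ATG even though the gene lies on the minus strand), and it uses the codon assignments of translation table 11, which are the same as those of the standard code. The helper names `clean`, `gc_content`, and `translate` are hypothetical and not part of any database tool.

```python
from itertools import product

# Codon assignments for translation table 11, in NCBI TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def clean(seq):
    """Remove the spaces and line breaks used to group bases in the record."""
    return "".join(seq.split()).upper()

def gc_content(seq):
    """Return the fraction of G and C bases in the sequence."""
    s = clean(seq)
    return (s.count("G") + s.count("C")) / len(s)

def translate(seq):
    """Translate codon by codon, stopping at the first stop codon.

    This sketch does not handle alternative start codons (GTG, TTG),
    which table 11 allows; this record's sequence starts with ATG.
    """
    s = clean(seq)
    protein = []
    for i in range(0, len(s) - 2, 3):
        aa = CODON_TABLE[s[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

# First 30 bases of the gene sequence, copied from the record.
prefix = "ATGGGGATTT CGTTTGTTGC GCCAAGCTAT"
print(translate(prefix))  # MGISFVAPSY, the first 10 residues of the protein
```

Passing the full gene sequence to `gc_content` and `translate` would let a reader compare the results against the GC content and Protein sequence fields.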