Gene Information |
Locus tag | Haur_1231 |
Symbol | |
ID | 5733139 |
Type | CDS |
Is gene spliced | No |
Is pseudo gene | No |
Organism name | Herpetosiphon aurantiacus ATCC 23779 |
Kingdom | Bacteria |
Replicon accession | NC_009972 |
Strand | + |
Start bp | 1425804 |
End bp | 1428953 |
Gene Length | 3150 bp |
Protein Length | 1049 aa |
Translation table | 11 |
GC content | 50% |
IMG OID | 641278371 |
Product | NHL repeat-containing protein |
Protein accession | YP_001544007 |
Protein GI | 159897760 |
COG category | [G] Carbohydrate transport and metabolism |
COG ID | [COG2133] Glucose/sorbosone dehydrogenases |
TIGRFAM ID | |
|
|
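The coordinate and length fields above can be cross-checked with simple arithmetic. The sketch below is a minimal illustration, assuming the start and end positions are 1-based and inclusive (which is consistent with the listed gene length) and that the final codon is a stop codon not counted in the protein length. The variable names are illustrative and not part of the record.

```python
# Values copied from the Gene Information table.
start_bp = 1425804
end_bp = 1428953
gene_length_bp = 3150
protein_length_aa = 1049

# Assumption: coordinates are 1-based and inclusive, so span = end - start + 1.
span = end_bp - start_bp + 1

# An unspliced CDS is read in triplets; the last codon is assumed to be the
# stop codon, which does not contribute an amino acid.
codons, remainder = divmod(span, 3)
expected_protein_length = codons - 1

print(span, remainder, expected_protein_length)  # 3150 0 1049
```

Under these assumptions the span matches the listed gene length, divides evenly into codons, and yields the listed protein length.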
Plasmid Coverage information |
Num covering plasmid clones | 0 |
Plasmid unclonability p-value | 0.001634 |
Plasmid hitchhiking | Yes |
Plasmid clonability | hitchhiker |
| |
Fosmid Coverage information |
Num covering fosmid clones | n/a |
Fosmid unclonability p-value | n/a |
Fosmid Hitchhiker | n/a |
Fosmid clonability | n/a |
| |
Sequence |
Gene sequence | ATGATCAAAC GACAGTTTCA TGCGCGGGTG GTTGTGATGA TAGTAATCTT CAGTCTGTTG GCTGGTGGAT TGCTGTTGCA ACCAACGATC CATGATGCAG CCGCCCAAAC GCGCTCAACA GTTCAATATA GTTATAATCC TGAGCGGCAA CATGTGCATT CTGGGAGCAT TAGTTTTTTT CGCCAAACGA TTTTTAATTC AAGTTCGGCA TTAACTGGCG CTTCGTTGTT CGATAGCCCA ACCGGGATGA TCTTCGGCCC TGATGGGCGT TTGTATGTTA GCCAATTTAA TGGGCGAATT TTTGCAATTA CGCTCAACCC AACCAACCAT CAAGCTACCG CCGTTCAGGT GATCGAAACA ATCTACAATT ATCCTAATAC TAACGATGAT GGTTCGCCCG CCACGGGCGT AAATGGTCGT CAGTTGATTG GGATTTTGTT TGATCCGGCT TCAACGCCAT CGAATATGAT TTTGTATGTT TCGCATAGCG ACCCGCGCTA TGGCTACAAC AGCGACACAA CCTCGCAAAA AATCAACACC CGCTCAGGCA CAATTAGCAA ATTGGTCAGC ACCAATAATT TTGCCACCTC AACCGATATT ATTAAGGGTT TGCCGCGCTC ACGTGAAAAC CACAGCACCA ACAGCATTGT TTGGGGGCCA GATGGCTGGA TGTATATTTC GTCGTCGAGC AATACCAACA ATGGTGCGCC AAGCGCTCAC TTTAGCAATT TGGCCGAATA TTATCTCTCG GCGGCAATTT TGCGAGCCAA TGTGCGCGAT GCTGGCTTTC CTTCAGCAGG CATTGATGTC AAAAATGTCA ATAATGCGGC GGCGTTAGCA CCATTTGCTG GCCAATTTGA GATGTATGCG ACTGGCTATC GCAATGTTTA TGATTTTATT TGGCTGAATA ATAAGTTTTA TGCCAATGTC AATGGTGGCA ATGATGGCTA TGGCGATACG CCGAGCGCCG CTCAATGCCC TGGTGGGGTG GAAATTAACC CACGCTATTT GCCCGATACC CTGAAAATTG TGACCCAAGG TTCGTATGGT GGCCATCCCA ACCCAGCGCG TGGCGAATGT GTTTTGAACA ATGGGAATGC CCAAGGCTTG CCAACGCCAC CAAACTACAA TCCACCAGCC TTGACCTATG ATTTCCGCCC ATCCACGAAT GGGATTACGG CGTATATCAG TAATGCCTTT GGCGGTGCGT TGCAAGGCAA TTTGATTGTC GCAACCTATG GCCAAAATCA AGATGTCAGC CGCGTTCAGC TTGATGCCAA TGGCAACCCG ACTGGCATCC AAACCTTGGC AAAGTTCAAC CAACCGCTTG ATGTGGTGAC CGATGCCAAG GGTGTGATTT ATGTGGCCGA ATTTGGCAGC GATGCGATTA CCATCCTTGA GCCAGAAGAT TTGCCCAATT GTCCGCCGCC GCCGCCCTAT TCGAGCATGG TCGATAGCGA TAATGATGGC TACTCGGATT TTGATGAAGA TCAAAATGCA ACCAATTTGT GTAGCCGAGC CAGCGTACCC AACGATTTCG ATGGCGATAT GATTTCCGAT TTGACCGATC CCGACGACGA TAACGATGGC ATCCTTGATG TCAATGATCA GTTGTATTTT GATCCAGCGA ATGGCAATAA CACCAAATTG CCCGTTGAAT TCAATTGGAA TCCTGGTGAT GCACCGTTGG GCAAAGTTGC TAATACGGGC TTTACTGGGG TGCAAATTAC CAATGGAGTT ACGCGAACTG ATGAGGTTAA GATTGCAGTT GGCGCTGCTG GTGGCTTTAC CAACATCATC 
ACCACCGAAG GCACAAATGC TGGTGCGATT AATACCCAGG CCAACGCTTT GCAAATTGGC TTTGATCCCC ATCGCAACTT CCGAGTTGAA ACGCGGATCA CTGAAATGTT CCTTGGGCGC ACGCCTGAGG GCGATCAAGC GGCGGGCTTG TTCTTCGGCC CCAACCAAGA TAACTATGTG AAATTGGTGG CGACGGCCAA TTTGAATGGC GATGGCACAA GTGGCGACAC CGGGCTGCAA TTTGCCATGG AAGTGGCCGG AAACTTGCAA ATCAATCCAA CCAATAATAA TCCAAGCATG GTATTGCCAG GAGTGAGCAA TATCGACCTG CGCTTAGACG GCAATCCTGC TGCGCAGACC ATGACCGCCT ATTATAAAAC TGATGGTGGC AACTGGACAG CAATTGGCAC GATCAATGTG CCTTTGTGGT TCTTTAATAC CGGAACACCA GCAGGAATTT TGGCAACTAA TCATGGAACA CCAAATACCA TCTCATTTAT TTTCGATTTC TTCCGTTTGA GTTATCGCGA TATTGTGGCG CGGATCAACG GCGGTGATCG CGATTGGACG ACCAACGACG GAGCTGCTTG GAGTCCTGAT ACAAGTGCCA ACCGTCCATA TTCAACTGGC GGCGGCACCT ATGTCTTGAT TCCAGGCGTT TCGTGTCCCC AAATCTACAA CGTGACTCGT GACTTTGATC GGTTGTACTG CTCTGAACGC AACACCAGCG GCTCGGCGGT GCAATACACA ATTCCGGTCA GCAGCACAGG AACGTATGCG GTGCGCTTGC ACTTTGCCGA ACTGTATTGG GGCACGAATG GTCGGCCTGG GCCGAACCAG CGTAAATTCA ATGTTCAGCT TGAAGGCACG ACTGTGTTGA CCAATTTCGA TATTTTCAGC GAAACTGGCG CAACCTTCCG GCCAATTACC AAGACCTTTA CCACCAATAT TACTGATGGT GCGGTCAATA TTAGTTTGCC CACGGGCCAG CCTGGCAATG TTGATCAAGG CAAGATTTCG GCGATTGAGG TTTGGGGGCC AATTCTCAAT ATTGCCAATG TTACGCCAAC GGTCACGCCA TCGCCAACGG CAACCAATAC GCCATCGCCG ACGGCAACCA ACACGCCAAC TAATACCAAT ACGCCAACAA TCACGAATAC ACCGACCAAT ACCAATACGC CATCGCCAAC ATCGACTGGC ACGGTGACGG CGACCTTTAC ACCGACCAAC ACGGCCACGC CAAGTAATAC GCCGACCAAC ACGCCAACCG AAACCGTAAC GCCGAGCGCT ACGGCAACCC AAGGCACGGC GACCTATCGA ATTTGGCTGC CGTACACTGT CAAAAACTAA
|
Protein sequence | MIKRQFHARV VVMIVIFSLL AGGLLLQPTI HDAAAQTRST VQYSYNPERQ HVHSGSISFF RQTIFNSSSA LTGASLFDSP TGMIFGPDGR LYVSQFNGRI FAITLNPTNH QATAVQVIET IYNYPNTNDD GSPATGVNGR QLIGILFDPA STPSNMILYV SHSDPRYGYN SDTTSQKINT RSGTISKLVS TNNFATSTDI IKGLPRSREN HSTNSIVWGP DGWMYISSSS NTNNGAPSAH FSNLAEYYLS AAILRANVRD AGFPSAGIDV KNVNNAAALA PFAGQFEMYA TGYRNVYDFI WLNNKFYANV NGGNDGYGDT PSAAQCPGGV EINPRYLPDT LKIVTQGSYG GHPNPARGEC VLNNGNAQGL PTPPNYNPPA LTYDFRPSTN GITAYISNAF GGALQGNLIV ATYGQNQDVS RVQLDANGNP TGIQTLAKFN QPLDVVTDAK GVIYVAEFGS DAITILEPED LPNCPPPPPY SSMVDSDNDG YSDFDEDQNA TNLCSRASVP NDFDGDMISD LTDPDDDNDG ILDVNDQLYF DPANGNNTKL PVEFNWNPGD APLGKVANTG FTGVQITNGV TRTDEVKIAV GAAGGFTNII TTEGTNAGAI NTQANALQIG FDPHRNFRVE TRITEMFLGR TPEGDQAAGL FFGPNQDNYV KLVATANLNG DGTSGDTGLQ FAMEVAGNLQ INPTNNNPSM VLPGVSNIDL RLDGNPAAQT MTAYYKTDGG NWTAIGTINV PLWFFNTGTP AGILATNHGT PNTISFIFDF FRLSYRDIVA RINGGDRDWT TNDGAAWSPD TSANRPYSTG GGTYVLIPGV SCPQIYNVTR DFDRLYCSER NTSGSAVQYT IPVSSTGTYA VRLHFAELYW GTNGRPGPNQ RKFNVQLEGT TVLTNFDIFS ETGATFRPIT KTFTTNITDG AVNISLPTGQ PGNVDQGKIS AIEVWGPILN IANVTPTVTP SPTATNTPSP TATNTPTNTN TPTITNTPTN TNTPSPTSTG TVTATFTPTN TATPSNTPTN TPTETVTPSA TATQGTATYR IWLPYTVKN
|
| |
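The sequences above are displayed in space-separated blocks of ten characters, so the spacing has to be removed before any length or composition check. The following is a minimal sketch, not the database's own method: `clean_sequence` and `gc_fraction` are hypothetical helper names, and the example input is only the first three blocks of the gene sequence, so its GC fraction is not the whole-gene value listed in the record.

```python
def clean_sequence(display_text: str) -> str:
    """Remove the whitespace used for display blocks and upper-case the result."""
    return "".join(display_text.split()).upper()


def gc_fraction(seq: str) -> float:
    """Return the fraction of bases in seq that are G or C."""
    if not seq:
        raise ValueError("sequence is empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


# First three display blocks of the gene sequence, copied from the record.
first_blocks = "ATGATCAAAC GACAGTTTCA TGCGCGGGTG"
seq = clean_sequence(first_blocks)

# Length of the excerpt, its first codon, and its GC fraction.
print(len(seq), seq[:3], gc_fraction(seq))
```

To compare against the listed GC content, the full gene sequence would be passed through `clean_sequence` in place of the excerpt. The cleaned length should then equal the listed gene length of 3150 bp.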