Gene Information |
Locus tag | Haur_0437 |
Symbol | |
ID | 5732336 |
Type | CDS |
Is gene spliced | No |
Is pseudo gene | No |
Organism name | Herpetosiphon aurantiacus ATCC 23779 |
Kingdom | Bacteria |
Replicon accession | NC_009972 |
Strand | - |
Start bp | 509661 |
End bp | 512714 |
Gene Length | 3054 bp |
Protein Length | 1017 aa |
Translation table | 11 |
GC content | 51% |
IMG OID | 641277563 |
Product | peptidase M4 thermolysin |
Protein accession | YP_001543216 |
Protein GI | 159896969 |
COG category | [E] Amino acid transport and metabolism |
COG ID | [COG3227] Zinc metalloprotease (elastase) |
TIGRFAM ID | [TIGR01451] conserved repeat domain |
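The coordinate and length fields above can be cross-checked against each other. The following is a minimal sketch, assuming the Start bp and End bp values are 1-based and inclusive, and that the gene length includes the terminal stop codon; the function name `check_record` is hypothetical and not part of any database API.

```python
def check_record(start_bp, end_bp, gene_length_bp, protein_length_aa):
    """Cross-check the coordinate and length fields of a CDS record.

    Assumes 1-based inclusive coordinates and a gene length that
    includes the stop codon (assumptions, not stated in the record).
    """
    span = end_bp - start_bp + 1            # inclusive coordinate span
    codons, remainder = divmod(gene_length_bp, 3)
    return {
        "span_matches_length": span == gene_length_bp,
        "length_is_multiple_of_3": remainder == 0,
        # One codon is the stop codon and encodes no amino acid.
        "protein_length_matches": codons - 1 == protein_length_aa,
    }


# Values taken from the Haur_0437 record above.
result = check_record(509661, 512714, 3054, 1017)
print(result)
```

Under these assumptions all three checks hold for Haur_0437: the span is 3054 bp, which is 1018 codons, leaving 1017 amino acids after the stop codon.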
|
|
Plasmid Coverage information |
Num covering plasmid clones | 0 |
Plasmid unclonability p-value | 0.00224619 |
Plasmid hitchhiking | No |
Plasmid clonability | unclonable |
| |
Fosmid Coverage information |
Num covering fosmid clones | n/a |
Fosmid unclonability p-value | n/a |
Fosmid Hitchhiker | n/a |
Fosmid clonability | n/a |
| |
Sequence |
Gene sequence | ATGCGTTCGA TGCGAATGCT GCGTTTGCTA GCAGTATTAA TTGTTGTGAG CATGCTGCCG TTAAGCCGGA GCAATGCCCA AACGCCCACC ACCAATTCGC CCATCAATCG CTTGCTCAGC CAAGGAGCCA ATGTTGCTCG CCACAGCCAA ACTGGCTATG TGCGCTGGAT TGGCGCAGCC GCTGGCACAA CCCTCGATCA ACCAGTAGGC TTGAATCGCG CTGATACGCT CAACATTGCC AATGCCTATT TGGCTGAATA TGGCAATTTG TTCGGCTTAG CTCAAGCCAG CGATGCCACC GTTGCCAAAA CTAGCACCAC CCAAGATGGT CGCCAGGTTG TGCGTTATAA ACAGCGCTAC CAAGGCGTGC CAGTCATTGC TGGCGAATTG CAAGTTCAAG TCAATCAGGC TGGTGAATTA CTGAGCATCA ACGGCGAAGT GTTGCCAAAT CTCAATTTAA ACACCAACCC CAAAATTGAC CAAGCCAAGG CGATTAATAC CGCCCAAACC TATGTTGCCA CCAAATACGA TGTCAATTTG GCAGCGTTGA ACGTGAGCAA AAGCGAATTA GCAATTTATA ACCCAGCTTT GTTGGGTGGC ACAGGCTTAC AAAAAAGCAG CCTGGTTTGG TTTGTTACGG TTGAATCAAC CAGCAGCAAC CCGATTCGCG AATATGTGTT CGTTAACGCC CAGTCAGGCG AAATTAGCTT AGCCTTCAAT CAAATTGGCT TTGCTAAAAA TCGCTTCGTT TGTAACGATA ACAACGTGGT CGATAACGAC GATAATCCCA ATAACAACTG TGATCAACCA AGCGAATATG TGCGCACCGA AGGCTCAGCC GCTAGCAACG ATGCTGATAT CGATGCCGCT TACGATTATT CGGGTGATAC CTACGACTTC TTCAACTTGT TCTTTGGCCG TAACAGCATC GACAACAACG GCATGAACTT GATCTCACTG GTCAAGCACT GCCCACCTGG CGCGGGCTGC CCGTATGGCA ACGCCTTCTG GAATGGTCGC CAAATGACCT ATGGCCAAGG CTTTGCTGCT GCCGACGATG TGGTTGCCCA CGAATTGTCG CACGGGGTAA CCGAATATAC CTCGAACCTG TTCTACTACT TCCAATCAGG TGCAATCAAC GAAGCCATGT CGGATATCTT CGGCGAATTT ATGGATCAAA CCAATGGTAG CGCCGACGAT GACGAAACTA CCCGCTGGAT TATGGGCGAG GATTTGGGCG GGATTCGCAA TATGCAAGAC CCAACCCAAG GTGATCAGCC CGACCGCATG TTGAGCGAAT TGTATGTGCT CGATCCTAAT CTGCTTGATA GTGGTGGGGT GCACAGTAAC AGTGGTGTCG CCAATAAGGC TGCCTATCTT TTAAGCGATG GCGATACGTT CAACGGCCAA ACCATCGCCT CAATTGGCAT CACCAAAACT GCCCATTTAT TCTATGAAAC CCAAATTAGC TCGTTAACTT CGGGCAGCGA TTATGCCGAC CTTGCTTCGG CCTTGCGCCA AGCCTGTACC AGCCTGACAG GCAGCCATGG GATTACCGCC GCCGATTGTG TTGAAGTCAA TAAAACCATT TTGGCAACCG AAATGGATCT TCAGCCAACC AACGCGCCAG CCCCTGATAT TGCTGCTTGT AGCCCTGGCC AAACCCAAAC GACCGTTTTC AACGAAGATT TTGAGGCCTT GCCCAATAAT CGTTGGAGTG CAAGCGCCTT GAGTGGCAGC ACCGAAACTT GGACTACCAA TCCAGCGCCA CTCGTTGGGA CATATGCAAC TAGCGGTAGT 
GGCTCAGCAA CCAACTATAA CGGTGTGTAT GGGTGGAATC AGGCGGTTGA AGAATCGGCC TTTACCCAAA ATGCGAGTGT CACGGTTCCC GCCAATGGCT TTGCCTACTT CCGCCATGCC TATGATTTCT ATGCGCCAAT CGATGCGGGC GTGGTGGAAT ATAGCACCAA CAATGGTAGT ACTTGGCAGT CAGCTGCTGG TTTGTTTAGC TTCAACGGCC CAACTAATAT CGCCGAAGAT GGCGCGTTTG GCGGCGAGCA AGCCTATGTT GGCAACAGCA ATGGCTATAA CGCCAGCCGC TTGAACTTAG CCAGCTTGGC TGGGCAATCG ATCCGCTTCC GATTCAAGGT CGCGCTTGGC GATACCCCAA CGCCAAATCG CTCAGTTACC TGGTCGATCG ACGACTTTGA AATTTACACC TGTAATGGCC CAGCCGCACC AGCCGCGCCA AGTTTGTTCC TTAGCAGCGA CGTATTGGTG CAACAAGGCA AGCAAACCAG CGCGACCATT GGCACTGCCT CGGATGATCT TGATTTGGCG GGTGCGTTGA CAGTCAGTAC CGCTGGCGCA CCAACAGGCA TGAACGTTAG CATCAGCAAC CAAGCTGGGA TCTTGAAGGC TAACGTGAAT TGTGCCTGTA GCAGCGCAGT TGGCACCTAC CCGATTACCG TAACTGTGCG CGATAGCGGC AACCTGACAG CCAGTCAAGT ATTTAATATC GAGGTTGAGC CAGCTGGCCA AAGCCTGATT AATGGTGGCT TCGAGCTTGG CACCAATTGG CAAGCCTTCT CAAACTCATT TAGCGATGTT TCGCCGCCAT TGTGTTCATT GCCTGGTTGT GCTGGCGCAC CACGGACTGG CACAAGTTGG TTGCGCTTTG GCGCACGCTC AGGCACGACC GAAACAGCCT TTGCCCAACA AACCTTCACG ACAACCGCAG GCGATGCAAC GCTGGAATTC TACCTCTCGA TTTACGCGCA TAACGGTCGG GGAATTCAAG ATTACGTCAA GCTTAGCCTT GATGGTGAGG AAATTTTCCG GGTTAGCGAC GCAAATACCG AATATGACAA TGGCTATGCC AAAGTCACGA TTCCGTTGAC CGATCTGAGT GCCGGCGAGC ATGTCTTGCG CTTCGATTCA GTCAATAACT CAACTGCCAA CATTGTGCGC TTTAGTATCG ATGATATCAG CTTCGTAACT GAAACCAACC AGTGCCGTGG CTCGAATCTC TATCTACCAA TGATCACCAA GTAG
|
Protein sequence | MRSMRMLRLL AVLIVVSMLP LSRSNAQTPT TNSPINRLLS QGANVARHSQ TGYVRWIGAA AGTTLDQPVG LNRADTLNIA NAYLAEYGNL FGLAQASDAT VAKTSTTQDG RQVVRYKQRY QGVPVIAGEL QVQVNQAGEL LSINGEVLPN LNLNTNPKID QAKAINTAQT YVATKYDVNL AALNVSKSEL AIYNPALLGG TGLQKSSLVW FVTVESTSSN PIREYVFVNA QSGEISLAFN QIGFAKNRFV CNDNNVVDND DNPNNNCDQP SEYVRTEGSA ASNDADIDAA YDYSGDTYDF FNLFFGRNSI DNNGMNLISL VKHCPPGAGC PYGNAFWNGR QMTYGQGFAA ADDVVAHELS HGVTEYTSNL FYYFQSGAIN EAMSDIFGEF MDQTNGSADD DETTRWIMGE DLGGIRNMQD PTQGDQPDRM LSELYVLDPN LLDSGGVHSN SGVANKAAYL LSDGDTFNGQ TIASIGITKT AHLFYETQIS SLTSGSDYAD LASALRQACT SLTGSHGITA ADCVEVNKTI LATEMDLQPT NAPAPDIAAC SPGQTQTTVF NEDFEALPNN RWSASALSGS TETWTTNPAP LVGTYATSGS GSATNYNGVY GWNQAVEESA FTQNASVTVP ANGFAYFRHA YDFYAPIDAG VVEYSTNNGS TWQSAAGLFS FNGPTNIAED GAFGGEQAYV GNSNGYNASR LNLASLAGQS IRFRFKVALG DTPTPNRSVT WSIDDFEIYT CNGPAAPAAP SLFLSSDVLV QQGKQTSATI GTASDDLDLA GALTVSTAGA PTGMNVSISN QAGILKANVN CACSSAVGTY PITVTVRDSG NLTASQVFNI EVEPAGQSLI NGGFELGTNW QAFSNSFSDV SPPLCSLPGC AGAPRTGTSW LRFGARSGTT ETAFAQQTFT TTAGDATLEF YLSIYAHNGR GIQDYVKLSL DGEEIFRVSD ANTEYDNGYA KVTIPLTDLS AGEHVLRFDS VNNSTANIVR FSIDDISFVT ETNQCRGSNL YLPMITK
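The relationship between the gene sequence and the protein sequence can be illustrated by translating the first and last few codons. This is a minimal sketch, assuming the gene sequence is already given in the coding orientation (the record lists the gene on the minus strand, but the displayed sequence starts with ATG and ends with TAG). It uses only the codon-to-amino-acid assignments that translation table 11 shares with the standard code, and does not handle table 11's alternative start codons. The helper name `translate` is hypothetical.

```python
from itertools import product

# Codon assignments in TCAG order; table 11 uses the same assignments
# as the standard code and differs only in permitted start codons.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a coding-strand DNA string, stopping at the first stop codon."""
    dna = dna.replace(" ", "").upper()
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":          # stop codon: end of the protein
            break
        protein.append(aa)
    return "".join(protein)


# First 30 bases and last 21 bases of the gene sequence above.
print(translate("ATGCGTTCGA TGCGAATGCT GCGTTTGCTA"))
print(translate("CTACCAATGATCACCAAGTAG"))
```

The first call reproduces the start of the protein sequence (MRSMRMLRLL) and the second reproduces its last six residues (LPMITK), with the final TAG codon read as the stop.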