Gene Information |
Locus tag | Tpen_1361 |
Symbol | |
ID | 4602198 |
Type | CDS |
Is gene spliced | No |
Is pseudo gene | No |
Organism name | Thermofilum pendens Hrk 5 |
Kingdom | Archaea |
Replicon accession | NC_008698 |
Strand | - |
Start bp | 1315699 |
End bp | 1318458 |
Gene Length | 2760 bp |
Protein Length | 919 aa |
Translation table | 11 |
GC content | 67% |
IMG OID | 639774136 |
Product | hypothetical protein |
Protein accession | YP_920761 |
Protein GI | 119720266 |
COG category | [R] General function prediction only |
COG ID | [COG1353] Predicted hydrolase of the HD superfamily (permuted catalytic motifs) |
TIGRFAM ID | [TIGR02578] CRISPR-associated protein, Csm1 family |
|
|
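The coordinate and length fields above are mutually consistent, which can be checked with a short sketch. It assumes that Start bp and End bp are 1-based inclusive coordinates and that the CDS ends in a single stop codon that is not counted in the protein length; the function names are hypothetical and not part of the record.

```python
# Values copied from the Gene Information table.
START_BP = 1315699
END_BP = 1318458


def gene_length(start_bp: int, end_bp: int) -> int:
    """Length of a feature given 1-based inclusive coordinates (assumption)."""
    return end_bp - start_bp + 1


def protein_length(gene_len_bp: int) -> int:
    """Residue count for an unspliced CDS, assuming one trailing stop codon."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_len_bp // 3 - 1


length_bp = gene_length(START_BP, END_BP)   # 2760, matches Gene Length
length_aa = protein_length(length_bp)       # 919, matches Protein Length
```

The strand field ("-") does not affect this arithmetic; it only indicates which strand of the replicon carries the coding sequence.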
Plasmid Coverage information |
Num covering plasmid clones | 7 |
Plasmid unclonability p-value | 0.339781 |
Plasmid hitchhiking | No |
Plasmid clonability | normal |
| |
Fosmid Coverage information |
Num covering fosmid clones | n/a |
Fosmid unclonability p-value | n/a |
Fosmid Hitchhiker | n/a |
Fosmid clonability | n/a |
| |
Sequence |
Gene sequence | ATGCGTCCAG GCCTGATACT GCTCTACGAG GCCGGGCTAG CGCTTTCGCG CCACGCGAAG AGAGCGGCGG AGGGCAGGGA GAAGTGGAGG TGCAGGGAGC TCGGAAGCGA GAAGGAGCTC GCAGACCCGA AGACAGCCGG GGCTAAGCTT GCAGAGTGGT TCGCGAGGAG GGCTGGGCTC CGCGGAGACC TCCAAGGCAC GGCCGAGGAG GCCCTGAGCC TAATAGAGAG GGTGGACCCG CTGTCCGCCG GCGCCTTCTC GGGCGTGGAG GAGTACGCGG GGAGAGTGCG GGAAGTCTGG GCGGGCGTCT CCGACCCCTG GGACGCGCCG CTACTCAGCC CGCTCCGGAT CCTCGAGAAA ATGGGCTACC GGGAGCTCCT GTCCGGCTCC AGGGCCGAGC TCGACGCGGA GGCTCCGAGG AGGCTCCTTG GAGACCCGGA GGAGCTGAAG AGGTACGTCG GGGAGGCGAA GGGCAGTAGG TCGTGGCTCG CGGCGAGACC CCTCCCGCGG GAGTTCACGG AGATCCTCGA AGCCTTGAAG CTACGCTCCT TCGAGGAGGC AGTCAGAGGA AGCGACTACG GCGAGGTCGC CGAGGCGCTA CTCTCCATGC TCGGGAGGGC GGCGGAGTTC TACGAGGCTC TCGGCTCGTG CAACGGCGTC GACGACACGC TTGAAAGCGT GGTTTCGAGC ACCCTAGCGC TCGTGCCCTC GCACCCGCTC CTGCCCTCCG TGCCCCTGCC GGCTCTGGCG AGGCTGCACG CCGCCCTGGC GTCCGCTGGA GAAGGGTTCG TCCTCCTAGG CCTGGACGTC AACGGGATAC AGGACTTCGT CTACGCGCCG GTAAGGGAAA GCGCGGCCTC GAGGGTGCTG AGGGGCAGAA GCCTACTGGT GGAGCTCGTG CAGTTCACAG CGACCAGGAT CGTGCTCGAG CTCGCCGGGG CGACGTGGTC GGCCCAGCTC AGCAAGGAGG GCGGAGCCCC CACCTTCGTG CTACCCGGCT CGTTCAGGGA CAAGGTCGAC GCGCTCAGGG AGATCCTGGG AAGGTGGATG GCCAGGCAGT TCAAGGGCTC GCTGTGGCTC ACGGTGGCGT GCTCCGACCC CGCGCCGACA GCCAACGCCG CGAATCCTGG AGGCGCGTTC GTAGCCGTCT CCGAGAGCTA CTCAACCGCC GTGGCCTTCT CGAAGGACTC GCGCTTCTCC GGGTTCGCGG CGGAGCTCGC GGCCCTCGGG AACGAGGACG TCGAGGACCT CGACTCGCTG ACCAGGGAGC CCGTCCTAAA GGGCGACGGG CTCAAGCTCA GGGTGGAGCG GAGCACCAGG GACTACGCGG AGAGCCTAGC ACCCGACAAG CTCTCCGACG GGGACCTGCT CGGAGGGCTC ACGCACATCA GCCTGGCCTG CGGAAGCGCC GCGAGGAACC TTGCGGCCGT AGTCTCGCTC TACTTCTTCA AGGGCGGCAG GCCCTGCCGC GAGGCAGCGG AAGCCGTGCT CGGCGCCGTG TTCAGGGAGC GGGGCAGAAG GCTCTACTTC GTCCTCGAAG GGGACCAGCC CCTCGCCGTC GCGCTCGCAC CCTTCACGGA GGCGGGGGCG ATCCACGTCC TCGTGGCGCA CACGAGGCGG GAGCTCGCCA AGGCGGAGGA GAACCTGAAG GCCGCGGAGA CCCTCCTCAG GGTCATGAAG GGGTACGTGA AAGGGGCTGC GGGGCTCGCC GAGAAGATCC ACGTGGAGGT GAGGGTCGTG AACGCCCCGC GCGACTTCAT ACCCTCGGCG GGCGTCGCGG AGCACTACAG GGGGTTGCCC 
GTGGCCTTCT CATTCGCGCC CTACTACACC AACCCGTACC ACCCGGTGAG GGCCGAGGAG GAGGCGCGGC CGATAATGGG AGGCGAAAGG CGCGTCGCCG AGAAAGTGCG CTTGCTGGAC CTGGACGACA TGGGCGTGAT CGCCGTCTCC CTCCTTGACG CAGACAAGAT GGGGGACGTG TCGAAGTCCC TAGGCGCCTA CCCCGCTATC TTCTCCGCGG TCTCCGACTT CCTCTCAATG GGGTTCGGCG TAAAGGCGTA CTCGGCCGTC CTAGCAGACG CGAAGAAGAG GGAGGGGGGA AGGGGGATCC TCCTCTACGC CGCGGGAGAC GACCTAGCGG TCTACGGCGA GTGGCACGAC GTGCTCAGGA TACTCGACAT CGTGAGGAGG GAGGTCCTCG AAGGGCTCCT CAACCCCCTC GCCGCGAGCA GCGGGATGGC CGTAGCAGAC AGGAAGGACC CCGTGCTCCA CGTCTACGAC GCGGCGAGAG CCGCCGAGAG GGAGGCGAAG AGGCGGAGCC AGCCCGGCGG CTCCCTCGCC GTGAGCGGTA TCTGCAGGAA GCCCATACCA ATGTCGGGCG GCGAGCTGGA CCTCGCGAGG CTAGCCGCGG TAGTCGGAGC GGCCCTCGGA GAGGAAGCCG GGGAGTTCGA GAGCCTCAAG CCCCTAGTCT ACGCCCTCGC CGAGCTGGCC CACGAAGCCG AAGAGGCGGA GGAGGCTCTA GGCAAAAGCC CGCTCAGAGT CGCGAGGGTT GTCGTACACT ACCGCTACCT CGAGGCTAGG CGCGGAAAGG ACTTCGAGAA GCTCTCCGAG CTCCTCGGCG AGCACGGAGT AAGGCTCCCC GGGCCCAAGG ACCCGAGCAA CGAAGTCGTG AGAAAGCTGG CAGCCCTCAA GCCGCTCTTC GACTTCCTAG CCCTCGCCGC GAGAAAGCCT AGAGAGGCTA GCCCGCAACC CAAGGTCTAA
|
Protein sequence | MRPGLILLYE AGLALSRHAK RAAEGREKWR CRELGSEKEL ADPKTAGAKL AEWFARRAGL RGDLQGTAEE ALSLIERVDP LSAGAFSGVE EYAGRVREVW AGVSDPWDAP LLSPLRILEK MGYRELLSGS RAELDAEAPR RLLGDPEELK RYVGEAKGSR SWLAARPLPR EFTEILEALK LRSFEEAVRG SDYGEVAEAL LSMLGRAAEF YEALGSCNGV DDTLESVVSS TLALVPSHPL LPSVPLPALA RLHAALASAG EGFVLLGLDV NGIQDFVYAP VRESAASRVL RGRSLLVELV QFTATRIVLE LAGATWSAQL SKEGGAPTFV LPGSFRDKVD ALREILGRWM ARQFKGSLWL TVACSDPAPT ANAANPGGAF VAVSESYSTA VAFSKDSRFS GFAAELAALG NEDVEDLDSL TREPVLKGDG LKLRVERSTR DYAESLAPDK LSDGDLLGGL THISLACGSA ARNLAAVVSL YFFKGGRPCR EAAEAVLGAV FRERGRRLYF VLEGDQPLAV ALAPFTEAGA IHVLVAHTRR ELAKAEENLK AAETLLRVMK GYVKGAAGLA EKIHVEVRVV NAPRDFIPSA GVAEHYRGLP VAFSFAPYYT NPYHPVRAEE EARPIMGGER RVAEKVRLLD LDDMGVIAVS LLDADKMGDV SKSLGAYPAI FSAVSDFLSM GFGVKAYSAV LADAKKREGG RGILLYAAGD DLAVYGEWHD VLRILDIVRR EVLEGLLNPL AASSGMAVAD RKDPVLHVYD AARAAEREAK RRSQPGGSLA VSGICRKPIP MSGGELDLAR LAAVVGAALG EEAGEFESLK PLVYALAELA HEAEEAEEAL GKSPLRVARV VVHYRYLEAR RGKDFEKLSE LLGEHGVRLP GPKDPSNEVV RKLAALKPLF DFLALAARKP REASPQPKV
|
| |
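The gene and protein sequences above can be spot-checked against each other with a minimal translation sketch. It assumes the gene sequence is listed in coding orientation (it begins with ATG and ends with TAA) and uses the standard codon-to-amino-acid assignments, which translation table 11 shares with table 1; the names `CODON_TABLE` and `translate` are hypothetical.

```python
# Build the codon table from the compact NCBI-style layout (base order TCAG).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))


def translate(dna: str) -> str:
    """Translate a coding-strand DNA string, stopping at the first stop codon.

    Spaces (as used to group the sequence into blocks of 10) are ignored.
    """
    dna = dna.replace(" ", "").upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# First three 10-base blocks of the gene sequence, copied from the record.
first_30 = "ATGCGTCCAG GCCTGATACT GCTCTACGAG"
n_terminus = translate(first_30)  # "MRPGLILLYE", the first protein block
```

Applying the same function to the final nine bases of the gene sequence (AAGGTCTAA) yields "KV", which agrees with the last two residues of the protein sequence followed by the stop codon.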