Gene Information |
Locus tag | OSTLU_119527 |
Symbol | Tho2 |
ID | 5000400 |
Type | CDS |
Is gene spliced | No |
Is pseudo gene | No |
Organism name | Ostreococcus lucimarinus CCE9901 |
Kingdom | Eukaryota |
Replicon accession | NC_009356 |
Strand | + |
Start bp | 489201 |
End bp | 492320 |
Gene Length | 3120 bp |
Protein Length | 1039 aa |
Translation table | |
GC content | 43% |
IMG OID | 640415821 |
Product | transcription & nuclear export THO-TREX component protein Tho2-like protein |
Protein accession | XP_001416161 |
Protein GI | 145342232 |
COG category | |
COG ID | |
TIGRFAM ID | |
|
|
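The coordinate and length fields above can be cross-checked with a little arithmetic. The sketch below assumes the start and end positions are 1-based and inclusive, and that the unspliced CDS ends in a stop codon that is not counted in the protein length. Neither convention is stated in the record, and the function names are illustrative only.

```python
# Values copied from the Gene Information table.
start_bp = 489201
end_bp = 492320
gene_length_bp = 3120
protein_length_aa = 1039


def span_length(start: int, end: int) -> int:
    """Feature length for 1-based, inclusive coordinates (an assumption)."""
    return end - start + 1


def expected_protein_length(cds_length_bp: int) -> int:
    """Residue count for an unspliced CDS whose final codon is a stop codon."""
    if cds_length_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return cds_length_bp // 3 - 1


computed_length = span_length(start_bp, end_bp)
computed_protein = expected_protein_length(computed_length)

# Both comparisons print True when the table is internally consistent.
print(computed_length == gene_length_bp)
print(computed_protein == protein_length_aa)
```

Under these assumptions the reported gene length and protein length both agree with the coordinates.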
Plasmid Coverage information |
Num covering plasmid clones | 6 |
Plasmid unclonability p-value | 0.00206134 |
Plasmid hitchhiking | No |
Plasmid clonability | decreased coverage |
| |
Fosmid Coverage information |
Num covering fosmid clones | n/a |
Fosmid unclonability p-value | n/a |
Fosmid Hitchhiker | n/a |
Fosmid clonability | n/a |
| |
Sequence |
Gene sequence | ATGTCGAACA CAGATGATCG CTTTGCATCT CTCATGCAAA TTTCACGCGA TGACCCCAGT AATTCAGATG ATTACTTTTG GGTGCGCTTG AGTGCCGCGG AAAGCTCGGC GCTCAGTGTC GATCCGTTGA CCCTTCAGAA CGAGATACAT GCGGATCATG CGATGGTCAA TGGGCTTTTG CTGAATGCAG ACACAAATTT CATAGAAGCA CAGCGATTAA GCATCGGTTC CTGGTCCAGA AAAGAGTTGA GAACGAACAC GAAGATCACG TACCAACTAC CGAAATCGGA TTTACTTGCA GAAAATCTCG AAGGTTATGC ACGAGCGCTG TCAACACTTA GAATGGATTC GCTCACGACG TCAGGAAGAC GTCACTCGAA ATTGATCGTC GAAAGCCACA CTCTTATGGG CCTTTTCAAG CTAAACCCTC TCCGGGTGTT CGATTTAGTC TTTGACTCAG TCAGAGAACT TGCTGACTCC AAACACGGCG ATGCTTTTGC AACATCGTTG GAATCTAGTA CTCATTTAAG CGATATAATA GGGTTTCGCT TGACAAGGTG CGCCGATTTC AAACGACGAC ACAAAATCGC CCAGTTCGAG CTGAAAATCG CTGCTGAAAA AGAGAACACT GAAATCGGTA GACTGTATCC TTACCTGCTC CCAGTTGACC GATCTCTGTA TGATAAAGCC CGAAGCTTGC AAGGTGTATC GCCCGTAGAC GAGTTATACT CTGATGAGCC AAAGTTTTTG TTTCTGGAAT CGACTGGAAG TAAAGCCGGT TTGTTGCCTA TTCAAGCGTC GGTCGAAAGG ATCATCAGAC TGGGTTGTGG TACTCATGCC CTGCCGAAAC TTCGGGCGAA AATGTGTGAG CAACTTAAAA AGAGAATACA TGACGAGAAG GAAAAACATG ACGAGAACCA GAAATTGGAA TCTATTCTCC GCATCGTCTT CTTTCTTGGA CCAAATGTAT GCTGTGATGC AGCTCTAACG ACTGAACTTC TTAAGTACTT CCAAAGAGTT GTCGCGAATA AAAGAGAGCA TTTGAAATTG ATCGAACCTG CTATTGCAAA GTGTTTGCTA CCAGGATTAC CACTTCTGAG GCCGAATCCT TTAATATGTG AATACATCTG GAAAGTTGTC CATCAACTTC CGCTTGCATC ACGCGGAAAA ATATACAAAC ACGCCATGTT TGTCACGTGT AGCACTCGAC TGACTCAGAT AAAGAGTTCA GCCGATAAAT CAATAATGAG GATCTTACGA AGATTGTCAA ATCAAAATGC AAAAGAGCTC GGCCGTAAAC TTGGTAAGCT CGCACTTTGT CACCCTTTAA CGGTGACGAA AGCAGTACTG CAACAGGTTC AGGCGTATCC AAACATGATT GCACCGATCG TGCAAGCTTT GAAATACTGC ACCGCACTAA CTTTCGATAC TTTGACCTTT TCTGTCCTCG AACATTTTAC CGAGACAAAA TCCAAACTCA AAGAAGATGG CCAAAACTTT TCGCTATGGT TTTCTGCTTT GTGTTCTTTT TCTGGAACAT TGCTCAGACA ATACCATGAC ATTGAACTTA ATGACCTATT GCGGTACATT TTGAAAAAAT TAGACGAAGG TGAAGTGTTG GATTTATTGA TATTAAAGGA TCTTATATCG AAATTGACTG GTATCGAAAT CTTGAAGGAT GTCAGTTCGC TTGAGATTCA AAGAATGGCG GCTGGAGAAA AACTTCGAGG ACTTCACGAG AACTCTACGA GCACTGACTC AAAACGACGG ATGAAAGGAA TTTTGCGTTT ACGCACAGCA 
ATTGAAGGGT CGGACGCTTC GAAGAGTATG ATGCTTCGAT TACTATTAGC AATAGCGAAA TGTCGCCAAC GAATTGTTCT GAAGACGGCA AGTGCTCAAA TGAAATTTTT GGGTCAACTT TTTGATGAAT GTCACGTTGT TCTATTGCAG TATATTTCGT TTCTTCACAC AGCTTACAGT GTAGCAGAGC TCAAGAACTC GCTGCCAGAG ATCCAATCTC TTCTGTGTGA ACACCAACTT GATGCTGCTG TCGTTTTCCA TTTGTATCGT CCAATTCTTC GGCAAACACT ATTTTCATCT CTGGATACAA GCAGCTCCCT TGCTGAAAGT CACAATCAAT ACTGGAAGAA ACTAAGCAGA GATGTTAAGG CAAGTCTCCA TTCTGACGAG CTTTGGGGAT TGGTATCGCC AGAATTCTAC CTCACTTTTT GGTCGCTCAA TCTCGAGGAT GTTTACGTAC CCAAAGAGCT GTATGCAAAA GCGCGCGCGC GGACTGTGAA ATATGGTTCA ACTGCATCGT CAACGTTTGA TCAAGTTACT CAATTGCCTA GTAATGTTGT TCACGACGAA CTGGCCGAGA GTTTAAACGT AGAGCTGCAA CAACTTATTG AACATGTTTC CAGAGGTACA CAGTGTATTC TGAGAAAGAA CTCAACCTGG GTGCAGACTG CGGATAATTT GTCTCCTGCG AGTGTCATTT TGGAAACATG TGTACTACCA AGGTGCAAAA TCTCACACGC TGATGCGTTG TATTGCTCAA ATTTTGTGGA GCTTGTTGGT CGAAGTGACA CAGATTCCTT CAGTTTTATT CAGTATTACG ATACACTATT CCGTAACGTA TCATATCTCA TCTATGCTTG CAGTGAATTT GAAAACAAAT TTTTTGCTAC TTTCTTGACG AGTGGTATGG CTCAGCTCAC ACGTTGGAGA GCGTGTGATG TATATAAGAG AGAGTGCTGG CAGCGCTCTG AATTTCGTAA TTGCTTGAAT CAGAGCAGTG TCGAAGCAAG TTTCAAAGAC TACTTGAGGG TTCTGTACAA ATGGGAGCAT AGGTTGACGA AAGGCATCCT GCGTAATCTA CAGAACGAAG ACTACATGGA AGTCGCCAAT TCGTTGAGCT CTCTAATAGC AATCGTTGAT ACATTTCCTG TCACTGAACT TTTAGGCAAC TATATCTACA GCCAAGTCAA AAGAATCAGT ATGAGGGACA AGAGGGACGA CATCAAAACA ATATCGAACA GATACCTATC ACTTCTGAAT TTAAAACGGG CAAGTTGGAA ATGTGTTCCA CCCGAGGACG ACAGTGCAAG TCAGAGTTGA
|
Protein sequence | MSNTDDRFAS LMQISRDDPS NSDDYFWVRL SAAESSALSV DPLTLQNEIH ADHAMVNGLL LNADTNFIEA QRLSIGSWSR KELRTNTKIT YQLPKSDLLA ENLEGYARAL STLRMDSLTT SGRRHSKLIV ESHTLMGLFK LNPLRVFDLV FDSVRELADS KHGDAFATSL ESSTHLSDII GFRLTRCADF KRRHKIAQFE LKIAAEKENT EIGRLYPYLL PVDRSLYDKA RSLQGVSPVD ELYSDEPKFL FLESTGSKAG LLPIQASVER IIRLGCGTHA LPKLRAKMCE QLKKRIHDEK EKHDENQKLE SILRIVFFLG PNVCCDAALT TELLKYFQRV VANKREHLKL IEPAIAKCLL PGLPLLRPNP LICEYIWKVV HQLPLASRGK IYKHAMFVTC STRLTQIKSS ADKSIMRILR RLSNQNAKEL GRKLGKLALC HPLTVTKAVL QQVQAYPNMI APIVQALKYC TALTFDTLTF SVLEHFTETK SKLKEDGQNF SLWFSALCSF SGTLLRQYHD IELNDLLRYI LKKLDEGEVL DLLILKDLIS KLTGIEILKD VSSLEIQRMA AGEKLRGLHE NSTSTDSKRR MKGILRLRTA IEGSDASKSM MLRLLLAIAK CRQRIVLKTA SAQMKFLGQL FDECHVVLLQ YISFLHTAYS VAELKNSLPE IQSLLCEHQL DAAVVFHLYR PILRQTLFSS LDTSSSLAES HNQYWKKLSR DVKASLHSDE LWGLVSPEFY LTFWSLNLED VYVPKELYAK ARARTVKYGS TASSTFDQVT QLPSNVVHDE LAESLNVELQ QLIEHVSRGT QCILRKNSTW VQTADNLSPA SVILETCVLP RCKISHADAL YCSNFVELVG RSDTDSFSFI QYYDTLFRNV SYLIYACSEF ENKFFATFLT SGMAQLTRWR ACDVYKRECW QRSEFRNCLN QSSVEASFKD YLRVLYKWEH RLTKGILRNL QNEDYMEVAN SLSSLIAIVD TFPVTELLGN YIYSQVKRIS MRDKRDDIKT ISNRYLSLLN LKRASWKCVP PEDDSASQS
|
| |
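The gene and protein sequences are printed in space-separated blocks of ten characters. As a minimal sketch of how the two fields relate, the snippet below strips the block spacing and translates the first three blocks of the gene sequence. The record leaves the "Translation table" field empty, so the standard genetic code is assumed here; the helper names `clean` and `translate` are hypothetical.

```python
from itertools import product

# Standard genetic code (assumed, since the record's "Translation table"
# field is empty). Codons are ordered with bases T, C, A, G and the third
# position varying fastest.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def clean(seq: str) -> str:
    """Remove the block spacing and normalise to upper case."""
    return "".join(seq.split()).upper()


def translate(dna: str) -> str:
    """Translate complete codons in frame 1; '*' marks a stop codon."""
    dna = clean(dna)
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First three blocks of the gene sequence and first block of the protein
# sequence, copied from the Sequence table.
gene_prefix = "ATGTCGAACA CAGATGATCG CTTTGCATCT"
protein_prefix = "MSNTDDRFAS"

translated_prefix = translate(gene_prefix)
print(translated_prefix == clean(protein_prefix))
```

The same two helpers could be applied to the full sequence fields after joining the wrapped lines of the gene sequence.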