Gene OSTLU_119527 details

Gene Information

Locus tag: OSTLU_119527
Symbol: Tho2
ID: 5000400
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Ostreococcus lucimarinus CCE9901
Kingdom: Eukaryota
Replicon accession: NC_009356
Strand:
Start bp: 489201
End bp: 492320
Gene Length: 3120 bp
Protein Length: 1039 aa
Translation table:
GC content: 43%
IMG OID: 640415821
Product: transcription & nuclear export THO-TREX component protein Tho2-like protein
Protein accession: XP_001416161
Protein GI: 145342232
COG category:
COG ID:
TIGRFAM ID:
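
The coordinates and lengths in this record can be cross-checked with simple arithmetic. The sketch below is illustrative only: it assumes the start and end positions are 1-based and inclusive, and that the unspliced CDS ends in a single stop codon that is not counted in the protein length. The function names are hypothetical and are not part of any database API.

```python
def gene_length(start_bp: int, end_bp: int) -> int:
    """Feature length in bp, assuming 1-based inclusive coordinates."""
    return end_bp - start_bp + 1


def protein_length(gene_len_bp: int) -> int:
    """Residue count for an unspliced CDS, assuming one terminal stop codon."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    # Every codon encodes one residue except the final stop codon.
    return gene_len_bp // 3 - 1


length = gene_length(489201, 492320)
print(length)                  # 3120
print(protein_length(length))  # 1039
```

Under these assumptions, both values agree with the Gene Length and Protein Length fields listed for this record.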


Plasmid Coverage information

Num covering plasmid clones:
Plasmid unclonability p-value: 0.00206134
Plasmid hitchhiking: No
Plasmid clonability: decreased coverage
 

Fosmid Coverage information

Num covering fosmid clones: n/a
Fosmid unclonability p-value: n/a
Fosmid Hitchhiker: n/a
Fosmid clonability: n/a
 

Sequence

Gene sequence
ATGTCGAACA CAGATGATCG CTTTGCATCT CTCATGCAAA TTTCACGCGA TGACCCCAGT 
AATTCAGATG ATTACTTTTG GGTGCGCTTG AGTGCCGCGG AAAGCTCGGC GCTCAGTGTC
GATCCGTTGA CCCTTCAGAA CGAGATACAT GCGGATCATG CGATGGTCAA TGGGCTTTTG
CTGAATGCAG ACACAAATTT CATAGAAGCA CAGCGATTAA GCATCGGTTC CTGGTCCAGA
AAAGAGTTGA GAACGAACAC GAAGATCACG TACCAACTAC CGAAATCGGA TTTACTTGCA
GAAAATCTCG AAGGTTATGC ACGAGCGCTG TCAACACTTA GAATGGATTC GCTCACGACG
TCAGGAAGAC GTCACTCGAA ATTGATCGTC GAAAGCCACA CTCTTATGGG CCTTTTCAAG
CTAAACCCTC TCCGGGTGTT CGATTTAGTC TTTGACTCAG TCAGAGAACT TGCTGACTCC
AAACACGGCG ATGCTTTTGC AACATCGTTG GAATCTAGTA CTCATTTAAG CGATATAATA
GGGTTTCGCT TGACAAGGTG CGCCGATTTC AAACGACGAC ACAAAATCGC CCAGTTCGAG
CTGAAAATCG CTGCTGAAAA AGAGAACACT GAAATCGGTA GACTGTATCC TTACCTGCTC
CCAGTTGACC GATCTCTGTA TGATAAAGCC CGAAGCTTGC AAGGTGTATC GCCCGTAGAC
GAGTTATACT CTGATGAGCC AAAGTTTTTG TTTCTGGAAT CGACTGGAAG TAAAGCCGGT
TTGTTGCCTA TTCAAGCGTC GGTCGAAAGG ATCATCAGAC TGGGTTGTGG TACTCATGCC
CTGCCGAAAC TTCGGGCGAA AATGTGTGAG CAACTTAAAA AGAGAATACA TGACGAGAAG
GAAAAACATG ACGAGAACCA GAAATTGGAA TCTATTCTCC GCATCGTCTT CTTTCTTGGA
CCAAATGTAT GCTGTGATGC AGCTCTAACG ACTGAACTTC TTAAGTACTT CCAAAGAGTT
GTCGCGAATA AAAGAGAGCA TTTGAAATTG ATCGAACCTG CTATTGCAAA GTGTTTGCTA
CCAGGATTAC CACTTCTGAG GCCGAATCCT TTAATATGTG AATACATCTG GAAAGTTGTC
CATCAACTTC CGCTTGCATC ACGCGGAAAA ATATACAAAC ACGCCATGTT TGTCACGTGT
AGCACTCGAC TGACTCAGAT AAAGAGTTCA GCCGATAAAT CAATAATGAG GATCTTACGA
AGATTGTCAA ATCAAAATGC AAAAGAGCTC GGCCGTAAAC TTGGTAAGCT CGCACTTTGT
CACCCTTTAA CGGTGACGAA AGCAGTACTG CAACAGGTTC AGGCGTATCC AAACATGATT
GCACCGATCG TGCAAGCTTT GAAATACTGC ACCGCACTAA CTTTCGATAC TTTGACCTTT
TCTGTCCTCG AACATTTTAC CGAGACAAAA TCCAAACTCA AAGAAGATGG CCAAAACTTT
TCGCTATGGT TTTCTGCTTT GTGTTCTTTT TCTGGAACAT TGCTCAGACA ATACCATGAC
ATTGAACTTA ATGACCTATT GCGGTACATT TTGAAAAAAT TAGACGAAGG TGAAGTGTTG
GATTTATTGA TATTAAAGGA TCTTATATCG AAATTGACTG GTATCGAAAT CTTGAAGGAT
GTCAGTTCGC TTGAGATTCA AAGAATGGCG GCTGGAGAAA AACTTCGAGG ACTTCACGAG
AACTCTACGA GCACTGACTC AAAACGACGG ATGAAAGGAA TTTTGCGTTT ACGCACAGCA
ATTGAAGGGT CGGACGCTTC GAAGAGTATG ATGCTTCGAT TACTATTAGC AATAGCGAAA
TGTCGCCAAC GAATTGTTCT GAAGACGGCA AGTGCTCAAA TGAAATTTTT GGGTCAACTT
TTTGATGAAT GTCACGTTGT TCTATTGCAG TATATTTCGT TTCTTCACAC AGCTTACAGT
GTAGCAGAGC TCAAGAACTC GCTGCCAGAG ATCCAATCTC TTCTGTGTGA ACACCAACTT
GATGCTGCTG TCGTTTTCCA TTTGTATCGT CCAATTCTTC GGCAAACACT ATTTTCATCT
CTGGATACAA GCAGCTCCCT TGCTGAAAGT CACAATCAAT ACTGGAAGAA ACTAAGCAGA
GATGTTAAGG CAAGTCTCCA TTCTGACGAG CTTTGGGGAT TGGTATCGCC AGAATTCTAC
CTCACTTTTT GGTCGCTCAA TCTCGAGGAT GTTTACGTAC CCAAAGAGCT GTATGCAAAA
GCGCGCGCGC GGACTGTGAA ATATGGTTCA ACTGCATCGT CAACGTTTGA TCAAGTTACT
CAATTGCCTA GTAATGTTGT TCACGACGAA CTGGCCGAGA GTTTAAACGT AGAGCTGCAA
CAACTTATTG AACATGTTTC CAGAGGTACA CAGTGTATTC TGAGAAAGAA CTCAACCTGG
GTGCAGACTG CGGATAATTT GTCTCCTGCG AGTGTCATTT TGGAAACATG TGTACTACCA
AGGTGCAAAA TCTCACACGC TGATGCGTTG TATTGCTCAA ATTTTGTGGA GCTTGTTGGT
CGAAGTGACA CAGATTCCTT CAGTTTTATT CAGTATTACG ATACACTATT CCGTAACGTA
TCATATCTCA TCTATGCTTG CAGTGAATTT GAAAACAAAT TTTTTGCTAC TTTCTTGACG
AGTGGTATGG CTCAGCTCAC ACGTTGGAGA GCGTGTGATG TATATAAGAG AGAGTGCTGG
CAGCGCTCTG AATTTCGTAA TTGCTTGAAT CAGAGCAGTG TCGAAGCAAG TTTCAAAGAC
TACTTGAGGG TTCTGTACAA ATGGGAGCAT AGGTTGACGA AAGGCATCCT GCGTAATCTA
CAGAACGAAG ACTACATGGA AGTCGCCAAT TCGTTGAGCT CTCTAATAGC AATCGTTGAT
ACATTTCCTG TCACTGAACT TTTAGGCAAC TATATCTACA GCCAAGTCAA AAGAATCAGT
ATGAGGGACA AGAGGGACGA CATCAAAACA ATATCGAACA GATACCTATC ACTTCTGAAT
TTAAAACGGG CAAGTTGGAA ATGTGTTCCA CCCGAGGACG ACAGTGCAAG TCAGAGTTGA
 
Protein sequence
MSNTDDRFAS LMQISRDDPS NSDDYFWVRL SAAESSALSV DPLTLQNEIH ADHAMVNGLL 
LNADTNFIEA QRLSIGSWSR KELRTNTKIT YQLPKSDLLA ENLEGYARAL STLRMDSLTT
SGRRHSKLIV ESHTLMGLFK LNPLRVFDLV FDSVRELADS KHGDAFATSL ESSTHLSDII
GFRLTRCADF KRRHKIAQFE LKIAAEKENT EIGRLYPYLL PVDRSLYDKA RSLQGVSPVD
ELYSDEPKFL FLESTGSKAG LLPIQASVER IIRLGCGTHA LPKLRAKMCE QLKKRIHDEK
EKHDENQKLE SILRIVFFLG PNVCCDAALT TELLKYFQRV VANKREHLKL IEPAIAKCLL
PGLPLLRPNP LICEYIWKVV HQLPLASRGK IYKHAMFVTC STRLTQIKSS ADKSIMRILR
RLSNQNAKEL GRKLGKLALC HPLTVTKAVL QQVQAYPNMI APIVQALKYC TALTFDTLTF
SVLEHFTETK SKLKEDGQNF SLWFSALCSF SGTLLRQYHD IELNDLLRYI LKKLDEGEVL
DLLILKDLIS KLTGIEILKD VSSLEIQRMA AGEKLRGLHE NSTSTDSKRR MKGILRLRTA
IEGSDASKSM MLRLLLAIAK CRQRIVLKTA SAQMKFLGQL FDECHVVLLQ YISFLHTAYS
VAELKNSLPE IQSLLCEHQL DAAVVFHLYR PILRQTLFSS LDTSSSLAES HNQYWKKLSR
DVKASLHSDE LWGLVSPEFY LTFWSLNLED VYVPKELYAK ARARTVKYGS TASSTFDQVT
QLPSNVVHDE LAESLNVELQ QLIEHVSRGT QCILRKNSTW VQTADNLSPA SVILETCVLP
RCKISHADAL YCSNFVELVG RSDTDSFSFI QYYDTLFRNV SYLIYACSEF ENKFFATFLT
SGMAQLTRWR ACDVYKRECW QRSEFRNCLN QSSVEASFKD YLRVLYKWEH RLTKGILRNL
QNEDYMEVAN SLSSLIAIVD TFPVTELLGN YIYSQVKRIS MRDKRDDIKT ISNRYLSLLN
LKRASWKCVP PEDDSASQS