Gene P9301_18041 details

Gene Information

Locus tag: P9301_18041
Symbol: thiC
ID: 4911553
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Prochlorococcus marinus str. MIT 9301
Kingdom: Bacteria
Replicon accession: NC_009091
Strand:
Start bp: 1526173
End bp: 1527543
Gene Length: 1371 bp
Protein Length: 456 aa
Translation table: 11
GC content: 38%
IMG OID: 640161408
Product: thiamine biosynthesis protein ThiC
Protein accession: YP_001092028
Protein GI: 126697142
COG category: [H] Coenzyme transport and metabolism
COG ID: [COG0422] Thiamine biosynthesis protein ThiC
TIGRFAM ID: [TIGR00190] thiamine biosynthesis protein ThiC
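
The two length fields in this record follow from the coordinates. The sketch below assumes that Start bp and End bp are 1-based and inclusive, and that the CDS ends in a single stop codon that is not counted in the protein length; both assumptions are consistent with the values listed. The function names are illustrative and are not part of any database API.

```python
def gene_length(start_bp: int, end_bp: int) -> int:
    """Length in bp of a feature with 1-based, inclusive coordinates."""
    return end_bp - start_bp + 1


def protein_length(gene_len_bp: int) -> int:
    """Residues encoded by an unspliced CDS that ends in one stop codon."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_len_bp // 3 - 1  # the stop codon encodes no residue


length = gene_length(1526173, 1527543)
print(length, protein_length(length))  # 1371 456
```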


Plasmid Coverage information

Num covering plasmid clones:
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: n/a
Fosmid unclonability p-value: n/a
Fosmid Hitchhiker: n/a
Fosmid clonability: n/a
 

Sequence

Gene sequence
ATGAGAAGTT CTTGGATTAA GCCTCGCCTT GGGAAAGACA ATGTAACTCA GATGAACTTT 
GCGAGAAACG GATATATCAC CGAAGAAATG GATTTTGTTG CTAAAAAAGA GAATCTACCT
CCTTCTTTAA TAATGGAGGA AGTGGCAAGA GGAAGATTAA TTATTCCAGC TAATATTAAT
CATTTGAATC TTGAGCCAAT GTCTATAGGG GTTGCTTCTC GATGCAAAGT TAATGCCAAT
ATTGGTGCTT CTCCCAATGC AAGTGATATA AATGAAGAAG TAGAAAAGCT TAAACTAGCT
GTAAAATATG GTGCTGATAC GGTTATGGAT CTTTCTACGG GAGGAGTAAA TTTAGATGAA
GTGCGGCAAG CAATTATTCA AGAATCTCCA GTTCCCATAG GAACTGTTCC TGTTTATCAA
GCATTAGAAA GTGTACATGG TTCAATCGAT AGACTAACAG AAGACGATTT TCTTCATATT
ATTGAAAAAC ATTGCCAGCA AGGAGTAGAT TATCAAACTA TTCATGCTGG TCTATTAATA
GAGCATTTAC CAAAAGTTAA AGGAAGAATC ACTGGAATTG TCAGTAGAGG GGGAGGTATT
TTAGCTCAAT GGATGCTACA TCATTTTAAG CAAAATCCCC TCTATACAAG GTTTGATGAT
ATCTGTGAGA TTTTTAAGAA ATATGATTGT ACTTTTTCTC TAGGAGATTC ACTAAGGCCT
GGATGTTTGC ATGATGCTTC TGATGATGCT CAACTAGCTG AATTGAAGAC CTTAGGTGAG
CTTACTCGAA GAGCATGGGA ACATAATGTT CAAGTAATGG TTGAGGGCCC TGGTCATGTA
CCTATGGACC AAATTGAGTT TAATGTGAGA AAGCAAATGG AAGAATGTTC AGAAGCTCCT
TTCTATGTAC TTGGTCCATT AGTTACAGAT ATTTCTCCTG GTTATGACCA TATTTCAAGT
GCTATTGGGG CGGCTATGGC AGGATGGTAT GGAACGTCTA TGTTATGTTA CGTAACCCCA
AAAGAACATC TAGGCCTCCC AAATGCAGAA GATGTACGAG AAGGATTAAT TGCTTATAAA
ATAGCCGCTC ACGCTGCTGA TATAGCAAGA CATAGAGCTG GTGCTCGTGA TAGAGATGAT
GAACTTAGTC ATGCAAGGTA TAACTTTGAT TGGAATAAAC AATTCGAACT TTCTTTAGAT
CCAGAGAGGG CAAAGCAGTA CCATGATGAA ACACTACCTG AAGAAATCTT TAAAAAGGCT
GAGTTTTGTT CAATGTGTGG TCCAAAACAT TGTCCAATGA ATTCAAAGAT TTCAGATGAA
TCTCTTGATC AGTTAAAAGA TAAACTTGAA GAATGTAGTA CTTCAGCTTA G
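
The gene sequence is printed in space-separated blocks of 10 bases, 60 per line, so the whitespace has to be removed before any computation. As a minimal sketch, a GC percentage like the one reported in this record could be recomputed as follows. The helper names are hypothetical, and how the database rounds the reported figure is not stated, so the result is left unrounded here.

```python
def clean_sequence(text: str) -> str:
    """Remove spaces and line breaks from a block-formatted sequence."""
    return "".join(text.split()).upper()


def gc_percent(sequence: str) -> float:
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = clean_sequence(sequence)
    if not seq:
        raise ValueError("empty sequence")
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


# First line of the gene sequence above, pasted with its block spacing.
first_line = "ATGAGAAGTT CTTGGATTAA GCCTCGCCTT GGGAAAGACA ATGTAACTCA GATGAACTTT"
print(len(clean_sequence(first_line)))  # 60
print(gc_percent(first_line))
```

Passing the full gene sequence, with line breaks included, works the same way because `str.split()` discards all whitespace.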
 
Protein sequence
MRSSWIKPRL GKDNVTQMNF ARNGYITEEM DFVAKKENLP PSLIMEEVAR GRLIIPANIN 
HLNLEPMSIG VASRCKVNAN IGASPNASDI NEEVEKLKLA VKYGADTVMD LSTGGVNLDE
VRQAIIQESP VPIGTVPVYQ ALESVHGSID RLTEDDFLHI IEKHCQQGVD YQTIHAGLLI
EHLPKVKGRI TGIVSRGGGI LAQWMLHHFK QNPLYTRFDD ICEIFKKYDC TFSLGDSLRP
GCLHDASDDA QLAELKTLGE LTRRAWEHNV QVMVEGPGHV PMDQIEFNVR KQMEECSEAP
FYVLGPLVTD ISPGYDHISS AIGAAMAGWY GTSMLCYVTP KEHLGLPNAE DVREGLIAYK
IAAHAADIAR HRAGARDRDD ELSHARYNFD WNKQFELSLD PERAKQYHDE TLPEEIFKKA
EFCSMCGPKH CPMNSKISDE SLDQLKDKLE ECSTSA
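
The protein sequence can be checked against the gene sequence by translating it with the translation table named in the record (table 11). The sketch below hard-codes the table 11 amino-acid assignments in the conventional T, C, A, G codon order. It is a simplified illustration: it does not treat alternative start codons specially (this gene starts with ATG, so that does not matter here), and it raises `KeyError` on ambiguous bases.

```python
from itertools import product

# Amino acids for NCBI translation table 11, with codons ordered T, C, A, G
# at each position (TTT, TTC, TTA, TTG, TCT, ...). "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA sequence in frame 1, stopping at the first stop codon."""
    seq = "".join(dna.split()).upper()  # drop block spacing and line breaks
    protein = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 30 bases and last 21 bases of the gene sequence above.
print(translate("ATGAGAAGTT CTTGGATTAA GCCTCGCCTT"))  # MRSSWIKPRL
print(translate("GAATGTAGTA CTTCAGCTTA G"))           # ECSTSA
```

Both fragments reproduce the corresponding ends of the protein sequence listed above, with the final TAG read as the stop codon.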