Gene NATL1_20631 details


Gene Information

Locus tag: NATL1_20631
Symbol: thiC
ID: 4780043
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Prochlorococcus marinus str. NATL1A
Kingdom: Bacteria
Replicon accession: NC_008819
Strand:
Start bp: 1708157
End bp: 1709557
Gene Length: 1401 bp
Protein Length: 466 aa
Translation table: 11
GC content: 39%
IMG OID: 640085359
Product: thiamine biosynthesis protein ThiC
Protein accession: YP_001015883
Protein GI: 124026768
COG category: [H] Coenzyme transport and metabolism
COG ID: [COG0422] Thiamine biosynthesis protein ThiC
TIGRFAM ID: [TIGR00190] thiamine biosynthesis protein ThiC
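
The coordinate and length fields above can be cross-checked against each other. The following is a minimal sketch, assuming the start and end positions are 1-based and inclusive and that the protein length excludes the terminal stop codon. The function names are illustrative and not part of any database API.

```python
# Values copied from the Gene Information fields of this record.
start_bp = 1708157
end_bp = 1709557
gene_length_bp = 1401
protein_length_aa = 466


def span_length(start, end):
    """Feature length, assuming 1-based inclusive coordinates."""
    return end - start + 1


def expected_protein_length(cds_length_bp):
    """Residues encoded by an unspliced CDS, not counting the stop codon."""
    if cds_length_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return cds_length_bp // 3 - 1


computed_gene_length = span_length(start_bp, end_bp)
computed_protein_length = expected_protein_length(computed_gene_length)

print(computed_gene_length == gene_length_bp)
print(computed_protein_length == protein_length_aa)
```

Under those assumptions, both comparisons agree with the values listed in the record.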


Plasmid Coverage information

Num covering plasmid clones:
Plasmid unclonability p-value: 0.017014
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 19
Fosmid unclonability p-value:
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGCGGAATT CATGGGTGGC TTCGAGAAAA GGTAAAACCA ATGTTTCTCA GATGCATTTT 
GCTCGCAAAG GCGAAATTAC TGAAGAAATG AGGTATGTGG CAAAGCGTGA GAATCTTCCT
GAGTCTCTGG TTATGGAAGA AGTCGCGCGC GGTCGAATGA TTATTCCTGC AAATATTAAC
CATATAAACT TAGAGCCGAT GGCAATAGGT ATTGCCTCAA CATGTAAAGT CAATGCAAAT
ATTGGTGCTT CACCAAATGC AAGCGATATC AGTGAAGAAT TAAAGAAGCT TGATCTAGCA
GTAAAATATG GGGCTGATAC TCTTATGGAT CTTTCTACTG GAGGGGTTAA TTTAGATGAG
GTGCGAACTG AAATTATCAA TGCCTCCCCT ATCCCGATAG GGACAGTTCC TGTTTATCAA
GCTTTAGAGA GTGTTCACGG TTCTATTTCA AGGTTAAATG AGGATGATTT TTTACACATT
ATAGAAAAGC ATTGTCAGCA GGGAGTTGAT TATCAAACCA TTCATGCAGG CTTATTGATT
GAACATTTAC CCAAGGTTAA AGGTCGTATT ACTGGAATAG TTAGTCGTGG CGGAGGAATT
CTTGCCCAAT GGATGCTTTA TCACTACAAA CAAAATCCTT TGTTTACTCG TTTTGATGAT
ATTTGTGAAA TTTTTAAACG CTATGACTGC ACCTTTTCTT TAGGTGACTC TCTAAGGCCT
GGATGTCTGC ATGATGCATC AGATGAAGCT CAACTCGCTG AGTTGAAAAC TCTAGGTGAA
TTGACTAGAC GTGCTTGGAA GCATGATGTT CAAGTCATGG TTGAAGGGCC TGGTCATGTA
CCTATGGATC AAATCGAATT CAATGTTAGG AAGCAAATGG AGGAGTGTTC AGAAGCTCCC
TTTTATGTTC TAGGTCCATT GGTAACAGAT ATTTCTCCTG GTTATGATCA CATTTCAAGT
GCTATTGGTG CAGCCATGGC AGGTTGGTAC GGGACTGCGA TGCTTTGTTA TGTAACACCT
AAGGAACATC TTGGGTTGCC TAATCCTGAG GATGTTAGAG AAGGTTTAAT TGCTTATAAA
ATTGCTGCTC ATGCCGCAGA TGTCGCAAGA CATAGATCAG GAGCACGTGA TCGTGATGAT
GAATTAAGTA AGGCTCGTAA AGAATTTGAC TGGAACAAAC AGTTTGAATT GTCCTTAGAT
CCAGAAAAAG CTAAGCAATA TCATGACGAA ACTTTACCTG AAGAAATTTT CAAGAAAGCA
GAGTTTTGTT CAATGTGTGG TCCTAATCAT TGTCCAATGA ATACAAAAAT CACAGATGAA
GATCTTGATC AATTAAACGA TCAAATACAG TCAAAAGGTG CAGCTGAATT AACTCCAGTA
AAGTTAAACA AAGAAAACTA G
 
Protein sequence
MRNSWVASRK GKTNVSQMHF ARKGEITEEM RYVAKRENLP ESLVMEEVAR GRMIIPANIN 
HINLEPMAIG IASTCKVNAN IGASPNASDI SEELKKLDLA VKYGADTLMD LSTGGVNLDE
VRTEIINASP IPIGTVPVYQ ALESVHGSIS RLNEDDFLHI IEKHCQQGVD YQTIHAGLLI
EHLPKVKGRI TGIVSRGGGI LAQWMLYHYK QNPLFTRFDD ICEIFKRYDC TFSLGDSLRP
GCLHDASDEA QLAELKTLGE LTRRAWKHDV QVMVEGPGHV PMDQIEFNVR KQMEECSEAP
FYVLGPLVTD ISPGYDHISS AIGAAMAGWY GTAMLCYVTP KEHLGLPNPE DVREGLIAYK
IAAHAADVAR HRSGARDRDD ELSKARKEFD WNKQFELSLD PEKAKQYHDE TLPEEIFKKA
EFCSMCGPNH CPMNTKITDE DLDQLNDQIQ SKGAAELTPV KLNKEN
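
The relationship between the gene sequence and the protein sequence above can be reproduced with a short script. This is a sketch under stated assumptions, not the pipeline that generated this record: it uses the amino acid assignments of translation table 11 (identical to the standard code), ignores the table's alternative start codons because this gene begins with ATG, and stops at the first stop codon. The helper names `clean`, `translate`, and `gc_fraction` are hypothetical.

```python
BASES = "TCAG"
# Amino acids for codons in TCAG order (first, second, third base).
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def clean(seq):
    """Remove the spaces and line breaks used to group the sequence in tens."""
    return "".join(seq.split()).upper()


def translate(seq):
    """Translate a coding sequence, stopping at the first stop codon."""
    seq = clean(seq)
    residues = []
    for pos in range(0, len(seq) - len(seq) % 3, 3):
        aa = CODON_TABLE[seq[pos:pos + 3]]
        if aa == "*":
            break
        residues.append(aa)
    return "".join(residues)


def gc_fraction(seq):
    """Fraction of G and C bases in the sequence."""
    seq = clean(seq)
    return (seq.count("G") + seq.count("C")) / len(seq)


# First three groups of ten from the gene sequence in this record.
first_30 = "ATGCGGAATT CATGGGTGGC TTCGAGAAAA"
print(translate(first_30))
```

Passing the full gene sequence to `translate` and `gc_fraction` allows a comparison with the protein sequence and the GC content field listed in this record.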