Gene Acid345_2753 details


Gene Information

Locus tag: Acid345_2753
Symbol: thiH
ID: 4069444
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Candidatus Koribacter versatilis Ellin345
Kingdom: Bacteria
Replicon accession: NC_008009
Strand:
Start bp: 3260567
End bp: 3261964
Gene Length: 1398 bp
Protein Length: 465 aa
Translation table: 11
GC content: 59%
IMG OID: 637984770
Product: thiamine biosynthesis protein ThiH
Protein accession: YP_591828
Protein GI: 94969780
COG category: [H] Coenzyme transport and metabolism; [R] General function prediction only
COG ID: [COG1060] Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes
TIGRFAM ID: [TIGR02351] thiazole biosynthesis protein ThiH
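
The start and end coordinates, gene length, and protein length in this record can be cross-checked with simple arithmetic. The following is a minimal Python sketch, assuming the coordinates are 1-based and inclusive and that the gene length counts the stop codon while the protein length does not; the variable names are illustrative and not part of the record.

```python
# Values copied from the Gene Information fields of this record.
start_bp = 3260567
end_bp = 3261964

# Assumption: coordinates are 1-based and inclusive, so both ends count.
gene_length = end_bp - start_bp + 1

# Assumption: the final codon is a stop codon and is not translated.
protein_length = gene_length // 3 - 1

print(gene_length)     # 1398
print(protein_length)  # 465
```

Under these assumptions the computed values agree with the listed Gene Length (1398 bp) and Protein Length (465 aa).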


Plasmid Coverage information

Num covering plasmid clones: 26
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 13
Fosmid unclonability p-value:
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGAACATCA AGTCTGTCGC AGATTTCATC AAGGAGCGCG AGATCGAGCA GGCCCTGAAA 
CTGGCCGCCC GCGCCGACAG GAGCATGGTG CAGGACTGCA TCCAGAAAGC AGTCTCCATG
CAGGCACTCA CATTGCGAGA GGTCGCGGTG CTCATGTGCG CCGACGGCGA TCTGCGTGAA
GAATTGTACG AGGCTGCTCG CTTCGTGAAG AACGAGATCT ACGGTTCGCG GCTGGTGCTT
TTCGCCCCGC TCTATATCTC GAACCTCTGC ACCAATGAAT GCTCCTACTG CGCGTTCCGG
AAAGACAATA AGGAAGTCCG TCGCCGTTGG CTTTCGCAGG AAGAAATCGC GCAGGAGACA
CGCATCCTCA TCAACCAGGG ATACAAGCGG ACCCTGCTGG TCGCCGGCGA GGCGTATCCG
AAGAACGATT TCAATTACGT TCTCGAGTCC ATCGCCACCA TCTATGCGAC CAAGTCCGCC
AAGGGCGAGA TCCGCCGCGT ACACGCGAAT GTCGCGCCGC CTACGGTCGA ACAGTTCTAC
CAGTTGCGGG AAGCCAAGCT CGGCGTCTAT CAGTGCTTCC AGGAGAGCTA TCACCGTCCG
ACCTACGCGG CGGTGCACAA GGCCGGCAAG AAGGCCGATT ACGATTGGCG AGCGGCGGTC
ATGCATCGTG CCATGGCCAG CGGCGTTGGC GACGTCGGCA TGGGCGTGCT CTACGGCCTC
TACGACTGGC GTTGGGAGAC CTTGGCGCTC ATGCAGCACA TTCGCGATCT GGAACGAACC
TTCGGAGCAG GCTCGCACAC CATCAGCGTG CCGCGGATTG AGCCGGCCGT AGGTTCGAAG
CTCGCCACAC GCCCACCAAA TGCCGTGACC GACGACGACT TCCTGAAGAT CATTGCAGTC
CTGCGATTAG CGGTTCCGTA TACCGGGCTG GTGATGTCCA CTCGCGAACC GGCCGAAATT
CGTCGCAAGA GTCTGGAAAT CGGCATCTCG CAGATGTCCG CCGGGAGCCG CACCGATCCC
GGCGGCTATT CCGAGAGCAC CATCGACAAA GATGCCGGGC AGTTCGTGGT TGGTGATCAC
CGTCCACTGG ATGAGATCGT GAAAGAAGTA GCGGAGATGG GATTTATCCC GTCATTCTGC
ACAGCCTGCT ATCGCGTTGG CCGCACCGGG TCGGACTTCA AAAACCTCGC CAGTCATCCC
CACGTAATGT CGCGCAACTG CGAGACCAAC GCCCTCACCA CGCTGCTTGA ATATCTCTGC
GACTATTCCG TCTCCAACAG AGTTGCGGGT GAACACATGA TCGAGATGCA GTTGGAGAAG
ATGGCGCCGA CGCAGCGAGC GTTTGTCACA CAGATGCTAA ATCGAATTAA ACGCGGGGAG
CGAGACGTCT TCGTTTAG
 
Protein sequence
MNIKSVADFI KEREIEQALK LAARADRSMV QDCIQKAVSM QALTLREVAV LMCADGDLRE 
ELYEAARFVK NEIYGSRLVL FAPLYISNLC TNECSYCAFR KDNKEVRRRW LSQEEIAQET
RILINQGYKR TLLVAGEAYP KNDFNYVLES IATIYATKSA KGEIRRVHAN VAPPTVEQFY
QLREAKLGVY QCFQESYHRP TYAAVHKAGK KADYDWRAAV MHRAMASGVG DVGMGVLYGL
YDWRWETLAL MQHIRDLERT FGAGSHTISV PRIEPAVGSK LATRPPNAVT DDDFLKIIAV
LRLAVPYTGL VMSTREPAEI RRKSLEIGIS QMSAGSRTDP GGYSESTIDK DAGQFVVGDH
RPLDEIVKEV AEMGFIPSFC TACYRVGRTG SDFKNLASHP HVMSRNCETN ALTTLLEYLC
DYSVSNRVAG EHMIEMQLEK MAPTQRAFVT QMLNRIKRGE RDVFV
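
The relationship between the gene sequence and the protein sequence above can be illustrated with a short translation routine. This is a sketch only, not the tool that produced the record: it uses the standard codon assignments, which translation table 11 shares (table 11 differs only in the set of permitted start codons), and the helper names `clean`, `gc_content`, and `translate` are hypothetical.

```python
from itertools import product

# Standard codon assignments in NCBI order (bases T, C, A, G).
# Translation table 11 uses the same codon-to-amino-acid mapping.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def clean(seq: str) -> str:
    """Drop the spaces and line breaks that group residues in blocks of 10."""
    return "".join(seq.split()).upper()


def gc_content(dna: str) -> float:
    """Fraction of bases that are G or C."""
    dna = clean(dna)
    return (dna.count("G") + dna.count("C")) / len(dna)


def translate(dna: str) -> str:
    """Translate codon by codon, stopping at the first stop codon."""
    dna = clean(dna)
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First line of the gene sequence in this record.
first_line = "ATGAACATCA AGTCTGTCGC AGATTTCATC AAGGAGCGCG AGATCGAGCA GGCCCTGAAA"
print(translate(first_line))  # MNIKSVADFIKEREIEQALK
print(round(gc_content(first_line), 2))
```

The translated fragment matches the first 20 residues of the protein sequence listed above. Passing the full gene sequence to the same functions would allow the whole protein and the listed GC content to be checked in the same way.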