Gene CPR_0108 details


Gene Information

Locus tag: CPR_0108
Symbol: thiH
ID: 4206173
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Clostridium perfringens SM101
Kingdom: Bacteria
Replicon accession: NC_008262
Strand:
Start bp: 137986
End bp: 139407
Gene Length: 1422 bp
Protein Length: 473 aa
Translation table: 11
GC content: 31%
IMG OID: 642564663
Product: thiamine biosynthesis protein ThiH
Protein accession: YP_697445
Protein GI: 110803728
COG category: [H] Coenzyme transport and metabolism; [R] General function prediction only
COG ID: [COG1060] Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes
TIGRFAM ID: [TIGR02351] thiazole biosynthesis protein ThiH
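
The coordinate and length fields above are mutually consistent, which can be checked with a little arithmetic. The sketch below assumes 1-based, inclusive coordinates and a complete CDS whose last codon is a stop codon; the function names are illustrative and are not part of the database.

```python
# Values copied from the record above.
START_BP = 137986
END_BP = 139407


def gene_length(start: int, end: int) -> int:
    """Length in bp of a feature, assuming 1-based inclusive coordinates."""
    return end - start + 1


def protein_length(cds_length: int) -> int:
    """Residues encoded by a complete CDS, assuming the final codon is a stop."""
    if cds_length % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return cds_length // 3 - 1


length_bp = gene_length(START_BP, END_BP)
print(length_bp, protein_length(length_bp))  # 1422 473
```

Both results agree with the Gene Length (1422 bp) and Protein Length (473 aa) fields.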


Plasmid Coverage information

Num covering plasmid clones: 25
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: n/a
Fosmid unclonability p-value: n/a
Fosmid Hitchhiker: n/a
Fosmid clonability: n/a
 

Sequence

Gene sequence
ATGTTAAAAG ATAACGAAAA ATACAATGCT TTAGATTTTA TTAAGGATGA TGAAATTAAT 
AGTCTTATAG CAAAAGGAAA AGAACTTGTT TCTGACAAAG AACTAGTTAG AGAAATCATT
GAAAAAAGTA AAAGTGCAGA AGGGTTAACA CCTGAAGAAA CTGCTGTTCT TCTTAATCTT
GAAGATAAAG AACTTATTGA AGAAATGTTT AAAGCAGCTA GACAAGTTAA GGAAAAATTA
TATGGAAAAA GACTTGTTGT ATTTGCACCT TTATACGTAA GTAACTATTG CGTAAATAAC
TGTACTTACT GTGGATATAA ACACTGTAAT GATGAATTAA AAAGAAAGAA ACTTAACAAA
GAGCAACTTA TTGAAGAAGT TAAAGTTCTT GAAAGCCTAG GTCATAAGAG AATAGCACTT
GAAGCTGGAG AAGATCCAGT TAATGCTCCT CTAGATTATA TTTTAGATTG TATAAAATCA
ATATACTCAA TTAAATTTGA TAATGGGTCA ATAAGAAGAA TCAATGTTAA TATAGCTGCA
ACAACAGTTG AAGACTACAA GAGACTTAAA GATGCTGAAA TTGGAACATA TATTTTATTC
CAAGAAACTT ACCATAAACC AACTTATGAA AGATTACATG TAAGTGGTCC AAAACATAAT
TATAATTATC ATACAACAGC TATGCATAGA GCAAGAGAGG CTGGTATAGA TGATATTGGT
ATGGGAGTTT TATATGGACT TTATGATTAC AAGTATGAAA CATTAGCAAT GCTTATGCAC
GCAATGGATT TAGAAGAAAC TACAGGGGTT GGTCCTCATA CACTATCAGT TCCAAGAATA
AGACCAGCTG AAAATGTTAG CTTAGAAAAT TATCCTTACT TAGTTGATGA TGAAGACTTC
AAAAAAATTG TTGCCATCTT AAGATTAGCG GTACCATATG CAGGTCTTAT ACTTTCAACA
AGAGAAGAAC CAGGATTAAG AGATGAAATA ATAGCTCTTG GAGTTTCTCA AGTAAGTACA
GGTTCATGTA CAGGGGTTGG TGGTTATTCA GAATCATACG TAGATCCTGA AGAAAAACCA
CAATTTGAAG TTGAAGATCA TAGATCACCA GTTGAAATGA TAGAAAGCCT TATGGAAGCT
GGATATATAC CAAGTTATTG TACAGCTTGT TATAGAGAAG GTAGAACTGG CGAAAGATTT
ATGGAAATAG TTAAAAGTGG TGAACTTTAT AAAATATGTG AAGCCAATGC TTTAATAACT
TTAAAAGAAT TTATTGATGA TTACGGCACA GATAAAACAA GAGAAATCGG AGATAAATTA
ATTAAAAAAT CAATAGATGA AATAGATAAT GAATCCTTTA GAAAATCTGT TGAAGAAAAA
ATAAATAAGA TAAGTAAGGG AACTAGAGAT TTAAGATTCT AG
 
Protein sequence
MLKDNEKYNA LDFIKDDEIN SLIAKGKELV SDKELVREII EKSKSAEGLT PEETAVLLNL 
EDKELIEEMF KAARQVKEKL YGKRLVVFAP LYVSNYCVNN CTYCGYKHCN DELKRKKLNK
EQLIEEVKVL ESLGHKRIAL EAGEDPVNAP LDYILDCIKS IYSIKFDNGS IRRINVNIAA
TTVEDYKRLK DAEIGTYILF QETYHKPTYE RLHVSGPKHN YNYHTTAMHR AREAGIDDIG
MGVLYGLYDY KYETLAMLMH AMDLEETTGV GPHTLSVPRI RPAENVSLEN YPYLVDDEDF
KKIVAILRLA VPYAGLILST REEPGLRDEI IALGVSQVST GSCTGVGGYS ESYVDPEEKP
QFEVEDHRSP VEMIESLMEA GYIPSYCTAC YREGRTGERF MEIVKSGELY KICEANALIT
LKEFIDDYGT DKTREIGDKL IKKSIDEIDN ESFRKSVEEK INKISKGTRD LRF
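
The protein sequence follows from the gene sequence under translation table 11, and the GC content field can be recomputed from the same string. The following is a minimal sketch, not the database's own pipeline. It relies on table 11 using the same codon-to-amino-acid assignments as the standard code, and it does not handle the alternative start codons that table 11 permits (this gene starts with ATG, so that does not matter here). The helper names `clean`, `translate` and `gc_fraction` are hypothetical.

```python
from itertools import product

# Codon table in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def clean(seq: str) -> str:
    """Remove the spaces and line breaks used for display, and uppercase."""
    return "".join(seq.split()).upper()


def translate(dna: str) -> str:
    """Translate a complete CDS codon by codon, dropping trailing stop codons."""
    dna = clean(dna)
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    protein = "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))
    return protein.rstrip("*")


def gc_fraction(dna: str) -> float:
    """Fraction of bases that are G or C."""
    dna = clean(dna)
    return (dna.count("G") + dna.count("C")) / len(dna)


# First 30 bases of the gene sequence above.
print(translate("ATGTTAAAAG ATAACGAAAA ATACAATGCT"))  # MLKDNEKYNA
```

Passing the full gene sequence to `translate` should reproduce the protein sequence listed above, and `gc_fraction` on the same input can be compared with the GC content field.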