Gene CPF_0106 details

Gene Information

Locus tag: CPF_0106
Symbol: thiH
ID: 4201863
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Clostridium perfringens ATCC 13124
Kingdom: Bacteria
Replicon accession: NC_008261
Strand:
Start bp: 126740
End bp: 128161
Gene Length: 1422 bp
Protein Length: 473 aa
Translation table: 11
GC content: 32%
IMG OID: 638080987
Product: thiamine biosynthesis protein ThiH
Protein accession: YP_694570
Protein GI: 110799003
COG category: [H] Coenzyme transport and metabolism; [R] General function prediction only
COG ID: [COG1060] Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes
TIGRFAM ID: [TIGR02351] thiazole biosynthesis protein ThiH
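
The coordinate and length fields above can be cross-checked with simple arithmetic. The following is a minimal sketch, assuming that Start bp and End bp are 1-based inclusive positions and that the unspliced CDS ends in a single stop codon. The function names are illustrative and do not come from the source database.

```python
def gene_length_bp(start_bp: int, end_bp: int) -> int:
    """Feature length for 1-based, inclusive coordinates (assumed convention)."""
    return end_bp - start_bp + 1


def protein_length_aa(gene_length: int) -> int:
    """Residue count for an unspliced CDS whose final codon is a stop codon."""
    if gene_length % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_length // 3 - 1  # drop the stop codon


length = gene_length_bp(126740, 128161)
print(length)                     # 1422
print(protein_length_aa(length))  # 473
```

Both results agree with the Gene Length and Protein Length values listed in the record.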


Plasmid Coverage information

Num covering plasmid clones: 20
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: n/a
Fosmid unclonability p-value: n/a
Fosmid Hitchhiker: n/a
Fosmid clonability: n/a
 

Sequence

Gene sequence
ATGTTAAAAG ATAACGAAAA GTACAATGCT TTAGATTTTA TTAAGGATGA TGAAATTAAT 
AGTCTTATAG CAAAAGGAAA AGAGCTTGTT TCTGACAAAG AACTAGTTAG AGAAATCATT
GAAAAAAGTA AAAGTGCAGA AGGGTTAACA CCTGAAGAGA CTGCTGTTCT TCTTAATCTT
GAAGATAAAG AACTTATTGA AGAAATGTTT AAAGCAGCTA GACAAGTTAA GGAAAAATTA
TATGGAAAAA GACTTGTTGT ATTTGCACCT TTATACGTAA GTAACTATTG CGTAAATAAC
TGTACTTACT GTGGATATAA ACACTGTAAT GATGAGTTAA AAAGAAAGAA ACTTAACAAA
GAGCAACTTA TTGAAGAAGT TAAAGTTCTT GAAAGCCTAG GTCATAAGAG AATAGCACTT
GAAGCTGGGG AAGATCCAGT TAATGCTCCT CTAGATTATA TTTTAGATTG TATAAAATCA
ATATACTCAA TTAAATTTGA TAATGGGTCA ATAAGAAGAA TCAATGTTAA TATAGCTGCA
ACATCAGTTG AAGACTATAA GAGACTTAAA GATGCTGAGA TTGGAACATA TATCTTATTC
CAAGAAACTT ACCACAAACC AACTTATGAA AGATTACATG TAAGTGGTCC AAAACATAAC
TATAATTATC ATACAACAGC TATGCATAGA GCAAGAGAAG CTGGTATAGA TGATATTGGT
ATGGGAGTTC TATATGGACT TTATGATTAC AAGTATGAAA CATTAGCAAT GCTTATGCAT
GCAATGGATT TAGAAGAAAC TACAGGGGTT GGCCCTCATA CACTATCAGT TCCAAGAATA
AGACCAGCTG AAAATGTTAG CTTAGAAAAT TATCCTTACT TAGTTGATGA TGAAGACTTC
AAAAAGATTG TTGCTATCTT AAGATTAGCA GTACCATATG CAGGTCTTAT ACTTTCAACA
AGAGAAGAAC CAGGCTTAAG AGATGAAATA ATAGCTCTTG GAGTTTCTCA AGTAAGTACA
GGTTCATGTA CAGGTGTTGG TGGTTATTCA GAATCATACA TCGATCCTGA AGAGAAACCA
CAATTTGAAG TTGGAGATCA TAGATCACCA GTTGAAATGA TAGAAAGTCT TATGGAAGCT
GGATATATAC CAAGTTATTG CACAGCTTGT TATAGAGAAG GTAGAACTGG CGAAAGATTT
ATGGACATCG TTAAGAGTGG TGAACTTTAT AAAATATGTG AAGCCAATGC TTTAATAACT
TTAAAAGAAT TTATTGATGA TTACGGCACA GATAGAACAA GAGAAATCGG AGATAAATTA
ATTAAAAAAT CAATAGATGA AATAGATAAT GAATCATTTA GAAAATCTGT TGAAGAAAAA
ATAAATAAGA TAAGTAACGG AACTAGAGAT TTAAGATTCT AG
 
Protein sequence
MLKDNEKYNA LDFIKDDEIN SLIAKGKELV SDKELVREII EKSKSAEGLT PEETAVLLNL 
EDKELIEEMF KAARQVKEKL YGKRLVVFAP LYVSNYCVNN CTYCGYKHCN DELKRKKLNK
EQLIEEVKVL ESLGHKRIAL EAGEDPVNAP LDYILDCIKS IYSIKFDNGS IRRINVNIAA
TSVEDYKRLK DAEIGTYILF QETYHKPTYE RLHVSGPKHN YNYHTTAMHR AREAGIDDIG
MGVLYGLYDY KYETLAMLMH AMDLEETTGV GPHTLSVPRI RPAENVSLEN YPYLVDDEDF
KKIVAILRLA VPYAGLILST REEPGLRDEI IALGVSQVST GSCTGVGGYS ESYIDPEEKP
QFEVGDHRSP VEMIESLMEA GYIPSYCTAC YREGRTGERF MDIVKSGELY KICEANALIT
LKEFIDDYGT DRTREIGDKL IKKSIDEIDN ESFRKSVEEK INKISNGTRD LRF
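
The sequences above are printed in space-separated blocks of ten characters. The following is a minimal sketch of how such a block-formatted gene sequence could be cleaned, translated, and summarized by GC fraction. It assumes the standard codon assignments, which translation table 11 shares with table 1 for elongation codons. The helper names (`clean`, `translate`, `gc_fraction`) are hypothetical, and only the first line of the gene sequence is used as input.

```python
BASES = "TCAG"
# Amino acids in NCBI order for codons enumerated over T, C, A, G.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def clean(block_text: str) -> str:
    """Remove spaces and line breaks from a block-formatted sequence."""
    return "".join(block_text.split()).upper()


def translate(dna: str) -> str:
    """Translate codon by codon, stopping at the first stop codon."""
    protein = []
    for pos in range(0, len(dna) - len(dna) % 3, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def gc_fraction(dna: str) -> float:
    """Fraction of bases that are G or C."""
    return (dna.count("G") + dna.count("C")) / len(dna)


first_line = "ATGTTAAAAG ATAACGAAAA GTACAATGCT TTAGATTTTA TTAAGGATGA TGAAATTAAT"
dna = clean(first_line)
print(translate(dna))  # MLKDNEKYNALDFIKDDEIN
```

The printed peptide matches the first twenty residues of the protein sequence listed above. Running `gc_fraction` on the full cleaned gene sequence is one way to compare against the GC content field in the record.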