Gene EcSMS35_0459 details


Gene Information

Locus tag: EcSMS35_0459
Symbol: thiI
ID: 6146108
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Escherichia coli SMS-3-5
Kingdom: Bacteria
Replicon accession: NC_010498
Strand:
Start bp: 467580
End bp: 469028
Gene Length: 1449 bp
Protein Length: 482 aa
Translation table: 11
GC content: 52%
IMG OID: 641615353
Product: thiamine biosynthesis protein ThiI
Protein accession: YP_001742560
Protein GI: 170679960
COG category: [H] Coenzyme transport and metabolism
COG ID: [COG0301] Thiamine biosynthesis ATP pyrophosphatase
TIGRFAM ID: [TIGR00342] thiazole biosynthesis/tRNA modification protein ThiI
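
The start, end, and length values in this record agree with one another, and the arithmetic can be sketched as follows. This assumes 1-based inclusive coordinates and a gene length that includes the stop codon; both are inferred from the numbers and are not stated in the record. The function names are hypothetical.

```python
def gene_length_bp(start_bp, end_bp):
    """Length of a feature given 1-based inclusive coordinates (assumed convention)."""
    return end_bp - start_bp + 1


def protein_length_aa(gene_length):
    """Number of residues, assuming the CDS length includes one stop codon."""
    if gene_length % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_length // 3 - 1


length = gene_length_bp(467580, 469028)
print(length)                     # 1449
print(protein_length_aa(length))  # 482
```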


Plasmid Coverage information

Num covering plasmid clones: 22
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 48
Fosmid unclonability p-value: 0.882891
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGAAGTTTA TCATTAAATT GTTCCCGGAA ATCACCATCA AAAGCCAATC TGTGCGCTTG 
CGCTTTATAA AAATCCTTAC CGGGAACATT CGTAACGTTT TAAAGCACTA TGATGAGACG
CTCGCTGTCG TCCGCCACTG GGATAACATC GAAGTTCGCG CAAAAGATGA AAACCAGCGT
CTGGCCATTC GCGACGCCTT GACCCGTATT CCGGGTATCC ACCATATTCT CGAAGTCGAA
GACGTGCCGT TTACCGACAT GCACGATATT TTCGAGAAAG CGTTGGTTCA GTATCGCGAT
CAGTTGGAAG GCAAAACCTT CTGCGTGCGC GTGAAGCGCC GTGGCAAACA TGATTTTAGC
TCGATTGATG TGGAGCGTTA CGTCGGCGGC GGTTTAAATC AGCATATTGA ATCCGCGCGC
GTGAAGTTGA CCAATCCGGA TGTGACTGTC CATCTGGAAG TGGAAGACGA TCGTCTTCTG
CTGATTAAAG GCCGCTACGA AGGTATTGGC GGTTTCCCGA TTGGTACCCA GGAAGATGTG
CTGTCGCTCA TTTCTGGTGG TTTCGACTCC GGCGTTTCCA GTTATATGTT GATGCGTCGC
GGATGTCGTG TGCATTACTG CTTCTTTAAC CTTGGCGGCG CGGCGCATGA AATTGGCGTG
CGTCAGGTAG CGCATTATCT GTGGAACCGC TTTGGCAGCT CCCACCGCGT GCGTTTTGTC
GCTATTAATT TCGAACCGGT CGTCGGGGAA ATTCTCGAGA AAATCGACGA CGGTCAGATG
GGCGTTATCC TCAAACGTAT GATGGTGCGT GCCGCGTCCA AAGTGGCAGA ACGTTACGGC
GTTCAGGCGC TGGTTACCGG TGAAGCGCTC GGCCAGGTGT CCAGCCAGAC GCTGACCAAC
CTGCGCCTGA TTGATAACGT CTCCGATACG TTGATCCTGC GTCCGCTGAT CTCTTACGAC
AAAGAGCACA TCATCAACCT GGCTCGCCAG ATTGGCACCG AAGACTTTGC CCGCACGATG
CCGGAATATT GCGGCGTGAT TTCCAAAAGC CCGACGGTGA AAGCGGTTAA ATCGAAGATT
GAAGCGGAAG AAGAGAAGTT TGACTTCAGC ATTCTCGATA AAGTGGTTGA GGAAGCGAAT
AACGTTGATA TCCGCGAAAT CGCCCAGCAG ACCGAGCAGG AAGTGGTGGA AGTGGAAACC
GTCAATGGCT TCGGCCCGAA CGACGTGATC CTCGATATCC GTTCTATCGA TGAACAGGAA
GATAAGCCAC TGAAAGTCGA AGGGATTGAT GTGGTTTCTC TGCCGTTCTA TAAACTGAGC
ACCAAATTTG GCGATCTCGA CCAGAGCAAA ACCTGGCTGT TGTGGTGTGA GCGCGGGGTG
ATGAGCCGCC TGCAGGCGCT CTATCTGCGC GAGCAGGGCT TTAACAATGT GAAGGTGTAT
CGCCCGTAA
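
The record reports a GC content of 52% for this gene. A minimal sketch of how such a figure could be computed from the space-delimited sequence above is given here; the helper name `gc_content` is hypothetical, and the database's exact rounding rule is not stated, so the sketch returns the unrounded fraction.

```python
def gc_content(dna):
    """Fraction of G and C bases in a sequence; whitespace between blocks is ignored."""
    seq = "".join(dna.split()).upper()
    if not seq:
        raise ValueError("empty sequence")
    gc = sum(1 for base in seq if base in "GC")
    return gc / len(seq)


# Example on the first line of the gene sequence only, not the whole gene.
first_line = "ATGAAGTTTA TCATTAAATT GTTCCCGGAA ATCACCATCA AAAGCCAATC TGTGCGCTTG"
print(round(gc_content(first_line) * 100))
```

Passing all 25 sequence lines joined together gives the value for the whole gene.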
 
Protein sequence
MKFIIKLFPE ITIKSQSVRL RFIKILTGNI RNVLKHYDET LAVVRHWDNI EVRAKDENQR 
LAIRDALTRI PGIHHILEVE DVPFTDMHDI FEKALVQYRD QLEGKTFCVR VKRRGKHDFS
SIDVERYVGG GLNQHIESAR VKLTNPDVTV HLEVEDDRLL LIKGRYEGIG GFPIGTQEDV
LSLISGGFDS GVSSYMLMRR GCRVHYCFFN LGGAAHEIGV RQVAHYLWNR FGSSHRVRFV
AINFEPVVGE ILEKIDDGQM GVILKRMMVR AASKVAERYG VQALVTGEAL GQVSSQTLTN
LRLIDNVSDT LILRPLISYD KEHIINLARQ IGTEDFARTM PEYCGVISKS PTVKAVKSKI
EAEEEKFDFS ILDKVVEEAN NVDIREIAQQ TEQEVVEVET VNGFGPNDVI LDIRSIDEQE
DKPLKVEGID VVSLPFYKLS TKFGDLDQSK TWLLWCERGV MSRLQALYLR EQGFNNVKVY
RP
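
The protein sequence corresponds to the gene sequence translated with translation table 11. The sketch below shows one way to reproduce that translation. It builds the codon table from the compact NCBI-style amino acid string (bases ordered T, C, A, G), whose codon assignments are the same for table 11 as for the standard code. As a simplification, it does not recode alternative start codons (GTG, TTG) as methionine, which does not matter for this gene because it starts with ATG. The name `translate_cds` is hypothetical.

```python
BASES = "TCAG"
# Amino acids for all 64 codons, ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate_cds(dna):
    """Translate a coding sequence, stopping at the first stop codon."""
    seq = "".join(dna.split()).upper()  # drop the spaces between 10-base blocks
    protein = []
    for pos in range(0, len(seq) - 2, 3):
        amino_acid = CODON_TABLE[seq[pos:pos + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# First 30 bases of the gene sequence give the first 10 residues of the protein.
print(translate_cds("ATGAAGTTTA TCATTAAATT GTTCCCGGAA"))  # MKFIIKLFPE
```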