Gene EcSMS35_4476 details


Gene Information

Locus tag: EcSMS35_4476
Symbol:
ID: 6145425
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Escherichia coli SMS-3-5
Kingdom: Bacteria
Replicon accession: NC_010498
Strand:
Start bp: 4573972
End bp: 4575201
Gene Length: 1230 bp
Protein Length: 409 aa
Translation table: 11
GC content: 54%
IMG OID: 641619292
Product: L-sorbose 1-phosphate reductase
Protein accession: YP_001746404
Protein GI: 170682637
COG category: [E] Amino acid transport and metabolism; [R] General function prediction only
COG ID: [COG1063] Threonine dehydrogenase and related Zn-dependent dehydrogenases
TIGRFAM ID:
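
The Start bp, End bp, Gene Length and Protein Length fields are related by simple arithmetic, and a short check can confirm that a record is internally consistent. The sketch below assumes 1-based, inclusive coordinates and an unspliced CDS whose final codon is a stop codon. The function names are illustrative and are not part of any database API.

```python
# Values copied from the Gene Information fields of this record.
START_BP = 4573972
END_BP = 4575201


def gene_length(start_bp, end_bp):
    """Feature length in bp, assuming 1-based inclusive coordinates."""
    return end_bp - start_bp + 1


def protein_length(gene_len_bp):
    """Residue count for an unspliced CDS: number of codons minus the stop codon."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_len_bp // 3 - 1


length_bp = gene_length(START_BP, END_BP)
print(length_bp, protein_length(length_bp))  # prints: 1230 409
```

Both results agree with the Gene Length (1230 bp) and Protein Length (409 aa) fields.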


Plasmid Coverage information

Num covering plasmid clones: 22
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 42
Fosmid unclonability p-value: 0.240013
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGAAAACAA CAGCTCTGCG TCTTTATGGC AAACGTGACC TGCGTCTGGA AACCTTTGAC 
CTTCCTGAAA TGCAGGAGGA TGAAATCCTC GCGACGGTGG TCACTGACAG CCTGTGCCTC
TCTTCCTGGA AAGAGGCCAA TCTGGGGGAA AACCATAAAA AAGTACCCGA CGATGTGGCG
ACCAATCCCA TCATCATCGG CCACGAATTT TGCGGCGATA TTCTGGCCGT GGGTAAAAAG
TGGCAGCACA AATTCCAGCC GGGTCAGCGT TATGTGATTC AGGCCAACCT GCAACTCCCC
GACCGCCCGG ACTGCCCCGG CTACTCCTTC CCGTGGGTAG GCGGCGAAGC GACGCATGTG
GTTATTCCCA ACGAGGTCAT GGAACAAGAT TGCCTGCTGG CATACGACGG CGAAACCTAT
TTTGAAGGCT CGCTGGTTGA ACCGCTTTCC TGCGTGATTG GCGCGTTCAA CGCCAACTAT
CATCTTCAGG AAGGTAGTTA TAACCACACG ATGGGGATTC GCCCGCAAGG GCGCACGCTG
ATCCTCGGCG GCACCGGACC AATGGGACTG TTGGCGATTG ATTATGCGCT GCATGGACCC
GTTAACCCGT CGCTGCTCGT TATTACCGAT ACCGACAACG ACAAATTGAG TTATGCGCGC
AAGCACTATC CATCAGAACC GCAAACACTG ATTCATTATC TCAATGCCAC CGATGCAGCA
TTTGATACGT TAATGGCGCT GAGTGGCGGT CACGGCTTCG ATGATATTTT CGTCTTTGTG
CCTAATGAAG GACTGGTGAC TCTCGCCTCT TCCTTGCTGG CGACAGATGG TTGCCTGAAT
TTCTTCGCCG GACCGCAGGA TAAACATTTC AGCGCGCCAA TTAATTTCTA CGATGTGCAT
TATGCATTTA CCCACTACGT GGGCACGTCA GGCGGCAATA CCGACGACAT GCGCGCGGCG
GTCAAACTGA TCGAAGAGAA AAAAGTGCAG GCCGCAAAAG TGGTAACACA TATTCTTGGG
CTGAATGCCG CGGGCGAAAC CACGCTTGAA TTGCCTGCCG TCGGCGGCGG GAAAAAACTG
GTGTATACCG GGAAATATCT GCCGCTGACA TCACTCACGC AGATTCAGGA TGAAGAACTG
GCGGCGATTC TGGCGCGTCA TCAGGGAGTC TGGTCCGGCG AGGCGGAGCA GTACCTGCTC
GCCCATGCAG AGGCGATTTC CCATGATTAA
 
Protein sequence
MKTTALRLYG KRDLRLETFD LPEMQEDEIL ATVVTDSLCL SSWKEANLGE NHKKVPDDVA 
TNPIIIGHEF CGDILAVGKK WQHKFQPGQR YVIQANLQLP DRPDCPGYSF PWVGGEATHV
VIPNEVMEQD CLLAYDGETY FEGSLVEPLS CVIGAFNANY HLQEGSYNHT MGIRPQGRTL
ILGGTGPMGL LAIDYALHGP VNPSLLVITD TDNDKLSYAR KHYPSEPQTL IHYLNATDAA
FDTLMALSGG HGFDDIFVFV PNEGLVTLAS SLLATDGCLN FFAGPQDKHF SAPINFYDVH
YAFTHYVGTS GGNTDDMRAA VKLIEEKKVQ AAKVVTHILG LNAAGETTLE LPAVGGGKKL
VYTGKYLPLT SLTQIQDEEL AAILARHQGV WSGEAEQYLL AHAEAISHD