Gene EcSMS35_4638 details

Gene Information

Locus tag: EcSMS35_4638
Symbol:
ID: 6144675
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Escherichia coli SMS-3-5
Kingdom: Bacteria
Replicon accession: NC_010498
Strand:
Start bp: 4738863
End bp: 4740395
Gene Length: 1533 bp
Protein Length: 510 aa
Translation table: 11
GC content: 57%
IMG OID: 641619454
Product: hypothetical protein
Protein accession: YP_001746562
Protein GI: 170680990
COG category: [G] Carbohydrate transport and metabolism
    [S] Function unknown
COG ID: [COG0062] Uncharacterized conserved protein
    [COG0063] Predicted sugar kinase
TIGRFAM ID: [TIGR00196] yjeF C-terminal region, hydroxyethylthiazole kinase-related
    [TIGR00197] yjeF N-terminal region
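The length fields can be cross-checked against the coordinates. The sketch below assumes that Start bp and End bp are 1-based and inclusive (the record does not state its coordinate convention) and that the final codon is a stop codon that is not counted in the protein length. The variable names are illustrative only.

```python
# Coordinates copied from the record above.
start_bp = 4738863
end_bp = 4740395

# Assumption: both coordinates are inclusive, so the span is end - start + 1.
gene_length = end_bp - start_bp + 1

# Assumption: the last codon is a stop codon and is not part of the protein.
codon_count = gene_length // 3
protein_length = codon_count - 1

print(gene_length, "bp")      # 1533 bp
print(protein_length, "aa")   # 510 aa
```

Under these assumptions the computed values agree with the Gene Length and Protein Length fields.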


Plasmid Coverage information

Num covering plasmid clones:
Plasmid unclonability p-value: 0.000291773
Plasmid hitchhiking: Yes
Plasmid clonability: hitchhiker
 

Fosmid Coverage information

Num covering fosmid clones: 34
Fosmid unclonability p-value: 0.0355337
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGAAGAAAA ACCCCGTAAG TATACCACAC ACCGTCTGGC ACGCCGACGA TATCCGCCGC 
GGAGAACGCG AGGCGGTAGA TGCGCTGGGG CTCACACTCT ATGAGCTGAT GCTTCGCGCT
GGCGAAGCCG CATTCCAGGT GTGTCGTTCG GCTTATCCTG ACGCCCGCCA CTGGCTGGTG
CTGTGCGGTC ATGGTAATAA CGGCGGCGAT GGCTACGTGG TCGCGCGACT GGCCAAAGCG
GTCGGTATTG ATGTCACGCT GCTGGCCCAG GAAAGTGACA AACCGTTGCC GGAAGAGGCC
GCGCTGGCAC GCGAAGCATG GTTAAACGCG GGAGGCGAGA TCCATGCTTC GAATATTGTC
TGGCCCGAAT CGGTAGATCT GATTGTTGAT GCGCTGCTCG GTACCGGATT ACAGCAAGCG
CCTCGTGAAT CCATTAGCCA GTTAATCGAC CACGCTAATT CCCATCCTGC GCCGATTGTG
GCGGTTGATA TCCCTTCCGG CCTGCTGGCT GAAACTGGCG CTACGCCAGG CGCGGTGATC
AACGCCGATC ACACCATCAC TTTTATTGCG CTGAAACCAG GCTTGCTCAC TGGAAAAGCG
CGGGATGTTA CAGGACAACT GCATTTTGAC TCACTGGGGC TGGATAGCTG GCTGGCAGGT
CAGGAGACGA AAATTCAGCG GTTTTCGGCA GAACAACTTT CTCACTGGCT GAAACCGCGT
CGCCCGACTT CGCATAAAGG CGATCACGGG CGGCTGGTGA TTATCGGTGG CGATCACGGC
ACGGCGGGGG CTATTCGTAT GACGGGGGAA GCGGCGCTGC GTGCTGGTGC TGGTTTAGTC
CGAGTACTGA CCCGCAGTGA GAACATTGCG CCGCTGCTGA CTGCACGACC AGAATTGATG
GTGCATGAAC TGACTATGGA CTCTCTTACC GAAAGCCTGG AATGGGCCGA TGTGGTGGTG
ATTGGTCCCG GTCTGGGCCA GCAAGAGTGG GGGAAAAAAG CCCTGCAAAA AGTTGAGAAT
TTTCGCAAAC CGATGTTGTG GGATGCCGAT GCATTGAACC TGCTGGCAAT CAATCCCGAT
AAGCGTCACA ATCGCGTGAT CACGCCGCAT CCTGGCGAGG CCGCACGGTT GTTAGGCTGT
TCCGTCGCTG AAATTGAAAG TGACCGCTTA CATTGCGCCC AACGTCTGGT ACAACGTTAT
GGCGGCGTAG CGGTGCTGAA AGGTGCCGGA ACCGTGGTCG CCGCCCATTC TGACGCTTTA
GGCATTATTG ATGTCGGAAA TGCAGGCATG GCGAGCGGCG GCATGGGCGA TGTGCTCTCT
GGTATTATTG GCGCATTGCT TGGGCAAAAA ATGAGCCCTT ATGATGCAGC CTGTGCGGGC
TGTGTCGCGC ACGGTGCGGC AGCTGACGTA CTGGCGGCGC GTTTTGGAAC GCGCGGGATG
CTGGCAACCG ATCTCTTTTC CACGCTACAG CGTATTGTTA ACCCGGAAGT GACTGATAAA
AACCATGATG AATCGAGTAA TTCCGCTCCC TGA
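A GC content figure such as the one in the Gene Information section can be recomputed from the sequence text. The following is a minimal sketch, assuming GC content means the percentage of G and C bases over all bases. The helper name `gc_percent` is hypothetical and not part of the source database; it strips the spaces and line breaks used in the 10-base block layout above.

```python
def gc_percent(seq: str) -> float:
    """Return the percentage of G and C bases, ignoring whitespace."""
    bases = "".join(seq.split()).upper()  # drop spaces and newlines between blocks
    if not bases:
        raise ValueError("empty sequence")
    gc_count = sum(1 for base in bases if base in "GC")
    return 100.0 * gc_count / len(bases)


# Example: the first line of the gene sequence; paste the full sequence
# to compare against the GC content field of the record.
first_line = "ATGAAGAAAA ACCCCGTAAG TATACCACAC ACCGTCTGGC ACGCCGACGA TATCCGCCGC"
print(f"{gc_percent(first_line):.1f}%")
```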
 
Protein sequence
MKKNPVSIPH TVWHADDIRR GEREAVDALG LTLYELMLRA GEAAFQVCRS AYPDARHWLV 
LCGHGNNGGD GYVVARLAKA VGIDVTLLAQ ESDKPLPEEA ALAREAWLNA GGEIHASNIV
WPESVDLIVD ALLGTGLQQA PRESISQLID HANSHPAPIV AVDIPSGLLA ETGATPGAVI
NADHTITFIA LKPGLLTGKA RDVTGQLHFD SLGLDSWLAG QETKIQRFSA EQLSHWLKPR
RPTSHKGDHG RLVIIGGDHG TAGAIRMTGE AALRAGAGLV RVLTRSENIA PLLTARPELM
VHELTMDSLT ESLEWADVVV IGPGLGQQEW GKKALQKVEN FRKPMLWDAD ALNLLAINPD
KRHNRVITPH PGEAARLLGC SVAEIESDRL HCAQRLVQRY GGVAVLKGAG TVVAAHSDAL
GIIDVGNAGM ASGGMGDVLS GIIGALLGQK MSPYDAACAG CVAHGAAADV LAARFGTRGM
LATDLFSTLQ RIVNPEVTDK NHDESSNSAP
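The protein sequence can be reproduced from the gene sequence by codon translation. The sketch below builds the codon table by hand; it assumes that translation table 11 uses the same amino acid assignments as the standard genetic code and differs only in its permitted start codons, which does not matter here because the gene starts with ATG. The names `CODON_TABLE` and `translate` are illustrative, not taken from the source database.

```python
from itertools import product

# Codons in TCAG order, paired with the standard amino acid assignments.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq: str) -> str:
    """Translate DNA to protein, stopping at the first stop codon."""
    dna = "".join(seq.split()).upper()  # drop the spaces between 10-base blocks
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":  # stop codon: end of the coding sequence
            break
        protein.append(aa)
    return "".join(protein)


# First 30 bases of the gene sequence give the first 10 residues.
print(translate("ATGAAGAAAA ACCCCGTAAG TATACCACAC"))  # MKKNPVSIPH
```

Passing the last line of the gene sequence, which ends in the stop codon TGA, yields the final residues NHDESSNSAP, matching the end of the protein sequence.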