Gene EcSMS35_2753 details


Gene Information

Locus tag: EcSMS35_2753
Symbol: aroF
ID: 6143986
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Escherichia coli SMS-3-5
Kingdom: Bacteria
Replicon accession: NC_010498
Strand:
Start bp: 2835049
End bp: 2836119
Gene Length: 1071 bp
Protein Length: 356 aa
Translation table: 11
GC content: 52%
IMG OID: 641617623
Product: phospho-2-dehydro-3-deoxyheptonate aldolase
Protein accession: YP_001744784
Protein GI: 170683035
COG category: [E] Amino acid transport and metabolism
COG ID: [COG0722] 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase
TIGRFAM ID: [TIGR00034] phospho-2-dehydro-3-deoxyheptonate aldolase
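
The gene length and protein length in this record can be cross-checked from the start and end coordinates. The sketch below is illustrative only: it assumes the coordinates are 1-based and inclusive, and that the unspliced CDS ends in a single stop codon. The function names `gene_length_bp` and `protein_length_aa` are hypothetical and are not part of any database tool.

```python
def gene_length_bp(start_bp, end_bp):
    """Length of a feature, assuming 1-based inclusive coordinates."""
    return end_bp - start_bp + 1


def protein_length_aa(gene_len_bp):
    """Residue count for an unspliced CDS: number of codons minus the stop codon."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_len_bp // 3 - 1


length = gene_length_bp(2835049, 2836119)
print(length)                     # 1071
print(protein_length_aa(length))  # 356
```

Under those assumptions, both values agree with the Gene Length and Protein Length fields above.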


Plasmid Coverage information

Num covering plasmid clones: 10
Plasmid unclonability p-value: 0.0121436
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 33
Fosmid unclonability p-value: 0.0145258
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGCAAAAAG ACGCGCTGAA TAACGTACAT ATTACCGACG AACAAGTTTT AATGACTCCG 
GAACAACTGA AGGCCGCTTT TCCATTAAGC CTGCAACAGG AAGCCCAGAT TGCTGACTCG
CGTAAAACCA TTTCAGATAT TATCGCCGGG CGCGATCCTC GTCTGTTGGT AGTATGTGGT
CCTTGTTCCA TTCATGATCC GGAAACTGCT CTGGAATATG CTCGTCGATT TAAAGCCCTT
GCCGCAGAGG TCAGCGATAG CCTCTATCTG GTAATGCGCG TCTATTTTGA AAAACCCCGT
ACCACTGTCG GTTGGAAAGG GTTGATTAAC GATCCCCATA TGGATGGCTC TTTTGATGTA
GAAGCCGGGC TGCAGATCGC GCGTAAATTG CTGCTTGAGC TGGTGAATAT GGGACTGCCA
CTGGCGACGG AAGCGTTAGA TCCGAATAGC CCGCAATACC TGGGCGATCT GTTTAGCTGG
TCAGCAATTG GTGCTCGTAC AACGGAATCG CAAACACACC GTGAAATGGC CTCCGGGCTT
TCCATGCCGG TTGGTTTTAA AAACGGCACC GACGGCAGCC TTGCAACGGC TATTAACGCT
ATGCGTGCCG CCGCCCAGCC GCACCGTTTT GTTGGCATTA ACCAGGCAGG GCAGGTGGCG
TTGCTACAAA CTCAGGGGAA TCCGGACGGG CATGTGATCC TGCGCGGTGG TAAAGCGCCG
AACTATAGCC CTGCGGATGT TGCGCAATGT GAAAAAGAGA TGGAACAGGC GGGACTGCGC
CCGTCTCTGA TGGTAGATTG CAGCCACGGT AATTCCAATA AAGATTATCG CCGTCAGCCT
GCGGTGGCAG AATCCGTGGT TGCTCAAATC AAAGATGGCA ATCGCTCAAT TATTGGTCTG
ATGATTGAAA GTAATATCCA CGAGGGCAAT CAGTCTTCCG AGCAACCGCG CAGTGAAATG
AAATACGGTG TATCCGTCAC CGATGCCTGC ATTAGTTGGG AAATGACTGA TGCCTTGCTG
CGTGAAATTC ATCAGGATCT GAACGGGCAG CTGACGGCTC GCGTGGCTTA A
 
Protein sequence
MQKDALNNVH ITDEQVLMTP EQLKAAFPLS LQQEAQIADS RKTISDIIAG RDPRLLVVCG 
PCSIHDPETA LEYARRFKAL AAEVSDSLYL VMRVYFEKPR TTVGWKGLIN DPHMDGSFDV
EAGLQIARKL LLELVNMGLP LATEALDPNS PQYLGDLFSW SAIGARTTES QTHREMASGL
SMPVGFKNGT DGSLATAINA MRAAAQPHRF VGINQAGQVA LLQTQGNPDG HVILRGGKAP
NYSPADVAQC EKEMEQAGLR PSLMVDCSHG NSNKDYRRQP AVAESVVAQI KDGNRSIIGL
MIESNIHEGN QSSEQPRSEM KYGVSVTDAC ISWEMTDALL REIHQDLNGQ LTARVA
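
Both sequences are printed in space-separated blocks of 10 residues, so the spacing has to be removed before computing anything from them. The following is a minimal sketch of that cleanup and of a GC-content calculation; the helper names `clean_sequence` and `gc_percent` are hypothetical, and only the first line of the gene sequence is used here as sample input.

```python
def clean_sequence(text):
    """Remove the spaces and line breaks that group residues into blocks of 10."""
    return "".join(text.split()).upper()


def gc_percent(dna):
    """Percentage of G and C bases in a DNA string."""
    if not dna:
        raise ValueError("empty sequence")
    gc = sum(1 for base in dna if base in "GC")
    return 100.0 * gc / len(dna)


# First line of the gene sequence from this record, as sample input.
first_line = "ATGCAAAAAG ACGCGCTGAA TAACGTACAT ATTACCGACG AACAAGTTTT AATGACTCCG"
dna = clean_sequence(first_line)
print(len(dna))  # 60
print(gc_percent(dna))
```

Applying the same two functions to the full gene sequence should give a value comparable to the GC content field of this record, assuming that field is computed over the whole CDS.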