Gene EcSMS35_1940 details


Gene Information

Locus tag: EcSMS35_1940
Symbol: dhaR
ID: 6145166
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Escherichia coli SMS-3-5
Kingdom: Bacteria
Replicon accession: NC_010498
Strand:
Start bp: 1960779
End bp: 1962698
Gene Length: 1920 bp
Protein Length: 639 aa
Translation table: 11
GC content: 51%
IMG OID: 641616816
Product: DNA-binding transcriptional regulator DhaR
Protein accession: YP_001743992
Protein GI: 170682289
COG category: [K] Transcription; [Q] Secondary metabolites biosynthesis, transport and catabolism
COG ID: [COG3284] Transcriptional activator of acetoin/glycerol metabolism
TIGRFAM ID:
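
The coordinates and lengths listed above can be cross-checked with simple arithmetic. The sketch below assumes the start and end positions are 1-based and inclusive, and that the gene length includes the stop codon; the function names are illustrative and not part of any database API.

```python
# Values copied from the Gene Information table.
START_BP = 1960779
END_BP = 1962698


def gene_length(start_bp: int, end_bp: int) -> int:
    """Length of a feature, assuming 1-based inclusive coordinates."""
    return end_bp - start_bp + 1


def protein_length(gene_len_bp: int) -> int:
    """Residue count, assuming the gene length includes one stop codon."""
    if gene_len_bp % 3 != 0:
        raise ValueError("gene length is not a whole number of codons")
    return gene_len_bp // 3 - 1


if __name__ == "__main__":
    length = gene_length(START_BP, END_BP)
    print(length, "bp")
    print(protein_length(length), "aa")
```

Under these assumptions the results agree with the listed gene length (1920 bp) and protein length (639 aa).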


Plasmid Coverage information

Num covering plasmid clones: 14
Plasmid unclonability p-value: 0.407613
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 38
Fosmid unclonability p-value: 0.0958242
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGAGTGGCG CTTTTAACAA CGATGGTCGG GACATATCTC CCTTAATTGC AACCTCCTGG 
GAGCGATGCA TTAAGCTGAT GAAACGGGAG ACATGGAACG TACCACACCA GGCCCAGGGC
GTGACATTTG CTTCTATTTA TCGGCGTAAG AAAGCGATGC TGACGCTCGG GCAGGCTGCG
CTGGAAGATG CCTGGGAATA TATGGCACCG CGAGAGTGTG CGTTGCTTAT CCTCGATGAA
ACCGCCTGCA TTCTCAGCCG TAATGGCGAT CCGCAAACCT TGCAGCAGCT AAGTGCACTG
GGATTCAATG ACGGCACGTA TTGCGCCGAG GGAATTATTG GTACTTGTGC GCTATCATTA
GCGGCTATCT CTGGTCAGGC CGTGAAAACG ATGGCCGATC AACATTTCAA ACAGGCACTC
TGGAACTGGG CCTTTTGTGC AACGCCGTTG TTTGACAGCA AGGGCCGATT GACGGGAACA
ATAGCGCTGG CGTGTCCGGT TGAACAAACT ACCGCAGCTG ATTTGCCGTT GACGTTGGCA
ATCGCACGCG AGGTCGGAAA TTTACTGCTG ACGGACAGTT TGCTCGCTGA AACTAACCGT
CATTTAAATC AACTTAATGC CCTGTTAGAA AGTATGGATG ATGGCGTGAT TAGCTGGGAC
GAGCAGGGTA ATTTGCAATT TATCAATGCC CAGGCGGCGC GGGTCTTGCG CCTTGACGCG
ACGGCAAGTC AGGGAAGGGC GATCACTGAA CTCTTAACGT TACCCGCCGT ATTGCAACAA
GCAATAAAAC AGGCACATCC GCTCAAACAC GTAGAAGCAA CCTTTGAAAG TCAGCACCAG
TTTATTGATG CGGTGATAAC CCTTAAACCG ATAATAGAAA CGCAGGGAAC CAGCTTTATT
TTGTTGCTCC ATCCTGTGGA ACAGATGCGG CAGTTAATGA CCAGTCAATT AGGAAAAGTC
AGCCATACTT TCGCTCATAT GCCACAGGAC GATCCGCAAA CCCGCCGCTT GATTCATTTT
GGTCGCCAGG CGGCGCGCAG TAGCCTTCCT GTCCTGCTTT GTGGAGAAGA GGGCGTGGGC
AAGGCACTGC TAAGTCAGGC AATTCATAAT GAAAGTGAGC GTGCAGCGGG GCCTTATATC
GCCGTCAATT GTGAGTTATA TGGTGATGCG GCGCTGGCGG AAGAATTTAT TGGCGGCGAT
CGCACGGACA ATGAAAATGG TCGTCTGAGT CGGCTGGAAC TGGCGCACGG CGGCACGCTG
TTTCTTGAAA AGATTGAATA TCTGGCGGTG GAGTTACAGT CTGCTTTGCT TCAGGTTATC
AAGCAGGGGG TTATCACGCG ACTGGATGCG CGGCGTTTAA TACCCATTGA TGTCAAAGTG
ATTGCCACAA CGACCGCGGA CCTCGCAATG CTGGTGGAAC AAAATCGTTT TAGTCGCCAG
CTGTATTACG CGCTGCATGC ATTTGAAATT ACCATCCCGC CACTGCGTAT GCGGCGTGGC
AGCATTCCGG CGCTGGTGAA TAACAAATTA CGCAGCCTTG AAAAACGCTT CTCTACGCGG
CTGAAAATTG ATGACGATGC CCTCGCTCGC CTGGTTTCTT GTGCATGGCC AGGCAACGAT
TTTGAACTTT ACAGCGTCAT CGAGAATCTT GCTCTGAGTA GTGATAACGG GCGCATTCGC
GTCAGTGATT TACCTGAACA TCTGTTTACC GAGCAGGCGA CAGATGATGT CAGCGCCACT
CGCCTTACCA CCAGTCTGTC ATTTGCGGAA GTTGAAAAAG AGGCAATTAT TAACGCAGCC
CAGGTCACAG GCGGTCGCAT TCAGGAAATG TCGGCTTTAC TTGGAATCGG CCGCACTACG
CTGTGGCGGA AAATGAAGCA ACATGGCATT GATGCAGGGC AGTTTAAGCG CCGTGGATAG
 
Protein sequence
MSGAFNNDGR DISPLIATSW ERCIKLMKRE TWNVPHQAQG VTFASIYRRK KAMLTLGQAA 
LEDAWEYMAP RECALLILDE TACILSRNGD PQTLQQLSAL GFNDGTYCAE GIIGTCALSL
AAISGQAVKT MADQHFKQAL WNWAFCATPL FDSKGRLTGT IALACPVEQT TAADLPLTLA
IAREVGNLLL TDSLLAETNR HLNQLNALLE SMDDGVISWD EQGNLQFINA QAARVLRLDA
TASQGRAITE LLTLPAVLQQ AIKQAHPLKH VEATFESQHQ FIDAVITLKP IIETQGTSFI
LLLHPVEQMR QLMTSQLGKV SHTFAHMPQD DPQTRRLIHF GRQAARSSLP VLLCGEEGVG
KALLSQAIHN ESERAAGPYI AVNCELYGDA ALAEEFIGGD RTDNENGRLS RLELAHGGTL
FLEKIEYLAV ELQSALLQVI KQGVITRLDA RRLIPIDVKV IATTTADLAM LVEQNRFSRQ
LYYALHAFEI TIPPLRMRRG SIPALVNNKL RSLEKRFSTR LKIDDDALAR LVSCAWPGND
FELYSVIENL ALSSDNGRIR VSDLPEHLFT EQATDDVSAT RLTTSLSFAE VEKEAIINAA
QVTGGRIQEM SALLGIGRTT LWRKMKQHGI DAGQFKRRG
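
The relationship between the gene sequence and the protein sequence can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used to produce this record. It assumes the sequence is pasted as space-separated blocks of ten characters, as shown above, and relies on the fact that translation table 11 assigns the same amino acids to codons as the standard code (it differs in the permitted start codons). The function name `translate` is hypothetical.

```python
# Codon table built from the compact NCBI-style string, bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: aa
    for (a, b, c), aa in zip(
        ((x, y, z) for x in BASES for y in BASES for z in BASES),
        AMINO_ACIDS,
    )
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1 and stop at the first stop codon.

    Whitespace is removed first, so the 10-character blocks used in the
    record can be pasted directly. A trailing partial codon is ignored.
    """
    seq = "".join(dna.split()).upper()
    residues = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":  # stop codon ends the protein
            break
        residues.append(aa)
    return "".join(residues)


if __name__ == "__main__":
    # Final line of the gene sequence from the record above.
    last_line = (
        "CTGTGGCGGA AAATGAAGCA ACATGGCATT "
        "GATGCAGGGC AGTTTAAGCG CCGTGGATAG"
    )
    print(translate(last_line))
```

Applied to the final line of the gene sequence, which begins in frame because the preceding 1860 bases are a whole number of codons, the sketch yields the last 19 residues of the protein sequence and stops at the terminal TAG codon.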