Gene EcSMS35_1040 details


Gene Information

Locus tag: EcSMS35_1040
Symbol: hisD
ID: 6142608
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Escherichia coli SMS-3-5
Kingdom: Bacteria
Replicon accession: NC_010498
Strand:
Start bp: 1060699
End bp: 1062003
Gene Length: 1305 bp
Protein Length: 434 aa
Translation table: 11
GC content: 57%
IMG OID: 641615927
Product: histidinol dehydrogenase
Protein accession: YP_001743119
Protein GI: 170683888
COG category: [E] Amino acid transport and metabolism
COG ID: [COG0141] Histidinol dehydrogenase
TIGRFAM ID: [TIGR00069] histidinol dehydrogenase
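
The coordinate and length fields above can be cross-checked with a little arithmetic. The following is a minimal sketch in Python, assuming that the start and end coordinates are 1-based and inclusive and that the gene length includes a single untranslated stop codon; neither convention is stated in the record itself, and the variable names are illustrative only.

```python
# Values copied from the gene information fields above.
start_bp = 1060699
end_bp = 1062003
gene_length_bp = 1305
protein_length_aa = 434

# Assumption: coordinates are 1-based and inclusive, so both ends count.
span = end_bp - start_bp + 1

# Assumption: the coding sequence ends with one stop codon that is not
# part of the protein, so the protein has one residue fewer than codons.
codons = span // 3
expected_protein_length = codons - 1

print("span matches gene length:", span == gene_length_bp)
print("span is a whole number of codons:", span % 3 == 0)
print("protein length matches:", expected_protein_length == protein_length_aa)
```

Under these assumptions all three checks agree with the reported values (1305 bp, 435 codons, 434 aa).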


Plasmid Coverage information

Num covering plasmid clones: 30
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 41
Fosmid unclonability p-value: 0.204161
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGAGCTTTA ACACAATCAT TGACTGGAAT AGCTGTACTG CGGAGCAACA ACGCCAGCTG 
TTAATGCGCC CGGCGATCTC CGCTTCTGAA AGCATTACCC GCACTGTTAA CGATATTCTC
GATAACGTGA AAGCACGCGG TGATGACGCC CTGCGGGAAT ACAGCGCGAA GTTTGATAAA
ACCACGGTTA CCGCACTGAA GGTGTCTGCT GAGGAAATTG CCGCCGCCAG CGAACGCCTG
AGCGACGAGC TAAAACAGGC GATGGCGGTG GCAGTAAAGA ATATTGAAAC CTTCCACACT
GCGCAAAAAC TGCCGCCGGT AGATGTTGAA ACTCAGCCTG GTGTGCGTTG CCAGCAGGTC
ACGCGCCCGG TAGCTTCGGT TGGTTTGTAT ATTCCTGGCG GTTCCGCCCC GCTCTTCTCA
ACGGTATTAA TGCTGGCGAC TCCGGCGCGT ATTGCGGGTT GCAATAAAGT GGTGCTGTGC
TCACCGCCGC CGATTGCCGA TGAAATTCTT TACGCGGCGC AGCTATGCGG TGTGCAGGAC
GTGTTTAACG TCGGCGGCGC ACAGGCCATT GCCGCGCTGG CGTTTGGTAC GGAATCTGTG
CCGAAAGTGG ACAAAATCTT CGGGCCGGGT AACGCCTTTG TCACCGAAGC GAAACGTCAG
GTGAGCCAGC GTCTGGACGG TGCGGCGATC GATATGCCCG CAGGCCCGTC GGAAGTGCTG
GTGATTGCTG ACAGCGGCGC TACGCCGGAT TTCGTGGCTT CTGATTTGCT TTCTCAGGCT
GAACACGGCC CGGATTCACA GGTGATTTTA CTGACGCCTG ACGCTGATAT GGCGCATCAA
GTTGCCGAAG CCGTCGAACG CCAGTTAGCA GAACTGCCGC GTGCCGAAAC CGCACGTCAG
GCACTGAGCG CCAGCCGCCT GATCGTGACC AACGATTTAG CGCAGTGCGT GGCAATCTCC
AACCAGTACG GCCCGGAGCA CCTGATCATT CAGACCCGCA ACGCCCGCGA ACTGGTCGAT
AGCATCACCA GCGCCGGTTC GGTATTTCTT GGTGACTGGT CACCGGAATC GGCAGGTGAT
TACGCCTCCG GCACCAACCA CGTTCTGCCG ACTTACGGTT ACACCGCCAC CTGTTCCAGC
CTCGGACTGG CGGATTTCCA GAAGCGGATG ACCGTGCAGG AACTGTCGAA AGTAGGTTTC
TCCGCGCTGG CTTCGACCAT TGAAACACTG GCCGCCGCCG AGCGCCTGAC CGCCCACAAA
AATGCCGTTA CTTTGCGTGT TAACGCCCTT AAGGAGCAAG CATGA
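
The GC content reported in the gene information (57%) can be recomputed from the gene sequence above. The sketch below is one way to do that in Python; `gc_content` is a hypothetical helper written for this illustration, not part of any database tooling, and it assumes the sequence contains only the letters A, C, G and T plus whitespace.

```python
def gc_content(seq: str) -> float:
    """Return the percentage of G and C bases in seq, ignoring whitespace."""
    # The record prints the sequence in space-separated blocks of ten,
    # so strip all whitespace before counting.
    bases = "".join(seq.split()).upper()
    if not bases:
        raise ValueError("empty sequence")
    gc = sum(1 for base in bases if base in "GC")
    return 100.0 * gc / len(bases)


# Usage on the first line of the gene sequence only; paste the full
# sequence to compare the result with the reported value.
first_line = "ATGAGCTTTA ACACAATCAT TGACTGGAAT AGCTGTACTG CGGAGCAACA ACGCCAGCTG"
print(round(gc_content(first_line), 1))
```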
 
Protein sequence
MSFNTIIDWN SCTAEQQRQL LMRPAISASE SITRTVNDIL DNVKARGDDA LREYSAKFDK 
TTVTALKVSA EEIAAASERL SDELKQAMAV AVKNIETFHT AQKLPPVDVE TQPGVRCQQV
TRPVASVGLY IPGGSAPLFS TVLMLATPAR IAGCNKVVLC SPPPIADEIL YAAQLCGVQD
VFNVGGAQAI AALAFGTESV PKVDKIFGPG NAFVTEAKRQ VSQRLDGAAI DMPAGPSEVL
VIADSGATPD FVASDLLSQA EHGPDSQVIL LTPDADMAHQ VAEAVERQLA ELPRAETARQ
ALSASRLIVT NDLAQCVAIS NQYGPEHLII QTRNARELVD SITSAGSVFL GDWSPESAGD
YASGTNHVLP TYGYTATCSS LGLADFQKRM TVQELSKVGF SALASTIETL AAAERLTAHK
NAVTLRVNAL KEQA
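
The protein sequence above corresponds to a codon-by-codon translation of the gene sequence. The following Python sketch illustrates that relationship. It assumes the standard codon assignments, which translation table 11 shares with the standard code for elongation; table 11 differs in the alternative start codons it allows, and this sketch does not handle those (this gene starts with ATG, so it is unaffected). `translate` and `CODON_TABLE` are names made up for this example.

```python
from itertools import product

# Codons in TCAG order, paired with their one-letter amino acids
# ('*' marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq: str) -> str:
    """Translate a coding sequence, stopping at the first stop codon."""
    bases = "".join(seq.split()).upper()
    protein = []
    # Step through complete codons only; a trailing partial codon is ignored.
    for i in range(0, len(bases) - 2, 3):
        aa = CODON_TABLE.get(bases[i:i + 3], "X")  # 'X' for unknown codons
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 30 bases of the gene sequence.
print(translate("ATGAGCTTTA ACACAATCAT TGACTGGAAT"))  # prints MSFNTIIDWN
```

Applied to the last 15 bases of the gene sequence (AAGGAGCAAG CATGA), the same function returns KEQA, matching the end of the protein sequence; the final TGA is the stop codon.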