Gene EcSMS35_1038 details


Gene Information

Locus tag: EcSMS35_1038
Symbol: hisB
ID: 6144807
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Escherichia coli SMS-3-5
Kingdom: Bacteria
Replicon accession: NC_010498
Strand:
Start bp: 1058565
End bp: 1059632
Gene Length: 1068 bp
Protein Length: 355 aa
Translation table: 11
GC content: 53%
IMG OID: 641615925
Product: imidazole glycerol-phosphate dehydratase/histidinol phosphatase
Protein accession: YP_001743117
Protein GI: 170683419
COG category: [E] Amino acid transport and metabolism
COG ID: [COG0131] Imidazoleglycerol-phosphate dehydratase
        [COG0241] Histidinol phosphatase and related phosphatases
TIGRFAM ID: [TIGR01261] histidinol-phosphatase
            [TIGR01656] histidinol-phosphate phosphatase family domain
            [TIGR01662] HAD-superfamily hydrolase, subfamily IIIA
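
The coordinate and length fields above can be cross-checked with simple arithmetic. The sketch below assumes the start and end positions are 1-based and inclusive (an assumption, though it is consistent with the listed gene length) and that the coding sequence ends in a single stop codon. The function names are illustrative and do not come from the source database.

```python
# Values copied from the Gene Information fields.
START_BP = 1058565
END_BP = 1059632
GENE_LENGTH_BP = 1068
PROTEIN_LENGTH_AA = 355


def span_length(start: int, end: int) -> int:
    """Length of a feature given 1-based, inclusive coordinates (assumed convention)."""
    return end - start + 1


def codon_count(length_bp: int) -> int:
    """Number of complete codons; rejects lengths that are not a multiple of three."""
    if length_bp % 3 != 0:
        raise ValueError(f"{length_bp} bp is not a whole number of codons")
    return length_bp // 3


span = span_length(START_BP, END_BP)
codons = codon_count(span)

print(span)        # 1068, matches the listed gene length
print(codons - 1)  # 355, codons minus one stop codon matches the protein length
```

Under these assumptions the three figures agree: 1068 bp gives 356 codons, of which 355 encode amino acids and one is the stop.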


Plasmid Coverage information

Num covering plasmid clones: 29
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 42
Fosmid unclonability p-value: 0.272944
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGAGTCAGA AGTATCTTTT TATCGATCGC GATGGAACCC TGATTAGCGA ACCGCCGAGT 
GATTTTCAGG TGGACCGTTT TGACAAACTC GCCTTTGAAC CGGGCGTGAT CCCGGAACTG
CTGAAGCTGC AAAAAGCGGG CTACAAACTG GTGATGATCA CTAATCAGGA TGGTCTGGGA
ACACAAAGTT TCCCGCAGGC GGATTTTGAT GGACCGCACA ACCTGATGAT GCAGATCTTC
ACCTCGCAAG GCGTGCAGTT TGATGAAGTG CTGATTTGTC CGCACCTGCC CGCAGATGAG
TGCGACTGCC GTAAACCGAA AGTAAAACTG GTGGAGCGTT ATCTCGCTGA GCAAGCGATG
GATCGTGCCA ACAGTTATGT GATTGGCGAT CGCGCGACCG ATATTCAACT GGCGGAAAAC
ATGGGTATTA ATGGTTTACG CTACGACCGC GAAACCCTGA ACTGGCCGAT GATTGGCGAG
CAACTCACTA AACGAGACCG TTACGCCCAC GTAGTGCGCA ACACCAAAGA GACGCAAATT
GACGTCCAGG TGTGGCTGGA TCGTGAAGGT GGCAGCAAGA TTAACACCGG CGTTGGCTTC
TTTGATCACA TGCTGGATCA GATCGCCACC CACGGCGGTT TCCGCATGGA AATCAACGTC
AAAGGCGACC TCTATATCGA CGATCACCAC ACCGTCGAAG ATACCGGCCT GGCGCTGGGC
GAAGCGTTAA AAATTGCCCT CGGCGATAAA CGCGGTATTT GTCGCTTTGG TTTTGTGCTG
CCGATGGACG AATGCCTTGC CCGCTGCGCG CTGGATATCT CTGGTCGCCC GCACCTGGAA
TATAAAGCCG AATTTACCTA CCAGCGCGTG GGCGATCTCA GCACTGAAAT GATCGAGCAC
TTCTTCCGTT CGCTCTCTTA CACTATGGGC GTGACGCTAC ACCTGAAAAC CAAAGGTAAA
AACGATCATC ACCGTGTAGA GAGCCTGTTC AAAGCCTTTG GTCGCACCCT GCGCCAGGCC
ATCCGCGTGG AAGGCGACAC CCTGCCCTCG TCAAAAGGAG TGCTGTAA
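
A GC-content figure such as the one listed in the gene information can be recomputed from a sequence in this layout. The sketch below is a minimal illustration, not the database's own method: `gc_content` is a hypothetical helper that ignores the spaces and line breaks used to group bases into blocks of ten. It is demonstrated on the first ten-base block only, so its result is not the gene-wide value.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C among the A/C/G/T characters in seq.

    Whitespace and any other characters are skipped, so the blocked
    layout of the sequence listing can be passed in unchanged.
    """
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("no nucleotide characters found")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


first_block = "ATGAGTCAGA"  # first ten bases of the gene sequence
print(gc_content(first_block))  # 40.0
```

Passing the full gene sequence to the same function would give the figure to compare against the listed GC content.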
 
Protein sequence
MSQKYLFIDR DGTLISEPPS DFQVDRFDKL AFEPGVIPEL LKLQKAGYKL VMITNQDGLG 
TQSFPQADFD GPHNLMMQIF TSQGVQFDEV LICPHLPADE CDCRKPKVKL VERYLAEQAM
DRANSYVIGD RATDIQLAEN MGINGLRYDR ETLNWPMIGE QLTKRDRYAH VVRNTKETQI
DVQVWLDREG GSKINTGVGF FDHMLDQIAT HGGFRMEINV KGDLYIDDHH TVEDTGLALG
EALKIALGDK RGICRFGFVL PMDECLARCA LDISGRPHLE YKAEFTYQRV GDLSTEMIEH
FFRSLSYTMG VTLHLKTKGK NDHHRVESLF KAFGRTLRQA IRVEGDTLPS SKGVL
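
The relationship between the gene sequence and the protein sequence can be sketched as a codon-by-codon translation. The code below is an illustrative sketch, not the tool that produced this record. It builds the codon table from the compact NCBI-style string for the standard genetic code, on the understanding that translation table 11 uses the same amino-acid assignments and differs only in its permitted start codons. Only the first and last lines of the gene sequence are translated. Both lines begin on a codon boundary, because each full line holds 60 bases.

```python
BASES = "TCAG"
# Amino acids for the 64 codons in TCAG order (first, second, third base).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(seq: str) -> str:
    """Translate an in-frame DNA string, stopping at the first stop codon."""
    dna = "".join(seq.split()).upper()  # drop block spacing and line breaks
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[pos:pos + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


FIRST_LINE = "ATGAGTCAGA AGTATCTTTT TATCGATCGC GATGGAACCC TGATTAGCGA ACCGCCGAGT"
LAST_LINE = "ATCCGCGTGG AAGGCGACAC CCTGCCCTCG TCAAAAGGAG TGCTGTAA"

print(translate(FIRST_LINE))  # MSQKYLFIDRDGTLISEPPS
print(translate(LAST_LINE))   # IRVEGDTLPSSKGVL
```

The two outputs match the first twenty and last fifteen residues of the protein sequence listed above, and the final codon TAA is the stop.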