Gene EcSMS35_1039 details


Gene Information

Locus tag: EcSMS35_1039
Symbol: hisC
ID: 6142706
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Escherichia coli SMS-3-5
Kingdom: Bacteria
Replicon accession: NC_010498
Strand:
Start bp: 1059632
End bp: 1060702
Gene Length: 1071 bp
Protein Length: 356 aa
Translation table: 11
GC content: 56%
IMG OID: 641615926
Product: histidinol-phosphate aminotransferase
Protein accession: YP_001743118
Protein GI: 170683323
COG category: [E] Amino acid transport and metabolism
COG ID: [COG0079] Histidinol-phosphate/aromatic aminotransferase and cobyric acid decarboxylase
TIGRFAM ID: [TIGR01141] histidinol-phosphate aminotransferase
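
The coordinate and length fields above can be cross-checked with simple arithmetic. The sketch below is only an illustration, not part of the record: it assumes the start and end positions are 1-based and inclusive, and that the CDS length counts the terminal stop codon (the gene sequence in this record ends in TGA). The function names are hypothetical.

```python
def gene_length(start_bp: int, end_bp: int) -> int:
    """Feature length, assuming 1-based inclusive coordinates."""
    return end_bp - start_bp + 1


def protein_length(gene_len_bp: int) -> int:
    """Residue count for an unspliced CDS, assuming the last codon is a stop."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_len_bp // 3 - 1


# Values copied from the Start bp and End bp fields of this record.
length = gene_length(1059632, 1060702)
print(length, protein_length(length))  # prints "1071 356"
```

Under these assumptions the computed values agree with the Gene Length (1071 bp) and Protein Length (356 aa) fields.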


Plasmid Coverage information

Num covering plasmid clones: 27
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 41
Fosmid unclonability p-value: 0.191971
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGAGCACCG TGACTATTAC CGATTTAGCG CGTGAGAACG TCCGCAACCT GACGCCGTAT 
CAGTCGGCGC GTCGTCTGGG CGGTAACGGC GACGTCTGGC TGAACGCCAA CGAATACCCC
ACAGCCGTGG AGTTTCAGCT TACTCAGCAA ACGCTCAACC GCTACCCGGA ATGCCAGCCG
AAAGCGGTGA TTGAAAATTA CGCGCAGTAT GCAGGCGTGA AGGCGGAGCA GGTGCTGGTC
AGCCGTGGCG CGGACGAAGG TATTGAACTG CTGATTCGCG CTTTTTGCGA ACCGGGTAAA
GACGCCATCC TCTACTGCCC GCCAACGTAC GGCATGTACA GCGTCAGCGC CGAAACCATT
GGCGTCGAGT GCCGCACAGT GCCGACGCTG AAAAACTGGC AACTGGACTT GCAGGGCATT
TCCGACAAGC TGGACGGCGT AAAAGTGGTT TATGTTTGCA GCCCCAACAA CCCGACCGGG
CAACTGATCA ATCCACAGGA TTTTCGCACC CTGCTGGAGT TAACGCGCGG TAAAGCGATT
GTGGTTGCCG ATGAGGCCTA TATCGAGTTT TGCCCGCAGG CATCGCTGGC TGGCTGGCTG
GCGGAATATC CGCACCTGGC TATTTTGCGC ACACTGTCGA AAGCCTTCGC TCTGGCGGGC
CTTCGTTGCG GATTTACGCT GGCAAACGAA GAAGTCATCA ACCTGCTGAT GAAAGTGATC
GCCCCCTACC CGCTCTCGAC GCCGGTTGCC GACATTGCAG CCCAGGCGTT AAGCCCGCAG
GGGATCGTCG CCATGCGCGA ACGAGTGGCG CAAATTATTG CTGAACGCGA ATACCTGATG
GCCGCACTGA AAGAGATCCC CTGCGTGGAG CAGGTTTTCG ACTCCGAAAC CAACTACATT
CTGGCGCGCT TTAAAGCCTC CAGCGCAGTG TTTAAATCTT TGTGGGATCA GGGCATTATC
TTACGTGATC AGAATAAACA ACCCTCTTTA AGCGGCTGCC TGCGAATTAC CGTCGGAACC
CGTGAAGAAA GCCAGCGCGT CATTGACGCC TTACGTGCGG AGCAAGTTTG A
 
Protein sequence
MSTVTITDLA RENVRNLTPY QSARRLGGNG DVWLNANEYP TAVEFQLTQQ TLNRYPECQP 
KAVIENYAQY AGVKAEQVLV SRGADEGIEL LIRAFCEPGK DAILYCPPTY GMYSVSAETI
GVECRTVPTL KNWQLDLQGI SDKLDGVKVV YVCSPNNPTG QLINPQDFRT LLELTRGKAI
VVADEAYIEF CPQASLAGWL AEYPHLAILR TLSKAFALAG LRCGFTLANE EVINLLMKVI
APYPLSTPVA DIAAQALSPQ GIVAMRERVA QIIAEREYLM AALKEIPCVE QVFDSETNYI
LARFKASSAV FKSLWDQGII LRDQNKQPSL SGCLRITVGT REESQRVIDA LRAEQV
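
The sequences above are printed in space-separated blocks of 10 characters, 60 per line, so they need to be joined before any downstream use. The following is a minimal sketch of that cleanup plus a GC-content calculation; the helper names `clean_sequence` and `gc_percent` are hypothetical and are not taken from the database that produced this record.

```python
def clean_sequence(text: str) -> str:
    """Remove spaces and line breaks from a block-formatted sequence."""
    return "".join(text.split()).upper()


def gc_percent(dna: str) -> float:
    """Percentage of G and C bases in a DNA sequence."""
    seq = clean_sequence(dna)
    if not seq:
        raise ValueError("empty sequence")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / len(seq)


# Last line of the gene sequence in this record: five blocks of 10 plus one base.
last_line = "CGTGAAGAAA GCCAGCGCGT CATTGACGCC TTACGTGCGG AGCAAGTTTG A"
print(len(clean_sequence(last_line)))  # prints 51
```

Passing the full gene sequence to `gc_percent` should give a value comparable to the GC content field, assuming that field was computed over the same 1071 bases.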