Gene Information |
Locus tag | EcHS_A4223 |
Symbol | thiH |
ID | 5591086 |
Type | CDS |
Is gene spliced | No |
Is pseudo gene | No |
Organism name | Escherichia coli HS |
Kingdom | Bacteria |
Replicon accession | NC_009800 |
Strand | - |
Start bp | 4220317 |
End bp | 4221450 |
Gene Length | 1134 bp |
Protein Length | 377 aa |
Translation table | 11 |
GC content | 55% |
IMG OID | 640923327 |
Product | thiamine biosynthesis protein ThiH |
Protein accession | YP_001460776 |
Protein GI | 157163458 |
COG category | [H] Coenzyme transport and metabolism; [R] General function prediction only |
COG ID | [COG1060] Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes |
TIGRFAM ID | [TIGR02351] thiazole biosynthesis protein ThiH |
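The length fields above follow from the coordinates. A minimal sketch of that arithmetic, assuming the start and end positions are 1-based and inclusive (the record does not say so, but this reading reproduces the listed values); the variable names are illustrative only:

```python
# Values copied from the record above.
start_bp = 4220317
end_bp = 4221450

# Assumption: coordinates are 1-based and inclusive, so both ends count.
gene_length = end_bp - start_bp + 1

# A CDS of N bases holds N / 3 codons; the final codon is the stop codon
# and contributes no amino acid.
protein_length = gene_length // 3 - 1

print(gene_length, "bp")
print(protein_length, "aa")
```

Under that assumption the result agrees with the Gene Length and Protein Length fields.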
|
|
Plasmid Coverage information |
Num covering plasmid clones | 66 |
Plasmid unclonability p-value | 1 |
Plasmid hitchhiking | No |
Plasmid clonability | normal |
| |
Fosmid Coverage information |
Num covering fosmid clones | n/a |
Fosmid unclonability p-value | n/a |
Fosmid Hitchhiker | n/a |
Fosmid clonability | n/a |
| |
Sequence |
Gene sequence | ATGAAAACCT TCAGCGATCG CTGGCGACAA CTGGACTGGG ATGACATCCG CCTGCGTATC AACGGCAAAA CGGCTGTTGA CGTAGAGCGG GCGCTAAATG CCTCGCAATT CACCCGCGAC GATATGATGG CGCTGTTATC GCCAGCCGCC AGTGGCTATC TGGAACAACT GGCCCAACGG GCGCAGCGTC TGACCCGTCA GCGATTTGGC AACACAGTTA GTTTCTACGT CCCGCTTTAT CTTTCCAATC TTTGCGCTAA CGACTGCACG TACTGCGGAT TTTCCATGAG TAATCGCATC AAGCGCAAAA CGCTGGATGA AGCGGATATT GCCAGGGAAA GCGCCGCTAT ACGGGAGATG GGCTTTGAAC ATCTGCTATT AGTCACTGGT GAACATCAGG CGAAAGTGGG GATGGATTAC TTTCGTCGTC ATCTCCCCGC CCTGCGTGAA CAGTTCTCTT CACTACAAAT GGAAGTGCAA CCGCTGGCGG AGACGGAATA CGCCGAGTTA AAACAGCTTG GTCTGGATGG CGTGATGGTT TATCAGGAGA CATATCACGA GGCGACTTAT GCCCGCCATC ATCTGAAAGG AAAAAAACAG GACTTCTTCT GGCGGCTGGA AACGCCGGAT CGGCTGGGGC GTGCGGGGAT TGATAAGATA GGCCTCGGCG CGCTAATTGG CCTTTCCGAC AACTGGCGCG TTGACTGCTA TATGGTTGCC GAACATTTGC TATGGCTGCA ACAGCATTAC TGGCAAAGCC GTTACTCTGT CTCCTTTCCG CGCCTGCGCC CGTGTACTGG CGGCATTGAG CCTGCGTCGA TTATGGATGA ACGCCAGTTA GTGCAAACCA TCTGCGCCTT CCGACTGCTT GCACCGGAGA TTGAACTGTC ACTCTCCACG CGGGAATCAC CGTGGTTTCG CGATCGCGTT ATTCCGCTGG CGATCAATAA CGTCAGCGCC TTCTCGAAAA CGCAGCCAGG TGGCTATGCC GATAATCACC CCGAGTTGGA ACAGTTCTCA CCGCACGACG ATCGCAGACC GGAAGCGGTT GCTGCCGCGT TAACCGCTCA GGGTTTGCAG CCGGTATGGA AAGACTGGGA CAGCTATCTG GGACGGGCCT CGCAAAGACT ATGA
|
Protein sequence | MKTFSDRWRQ LDWDDIRLRI NGKTAVDVER ALNASQFTRD DMMALLSPAA SGYLEQLAQR AQRLTRQRFG NTVSFYVPLY LSNLCANDCT YCGFSMSNRI KRKTLDEADI ARESAAIREM GFEHLLLVTG EHQAKVGMDY FRRHLPALRE QFSSLQMEVQ PLAETEYAEL KQLGLDGVMV YQETYHEATY ARHHLKGKKQ DFFWRLETPD RLGRAGIDKI GLGALIGLSD NWRVDCYMVA EHLLWLQQHY WQSRYSVSFP RLRPCTGGIE PASIMDERQL VQTICAFRLL APEIELSLST RESPWFRDRV IPLAINNVSA FSKTQPGGYA DNHPELEQFS PHDDRRPEAV AAALTAQGLQ PVWKDWDSYL GRASQRL
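The gene sequence as listed begins with ATG and ends with TGA, so it appears to be given in coding orientation even though the gene lies on the minus strand. The sketch below shows how the first and last codons map to the protein sequence. It assumes the standard codon assignments, which translation table 11 shares for all internal codons; the function name `translate` and the two fragments are illustrative, with the fragments copied from the start and end of the gene sequence above.

```python
from itertools import product

# Codon table built in the conventional TCAG order.
# Assumption: table 11 uses the standard assignments for internal codons.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding-orientation DNA string, stopping at the first stop codon."""
    dna = dna.replace(" ", "").upper()
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":  # stop codon: end of the protein
            break
        protein.append(aa)
    return "".join(protein)


# First three and last three blocks of the gene sequence listed above.
head = "ATGAAAACCT TCAGCGATCG CTGGCGACAA"
tail = "GGACGGGCCT CGCAAAGACT ATGA"

print(translate(head))
print(translate(tail))
```

The two fragments translate to the first ten and last seven residues of the protein sequence shown above.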