Gene Information |
Locus tag | EcSMS35_4437 |
Symbol | thiH |
ID | 6146882 |
Type | CDS |
Is gene spliced | No |
Is pseudo gene | No |
Organism name | Escherichia coli SMS-3-5 |
Kingdom | Bacteria |
Replicon accession | NC_010498 |
Strand | - |
Start bp | 4535266 |
End bp | 4536399 |
Gene Length | 1134 bp |
Protein Length | 377 aa |
Translation table | 11 |
GC content | 55% |
IMG OID | 641619257 |
Product | thiamine biosynthesis protein ThiH |
Protein accession | YP_001746373 |
Protein GI | 170681122 |
COG category | [H] Coenzyme transport and metabolism [R] General function prediction only |
COG ID | [COG1060] Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes |
TIGRFAM ID | [TIGR02351] thiazole biosynthesis protein ThiH |
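
The coordinate and length fields above can be cross-checked with simple arithmetic. The sketch below is only an illustration: it assumes the Start bp and End bp values are 1-based and inclusive, and that the CDS ends with a single stop codon. The function names gene_length and protein_length are hypothetical helpers, not part of any database API.

```python
def gene_length(start_bp: int, end_bp: int) -> int:
    """Feature length in bp, assuming 1-based inclusive coordinates."""
    return end_bp - start_bp + 1


def protein_length(gene_len_bp: int) -> int:
    """Residue count for an unspliced CDS, assuming one terminal stop codon."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_len_bp // 3 - 1


# Values taken from the record: Start bp, End bp.
length = gene_length(4535266, 4536399)
print(length, protein_length(length))  # 1134 377
```

Under those assumptions the result agrees with the Gene Length (1134 bp) and Protein Length (377 aa) fields.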
| |
Plasmid Coverage information |
Num covering plasmid clones | 30 |
Plasmid unclonability p-value | 1 |
Plasmid hitchhiking | No |
Plasmid clonability | normal |
| |
Fosmid Coverage information |
Num covering fosmid clones | 22 |
Fosmid unclonability p-value | 0.0000208104 |
Fosmid Hitchhiker | Yes |
Fosmid clonability | hitchhiker |
| |
Sequence |
Gene sequence | ATGAAAAACT TCAGCGATCG CTGGCGACAA CTGGACTGGG ACGACATCCG CCTGCGTATC AACGGCAAAA CGGCTGCTGA CGTAGAGCGG GCGCTAAATG CCTCGCAACT CACCCGCGAC GATATGATGG CGCTGTTATC GCCAGCCGCC ATTGGCTATC TGGAACCACT GGCCCAACGG GCGCAGCGTC TGACCCGTCA ACGATTTGGC AACACTGTTA GCTTCTACGT CCCGCTTTAT CTTTCCAATC TTTGCGCTAA CGACTGCACG TACTGCGGAT TCTCTATGAG CAACCGCATC AAGCGCAAAA CGCTGGATGA AGCGGATATT GCCAGGGAAA GCGCCGCTAT ACGGGAGATG GGCTTTGAAC ATCTGCTTTT AGTCACTGGT GAACATCAGG CGAAAGTGGG GATGGATTAC TTTCGTCGTC ATCTCCCTGC CCTGCGTGAA CAGTTCTCTT CACTACAGAT GGAAGTGCAA CCGCTGGCGG AGACGGAATA CGCCGAGTTA AAGCAACTAG GTCTGGATGG CGTGATGGTT TATCAGGAGA CATATCACGA GGCGACTTAT GCCCGCCATC ATCTGAAAGG TAAAAAACAG GACTTCTTCT GGCGGCTGGA AACGCCGGAT CGGCTAGGGC ATGCGGGGAT TGATAAGATA GGCCTCGGCG CGCTAATCGG CCTTTCCGAC AACTGGCGAG TTGACTGCTA TATGGTTGCC GAACATTTGC TATGGCTGCA ACAACATTAC TGGCAAAGCC GCTACTCTGT CTCCTTCCCA CGCCTGCGTC CATGTACTGG CGGCATTGAG CCTGCGTCGA TTATGGATGA ACGCCAGTTA GTGCAAACCA TCTGCGCCTT CCGGCTGCTT GCACCGGAGA TTGAACTGTC ACTCTCCACG CGGGAATCAC CGTGGTTTCG CGATCGCGTT ATTCCGCTGG CAATTAATAA CGTCAGCGCT TTTTCGAAAA CGCAGCCAGG TGGCTATGCC GACAATCACC CCGAGCTGGA ACAGTTCTCA CCACACGACG ATCGCAGGCC GGAAGCGGTT GCTGCCGCGT TAACCGCTCA GGGTTTGCAG CCGGTATGGA AAGACTGGGA CAGCTATCTG GGACGCGCCT CGCAAAGGCC ATGA
|
Protein sequence | MKNFSDRWRQ LDWDDIRLRI NGKTAADVER ALNASQLTRD DMMALLSPAA IGYLEPLAQR AQRLTRQRFG NTVSFYVPLY LSNLCANDCT YCGFSMSNRI KRKTLDEADI ARESAAIREM GFEHLLLVTG EHQAKVGMDY FRRHLPALRE QFSSLQMEVQ PLAETEYAEL KQLGLDGVMV YQETYHEATY ARHHLKGKKQ DFFWRLETPD RLGHAGIDKI GLGALIGLSD NWRVDCYMVA EHLLWLQQHY WQSRYSVSFP RLRPCTGGIE PASIMDERQL VQTICAFRLL APEIELSLST RESPWFRDRV IPLAINNVSA FSKTQPGGYA DNHPELEQFS PHDDRRPEAV AAALTAQGLQ PVWKDWDSYL GRASQRP
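
The relationship between the two sequences above can be reproduced with a short script. This is a minimal sketch, not the tool the database uses. It relies on translation table 11 sharing its codon-to-amino-acid assignments with the standard code, and it assumes the gene sequence is already given on the coding strand, as it appears to be here since it begins with ATG. The helper names clean, gc_percent and translate are hypothetical. Alternative start codons, which table 11 permits, are not handled specially.

```python
BASES = "TCAG"
# Standard codon assignments in TCAG order (shared by translation table 11).
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def clean(seq: str) -> str:
    """Remove the spaces between 10-character blocks and upper-case the result."""
    return "".join(seq.split()).upper()


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in the sequence."""
    s = clean(seq)
    return 100.0 * sum(ch in "GC" for ch in s) / len(s)


def translate(seq: str) -> str:
    """Translate codon by codon, stopping at the first stop codon."""
    s = clean(seq)
    protein = []
    for pos in range(0, len(s) - len(s) % 3, 3):
        aa = CODON_TABLE[s[pos:pos + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 21 bases of the gene sequence in the record.
print(translate("ATGAAAAACT TCAGCGATCG C"))  # MKNFSDR
```

Passing the full Gene sequence field to translate and gc_percent should allow a comparison against the Protein sequence and GC content fields.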