Gene Information |
Locus tag | Cthe_1452 |
Symbol | |
ID | 4810602 |
Type | CDS |
Is gene spliced | No |
Is pseudo gene | No |
Organism name | Clostridium thermocellum ATCC 27405 |
Kingdom | Bacteria |
Replicon accession | NC_009012 |
Strand | + |
Start bp | 1772311 |
End bp | 1773966 |
Gene Length | 1656 bp |
Protein Length | 551 aa |
Translation table | 11 |
GC content | 43% |
IMG OID | 640106874 |
Product | KWG repeat-containing protein |
Protein accession | YP_001037875 |
Protein GI | 125973965 |
COG category | |
COG ID | |
TIGRFAM ID | |
|
|
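The coordinate and length fields above can be cross-checked against each other. The sketch below assumes the start and end positions are 1-based and inclusive, and that the CDS is unspliced and ends in a stop codon that is not counted in the protein length; both assumptions are consistent with the values in this record but are not stated in it. The function names are hypothetical.

```python
def gene_length_bp(start_bp: int, end_bp: int) -> int:
    """Length of a feature given 1-based, inclusive coordinates (assumed convention)."""
    return end_bp - start_bp + 1


def protein_length_aa(gene_length: int) -> int:
    """Residues encoded by an unspliced CDS, excluding the terminal stop codon."""
    if gene_length % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_length // 3 - 1


# Values taken from the record for Cthe_1452.
start_bp, end_bp = 1772311, 1773966
length = gene_length_bp(start_bp, end_bp)
print(length)                     # 1656
print(protein_length_aa(length))  # 551
```

Both results agree with the Gene Length (1656 bp) and Protein Length (551 aa) rows.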
Plasmid Coverage information |
Num covering plasmid clones | 6 |
Plasmid unclonability p-value | 0.00334104 |
Plasmid hitchhiking | Yes |
Plasmid clonability | hitchhiker |
| |
Fosmid Coverage information |
Num covering fosmid clones | n/a |
Fosmid unclonability p-value | n/a |
Fosmid Hitchhiker | n/a |
Fosmid clonability | n/a |
| |
Sequence |
Gene sequence | ATGAAAAAAC ATTTGCTAAT CATCGTATCA GCACTCGTTT CAGTACTGTC ATTGAACCTC ATTGCAGTAA CGGCAGCTGA GCCCACAGTC AAAAACATTT ACTATGACGA TGTGTATTCC TTTTCAGAAG GAATGTCCAG AGTGGTTAAA AACGGCAAGT ATGGATTCAT GGATCAAACA GGTAACGTTG CCATCAACCT GGAATACGAT TATGTAAGAG ATTTTTCCGA CGGGTTTGCC GCAGTTTCAA AAGGCGGTAC ATGGTATGAC GAAGAAGGTT ATGTTGATGG AAAATGGGGA TTTATTAACA GTACCGGCAA AGTCGTTGTC CCAATTATTT ATGACAAAGT GTGTGATTTC AGTGAAAGTC TTGCCGCCGT TGTAAAAGAC GGTAAGCTGG GATTTGTTGA CACAACCGGT AAAGTTGTTA TCCCTCTTAC ATACGACTGT TCAATGTATG AAGAATTTTA TTTCAACCAT GGTCTTGCAG TTGTAGCAGC CGGTGACTTT GAAGACCCTG ATATCTTTGT TATTGATGAA AACGGCAACA CTGCCTTTGA TTTCAAATAC GATTATGCGA GATTATCCGG CTACTCTGAA GGGATGTTGG CTGTTGCTGT CGGTGGAGAT TGGGGTATCG GCGGTCCGTA TTACTGGGCC CGCTCCTACA ACTTTGGCAA GTATGGTTTT ATAGACACCA ACGGTAACGA AATTGTTGCA CCTGTTTACG ATTATGCCAG TGATTTCCGG GAAGGCATTG CAGTGGTATG CAAAGACGGT AAATGGGGAG CCATTGATGA AAACGGCGAT GTTGTTGTTC CAATAATCTA CACCGATATC AGTTTTCCAA GCAATGGGCT TCTTTGTGTG TGTAATGATG AGGGAAAATG GGGATATGTT GACTACACCG GGAAAGTAGT TGCAGATTTC AAGTTCGATT TATGTAATAC CTTCCGTGAA GGATATGTCA CCAATTCCAT TGACGGAAAA TGGGGTGCTC TGGATACTAC AGGTAAAACT GTAATTCCAT TCAAATATTA TAATTTGTGG AATTTTAGTG AAGGTCTTGT CAGAGTCAGA ATGGTTGAAG ATGGCAAATG GGGTTATATG GACGCCGGTG GAAATATTGC AATTGACCCT GTTTATCAGG CAGCCACTGA CTTTTCTGAC GGAGTTGCCA TTGTTGTGAA AGATGGTAAA TTTGGAATCA TTTCTAAAAC ATCCATTCCT TACACGGCAA CCCCAACAAC TTCAACCGTG CTTGTGAACG GTTCCCCTGT GGCTTTTGAC GCTTATCTCA TCAACGGCAA CAACTACTTT AAGCTTCGGG ATCTGGCTTA CATATTAAGC GGTACGGACA AGCAGTTTGA AGTAACCTGG GACGATACTC TTAAAGCAAT AAATCTGATA TCCGGCAAAC CCTATACCGT ATCCGGCGGT GAAATGGCAA CCGGCAGCAA GAACACTGCC ACCGCATACC CTTCAGCTGC AACTGTTTTC ATTAATGGCG TGAGAGTTGA GTTAACAGCT TATACAATTA ACGGATTCAA TTACTTCAAA CTCCGTGACC TCGGTAAAGA ATTCGACTTT GGCGTAATCT GGGACGGCAC TGCCAAAACA ATACGTATAG ACACCTCCAG CAGCTATAGT GAATAA
|
Protein sequence | MKKHLLIIVS ALVSVLSLNL IAVTAAEPTV KNIYYDDVYS FSEGMSRVVK NGKYGFMDQT GNVAINLEYD YVRDFSDGFA AVSKGGTWYD EEGYVDGKWG FINSTGKVVV PIIYDKVCDF SESLAAVVKD GKLGFVDTTG KVVIPLTYDC SMYEEFYFNH GLAVVAAGDF EDPDIFVIDE NGNTAFDFKY DYARLSGYSE GMLAVAVGGD WGIGGPYYWA RSYNFGKYGF IDTNGNEIVA PVYDYASDFR EGIAVVCKDG KWGAIDENGD VVVPIIYTDI SFPSNGLLCV CNDEGKWGYV DYTGKVVADF KFDLCNTFRE GYVTNSIDGK WGALDTTGKT VIPFKYYNLW NFSEGLVRVR MVEDGKWGYM DAGGNIAIDP VYQAATDFSD GVAIVVKDGK FGIISKTSIP YTATPTTSTV LVNGSPVAFD AYLINGNNYF KLRDLAYILS GTDKQFEVTW DDTLKAINLI SGKPYTVSGG EMATGSKNTA TAYPSAATVF INGVRVELTA YTINGFNYFK LRDLGKEFDF GVIWDGTAKT IRIDTSSSYS E
|
| |
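The sequences above are displayed in space-separated blocks of ten characters. A minimal sketch of how such a field might be normalized before further analysis is shown below; the helper names are hypothetical, and only the first two blocks of the gene sequence are used as sample input.

```python
def clean_sequence(grouped: str) -> str:
    """Remove the display whitespace between blocks and upper-case the result."""
    return "".join(grouped.split()).upper()


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a nucleotide sequence."""
    if not seq:
        raise ValueError("empty sequence")
    return sum(1 for base in seq if base in "GC") / len(seq)


# First two blocks of the gene sequence, copied from the record.
head = "ATGAAAAAAC ATTTGCTAAT"
seq = clean_sequence(head)
print(seq)               # ATGAAAAAACATTTGCTAAT
print(len(seq))          # 20
print(gc_fraction(seq))  # 0.2
```

Applied to the full Gene sequence field, the same helpers give the ungrouped sequence, whose length and GC fraction can be compared with the Gene Length and GC content rows.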