Gene Information |
Locus tag | Cthe_2872 |
Symbol | |
ID | 4809152 |
Type | CDS |
Is gene spliced | No |
Is pseudo gene | No |
Organism name | Clostridium thermocellum ATCC 27405 |
Kingdom | Bacteria |
Replicon accession | NC_009012 |
Strand | - |
Start bp | 3392182 |
End bp | 3393882 |
Gene Length | 1701 bp |
Protein Length | 566 aa |
Translation table | 11 |
GC content | 44% |
IMG OID | 640108291 |
Product | glycoside hydrolase family protein |
Protein accession | YP_001039263 |
Protein GI | 125975353 |
COG category | [G] Carbohydrate transport and metabolism |
COG ID | [COG2730] Endoglucanase |
TIGRFAM ID | |
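The coordinate and length fields above can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the original record. It assumes the Start bp and End bp values are 1-based inclusive coordinates and that the gene length includes the terminal stop codon. The function names `span_length` and `expected_protein_length` are hypothetical helpers introduced only for illustration.

```python
# Values copied from the Gene Information table above.
start_bp = 3392182
end_bp = 3393882
gene_length_bp = 1701
protein_length_aa = 566


def span_length(start: int, end: int) -> int:
    """Length of a feature, assuming 1-based inclusive coordinates."""
    return end - start + 1


def expected_protein_length(cds_length_bp: int) -> int:
    """Number of codons in the CDS minus one for the stop codon.

    Assumes the CDS is unspliced and its length includes the stop codon.
    """
    if cds_length_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return cds_length_bp // 3 - 1


computed_length = span_length(start_bp, end_bp)
computed_protein = expected_protein_length(computed_length)

# Compare the computed values with the values listed in the record.
print(computed_length == gene_length_bp)
print(computed_protein == protein_length_aa)
```

Under these assumptions, both comparisons agree with the listed Gene Length and Protein Length fields.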
|
|
Plasmid Coverage information |
Num covering plasmid clones | 19 |
Plasmid unclonability p-value | 1 |
Plasmid hitchhiking | No |
Plasmid clonability | normal |
| |
Fosmid Coverage information |
Num covering fosmid clones | n/a |
Fosmid unclonability p-value | n/a |
Fosmid Hitchhiker | n/a |
Fosmid clonability | n/a |
| |
Sequence |
Gene sequence | ATGAAAAAAG CAAAAGCAAT TTTTTCGCTG GTTGTTGCTT TGATGGTATT GGCCATTTTC TGCTTTGCAC AGAATACCGG TTCAACAGCT ACGACAGCAG CGGCCGCCGT CGACAGCAAC AACGATGACT GGTTGCACTG TAAAGGCAAC AAAATTTACG ACATGTATGG AAACGAAGTC TGGCTCACAG GCGCCAACTG GTTTGGTTTC AACTGCAGTG AAAACTGTTT CCACGGTGCC TGGTATGATG TTAAAACCAT TCTGACCAGC ATTGCAGACA GAGGTATCAA CCTTCTGAGA ATACCAATTT CAACAGAGCT TCTGTACAGC TGGATGATTG GAAAACCCAA CCCGGTTTCC AGTGTTACCG CCAGCAACAA TCCTCCATAT CATGTGGTCA ACCCTGACTT TTATGATCCC GAAACCGACG ATGTAAAAAA CAGTATGGAA ATCTTCGATA TCATAATGGG ATACTGCAAA GAACTCGGAA TAAAAGTAAT GATTGATATA CACAGTCCTG ACGCCAACAA CTCCGGACAC AACTATGAGT TGTGGTACGG AAAGGAAACA AGCACCTGTG GTGTGGTTAC CACAAAGATG TGGATAGACA CTTTGGTTTG GCTTGCCGAC AAGTACAAAA ATGATGACAC CATAATAGCT TTTGACTTGA AAAACGAGCC TCACGGAAAG CGTGGATATA CGGCTGAGGT GCCTAAATTG CTTGCAAAAT GGGACAATTC CACAGACGAA AACAACTGGA AATACGCTGC CGAAACCTGC GCAAAAGCTA TTTTGGAAGT AAATCCTAAA GTGCTTATAG TAATTGAAGG TGTTGAACAA TATCCTAAAA CTGAGAAAGG TTATACCTAT GACACACCGG ATATCTGGGG AGCAACAGGA GATGCTTCTC CATGGTACAG TGCATGGTGG GGAGGAAACT TAAGAGGTGT TAAAGACTAT CCTATCGATT TAGGTCCTTT GAACAGCCAG ATTGTATATT CTCCGCATGA CTACGGTCCT TCAGTATATG CTCAGCCGTG GTTTGAAAAA GACTTTACAA TGCAAACCTT GTTGGATGAC TATTGGTATG ATACCTGGGC ATATATTCAC GATCAGGGAA TCGCTCCAAT CCTAATCGGT GAATGGGGCG GACATATGGA CGGCGGCAAA AACCAAAAAT GGATGACATT GCTCAGAGAT TACATAGTTC AAAACCGCAT TCATCATACA TTCTGGTGCA TAAATCCCAA CTCCGGTGAC ACCGGCGGAT TGCTCGGAAA CGACTGGTCA ACTTGGGATG AGGCAAAATA CGCATTGTTA AAACCTGCTT TGTGGCAGAC CAAGGACGGA AAGTTCATTG GACTTGACCA CAAAATTCCT CTCGGTTCAA AGGGAATTTC CCTGGGCGAG TACTACGGAA CACCACAAGC TTCAGATCCT CCGGCAACAC CGACTGCGAC GCCTACAAAG CCTGCAGCTT CATCGACTCC TTCATTTATT TACGGCGACA TTAACAGCGA TGGTAATGTC AATTCAACAG ATCTTGGTAT ATTAAAGAGA ATAATAGTAA AAAATCCTCC GGCAAGTGCC AACATGGATG CTGCAGACGT AAATGCGGAC GGTAAAGTGA ATTCAACTGA CTATACCGTT CTAAAGAGAT ATTTGCTGAG GTCAATTGAC AAACTGCCGC ACACCACCTG A
|
Protein sequence | MKKAKAIFSL VVALMVLAIF CFAQNTGSTA TTAAAAVDSN NDDWLHCKGN KIYDMYGNEV WLTGANWFGF NCSENCFHGA WYDVKTILTS IADRGINLLR IPISTELLYS WMIGKPNPVS SVTASNNPPY HVVNPDFYDP ETDDVKNSME IFDIIMGYCK ELGIKVMIDI HSPDANNSGH NYELWYGKET STCGVVTTKM WIDTLVWLAD KYKNDDTIIA FDLKNEPHGK RGYTAEVPKL LAKWDNSTDE NNWKYAAETC AKAILEVNPK VLIVIEGVEQ YPKTEKGYTY DTPDIWGATG DASPWYSAWW GGNLRGVKDY PIDLGPLNSQ IVYSPHDYGP SVYAQPWFEK DFTMQTLLDD YWYDTWAYIH DQGIAPILIG EWGGHMDGGK NQKWMTLLRD YIVQNRIHHT FWCINPNSGD TGGLLGNDWS TWDEAKYALL KPALWQTKDG KFIGLDHKIP LGSKGISLGE YYGTPQASDP PATPTATPTK PAASSTPSFI YGDINSDGNV NSTDLGILKR IIVKNPPASA NMDAADVNAD GKVNSTDYTV LKRYLLRSID KLPHTT