Gene Cthe_2872 details

Gene Information

Locus tag: Cthe_2872
Symbol:
ID: 4809152
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Clostridium thermocellum ATCC 27405
Kingdom: Bacteria
Replicon accession: NC_009012
Strand:
Start bp: 3392182
End bp: 3393882
Gene Length: 1701 bp
Protein Length: 566 aa
Translation table: 11
GC content: 44%
IMG OID: 640108291
Product: glycoside hydrolase family protein
Protein accession: YP_001039263
Protein GI: 125975353
COG category: [G] Carbohydrate transport and metabolism
COG ID: [COG2730] Endoglucanase
TIGRFAM ID:
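
The length fields in this record can be cross-checked against the coordinates. The sketch below assumes the Start bp and End bp values are 1-based and inclusive, and that the protein length excludes the stop codon. Neither is stated in the record, but both are consistent with the numbers shown. The function names are illustrative only.

```python
# Coordinates copied from the record; the 1-based inclusive convention is an assumption.
START_BP = 3392182
END_BP = 3393882


def gene_length(start_bp: int, end_bp: int) -> int:
    """Length in bp of a feature given 1-based inclusive coordinates."""
    return end_bp - start_bp + 1


def protein_length(gene_len_bp: int) -> int:
    """Residue count for an unspliced CDS, excluding the terminal stop codon."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_len_bp // 3 - 1


length_bp = gene_length(START_BP, END_BP)
print(length_bp, protein_length(length_bp))  # 1701 566
```

Both results match the Gene Length (1701 bp) and Protein Length (566 aa) fields.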


Plasmid Coverage information

Num covering plasmid clones: 19
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: n/a
Fosmid unclonability p-value: n/a
Fosmid Hitchhiker: n/a
Fosmid clonability: n/a
 

Sequence

Gene sequence
ATGAAAAAAG CAAAAGCAAT TTTTTCGCTG GTTGTTGCTT TGATGGTATT GGCCATTTTC 
TGCTTTGCAC AGAATACCGG TTCAACAGCT ACGACAGCAG CGGCCGCCGT CGACAGCAAC
AACGATGACT GGTTGCACTG TAAAGGCAAC AAAATTTACG ACATGTATGG AAACGAAGTC
TGGCTCACAG GCGCCAACTG GTTTGGTTTC AACTGCAGTG AAAACTGTTT CCACGGTGCC
TGGTATGATG TTAAAACCAT TCTGACCAGC ATTGCAGACA GAGGTATCAA CCTTCTGAGA
ATACCAATTT CAACAGAGCT TCTGTACAGC TGGATGATTG GAAAACCCAA CCCGGTTTCC
AGTGTTACCG CCAGCAACAA TCCTCCATAT CATGTGGTCA ACCCTGACTT TTATGATCCC
GAAACCGACG ATGTAAAAAA CAGTATGGAA ATCTTCGATA TCATAATGGG ATACTGCAAA
GAACTCGGAA TAAAAGTAAT GATTGATATA CACAGTCCTG ACGCCAACAA CTCCGGACAC
AACTATGAGT TGTGGTACGG AAAGGAAACA AGCACCTGTG GTGTGGTTAC CACAAAGATG
TGGATAGACA CTTTGGTTTG GCTTGCCGAC AAGTACAAAA ATGATGACAC CATAATAGCT
TTTGACTTGA AAAACGAGCC TCACGGAAAG CGTGGATATA CGGCTGAGGT GCCTAAATTG
CTTGCAAAAT GGGACAATTC CACAGACGAA AACAACTGGA AATACGCTGC CGAAACCTGC
GCAAAAGCTA TTTTGGAAGT AAATCCTAAA GTGCTTATAG TAATTGAAGG TGTTGAACAA
TATCCTAAAA CTGAGAAAGG TTATACCTAT GACACACCGG ATATCTGGGG AGCAACAGGA
GATGCTTCTC CATGGTACAG TGCATGGTGG GGAGGAAACT TAAGAGGTGT TAAAGACTAT
CCTATCGATT TAGGTCCTTT GAACAGCCAG ATTGTATATT CTCCGCATGA CTACGGTCCT
TCAGTATATG CTCAGCCGTG GTTTGAAAAA GACTTTACAA TGCAAACCTT GTTGGATGAC
TATTGGTATG ATACCTGGGC ATATATTCAC GATCAGGGAA TCGCTCCAAT CCTAATCGGT
GAATGGGGCG GACATATGGA CGGCGGCAAA AACCAAAAAT GGATGACATT GCTCAGAGAT
TACATAGTTC AAAACCGCAT TCATCATACA TTCTGGTGCA TAAATCCCAA CTCCGGTGAC
ACCGGCGGAT TGCTCGGAAA CGACTGGTCA ACTTGGGATG AGGCAAAATA CGCATTGTTA
AAACCTGCTT TGTGGCAGAC CAAGGACGGA AAGTTCATTG GACTTGACCA CAAAATTCCT
CTCGGTTCAA AGGGAATTTC CCTGGGCGAG TACTACGGAA CACCACAAGC TTCAGATCCT
CCGGCAACAC CGACTGCGAC GCCTACAAAG CCTGCAGCTT CATCGACTCC TTCATTTATT
TACGGCGACA TTAACAGCGA TGGTAATGTC AATTCAACAG ATCTTGGTAT ATTAAAGAGA
ATAATAGTAA AAAATCCTCC GGCAAGTGCC AACATGGATG CTGCAGACGT AAATGCGGAC
GGTAAAGTGA ATTCAACTGA CTATACCGTT CTAAAGAGAT ATTTGCTGAG GTCAATTGAC
AAACTGCCGC ACACCACCTG A
 
Protein sequence
MKKAKAIFSL VVALMVLAIF CFAQNTGSTA TTAAAAVDSN NDDWLHCKGN KIYDMYGNEV 
WLTGANWFGF NCSENCFHGA WYDVKTILTS IADRGINLLR IPISTELLYS WMIGKPNPVS
SVTASNNPPY HVVNPDFYDP ETDDVKNSME IFDIIMGYCK ELGIKVMIDI HSPDANNSGH
NYELWYGKET STCGVVTTKM WIDTLVWLAD KYKNDDTIIA FDLKNEPHGK RGYTAEVPKL
LAKWDNSTDE NNWKYAAETC AKAILEVNPK VLIVIEGVEQ YPKTEKGYTY DTPDIWGATG
DASPWYSAWW GGNLRGVKDY PIDLGPLNSQ IVYSPHDYGP SVYAQPWFEK DFTMQTLLDD
YWYDTWAYIH DQGIAPILIG EWGGHMDGGK NQKWMTLLRD YIVQNRIHHT FWCINPNSGD
TGGLLGNDWS TWDEAKYALL KPALWQTKDG KFIGLDHKIP LGSKGISLGE YYGTPQASDP
PATPTATPTK PAASSTPSFI YGDINSDGNV NSTDLGILKR IIVKNPPASA NMDAADVNAD
GKVNSTDYTV LKRYLLRSID KLPHTT
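
The protein sequence can be reproduced from the gene sequence, and the GC content field can be recomputed, with a short script. This is a minimal sketch, not the method used to generate the record. It relies on translation table 11 sharing its codon-to-amino-acid assignments with the standard code (the two differ only in permitted start codons, and this gene begins with ATG). The helper names `clean`, `translate` and `gc_percent` are hypothetical.

```python
from itertools import product

# Codon table in NCBI order (bases T, C, A, G; first base varies slowest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def clean(seq: str) -> str:
    """Remove the spaces and line breaks used to group the sequence into blocks of 10."""
    return "".join(seq.split()).upper()


def translate(dna: str) -> str:
    """Translate codon by codon from position 0, stopping at the first stop codon."""
    dna = clean(dna)
    residues = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        residues.append(aa)
    return "".join(residues)


def gc_percent(dna: str) -> float:
    """Percentage of G and C bases in the sequence."""
    dna = clean(dna)
    return 100.0 * sum(base in "GC" for base in dna) / len(dna)


# First 30 bases of the gene sequence, as printed in the record.
print(translate("ATGAAAAAAG CAAAAGCAAT TTTTTCGCTG"))  # MKKAKAIFSL
```

Passing the full gene sequence to `translate` should return the protein sequence with its spaces removed. Passing it to `gc_percent` gives a value to compare with the GC content field.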