Gene Cthe_1471 details


Gene Information

Locus tag: Cthe_1471
Symbol:
ID: 4810621
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Clostridium thermocellum ATCC 27405
Kingdom: Bacteria
Replicon accession: NC_009012
Strand:
Start bp: 1787724
End bp: 1789409
Gene Length: 1686 bp
Protein Length: 561 aa
Translation table: 11
GC content: 40%
IMG OID: 640106892
Product: glycoside hydrolase family protein
Protein accession: YP_001037893
Protein GI: 125973983
COG category: [G] Carbohydrate transport and metabolism
COG ID: [COG2730] Endoglucanase
TIGRFAM ID:
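
The coordinates and lengths listed above can be cross-checked with a little arithmetic. The following is a minimal sketch, assuming the start and end positions are 1-based and inclusive and that the protein length excludes a single terminal stop codon. The variable names are illustrative and not part of the record.

```python
# Values copied from the Gene Information fields.
start_bp = 1787724
end_bp = 1789409
gene_length_bp = 1686
protein_length_aa = 561

# Assumption: coordinates are 1-based and inclusive, so the span is end - start + 1.
span = end_bp - start_bp + 1

# Assumption: the CDS ends with one stop codon that is not counted as a residue.
codons = span // 3
residues = codons - 1

print(span, span % 3, residues)  # 1686 0 561
```

Under these assumptions the span matches the listed gene length, divides evenly into codons, and yields the listed protein length.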


Plasmid Coverage information

Num covering plasmid clones: 29
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: n/a
Fosmid unclonability p-value: n/a
Fosmid Hitchhiker: n/a
Fosmid clonability: n/a
 

Sequence

Gene sequence
ATGAAATGCA GATATATGGA TTATTTAATA AACTGTGAGT TGGACTCTTT GCCTAATGAA 
AAATCCGAAA AAATAAACAA ACATATAAGC ACCTGCATGG ACTGCAGAAA TTATCTGGGT
GCACTTATGA TATCCAAAAA ATATATTGCC AAAGAACCGG AAATGGATAA ATATTTTTAC
ATGCGGGTTA TCAATGCCAT AGACCCTGAC AGATACAAGA AATCGAAACT GACTTTTAAG
ATTCTGTCTT CATTGGAAAG GCTAAAGCCT GCTTTTAAAG CGTCTTTGGG AACTTTGGCA
ATCTTTGTGG CAGTAGCTTT GTTAATAACC GGCGGAATTT TTGACAATCT CGGCAACTGG
ATTGCCAAAA GCAGCAATAA CCGCTCCAAT ACCGGTGAAA CAACAAACCT GACCTTTTTG
CGTACAGACG GTCAGAATAT TGTTACCGGT ACCGGTGAAA TCTTCCACAT AAGAGGTGTT
ACCCTTACAA ACAACTTTTG GGGCAACTGG GTCAACGGAG AATCGGAAAA ATTGCAAAGT
CAAGGTATGG ACCCTATTAT ACGCCCTCTT GTACAGGATG CCTGGGTGCT CACTGATGAT
GATTTTGAGC GTATTAAAGA CCTGGGCTGC AACACGGTTT TATATGATAT CAATTACCAG
CTTTTTGCAG AAGACAATCC AAACAGGGAA GAAAATCTCA AAAAGCTCAA AGAACATATA
AGGCGTTTTT CCTCAATGGA CATATATACG GCTGTTATGC TAATGGCTCC TCCGGGACTG
GATTCGATCA ATGACGCCTA CGAAAAGTAC AAACACGGCT CAGAACGTAT AAAATCTGTG
TTTGAGGATG ATACCTACTA CGAACAGTGG GTTGAGATGT GGAAGTATTT GGCCGAAGAA
CTTAAAGACT TTAAAGGTGT GGCAGGATAT GGACTTATAA ACCAGCCAAG AGCCCCGAGT
GAAAGTGAAG GTGGAATCGG GATATTCAGG GAACGCCTGA ACAATGTATG CAGAGAAATA
CGTAAAATTG ACAAAAATCA TATCATATTT GTTCCCGAAT ATAACAGCAG AGAGGCCAAT
CCCGGCGAAT CCTACTGGAA CGAAAAAACA AATAGTTATG TAATAGACAA CGGTGAGCAA
GGTATTATCT GGGAAAGAGG TTTGGTAAAA GTTGATTCAT CAAACGTAGT ATACTTGTTC
CACTTTTTCG AACCATACAA CTTTGTCAAT GACGGTGTCG GAGATTTTGA TGCCGAAAGC
CTTGAAGCTC AAGTCAGAAA ACGTTATGAA TGGGCTAAAA ATGTCGGCAG GGCTCCGCTT
CTTACCGAAT ACGGAATCTC CCGGGTAAAC AGCGTAGACA AACGTGTACA ATGGCTTGAA
ACCGTTCACG ACATCTTTGA TAAATACGGT ATCTCGGCTT CATACTTCCA ATATAAAAAT
GCCGTAGGTG CTTTTATAAA TGTGAAAACC GGTTTTAACG CTTTATACGG AGAATATGTC
AGCTGGGATA GTGAAATCGG CCTGAATCCC TTTTACTTTG TAAATGAACA CGTTGCCACA
TCCGCAAAAG AAAATCATTT TGATGAAGCA CTTAAAGAGT ATTACCTTAA AGGTAAAAAC
CTGAAAAAAA TTTCAATACT GGACAATCAG CCCATTCTTG AAACATTGCA AAATTTTTGG
AAATAG
 
Protein sequence
MKCRYMDYLI NCELDSLPNE KSEKINKHIS TCMDCRNYLG ALMISKKYIA KEPEMDKYFY 
MRVINAIDPD RYKKSKLTFK ILSSLERLKP AFKASLGTLA IFVAVALLIT GGIFDNLGNW
IAKSSNNRSN TGETTNLTFL RTDGQNIVTG TGEIFHIRGV TLTNNFWGNW VNGESEKLQS
QGMDPIIRPL VQDAWVLTDD DFERIKDLGC NTVLYDINYQ LFAEDNPNRE ENLKKLKEHI
RRFSSMDIYT AVMLMAPPGL DSINDAYEKY KHGSERIKSV FEDDTYYEQW VEMWKYLAEE
LKDFKGVAGY GLINQPRAPS ESEGGIGIFR ERLNNVCREI RKIDKNHIIF VPEYNSREAN
PGESYWNEKT NSYVIDNGEQ GIIWERGLVK VDSSNVVYLF HFFEPYNFVN DGVGDFDAES
LEAQVRKRYE WAKNVGRAPL LTEYGISRVN SVDKRVQWLE TVHDIFDKYG ISASYFQYKN
AVGAFINVKT GFNALYGEYV SWDSEIGLNP FYFVNEHVAT SAKENHFDEA LKEYYLKGKN
LKKISILDNQ PILETLQNFW K
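
The relationship between the gene sequence and the protein sequence can be illustrated with a short translation sketch. Translation table 11 uses the same codon-to-amino-acid assignments as the standard genetic code (it differs in the set of permitted start codons), so a standard lookup is assumed here. The `translate` helper and `CODON_TABLE` name are hypothetical, written for illustration only, and the input is just the first 21 bases of the gene sequence with the display spaces removed.

```python
from itertools import product

# Standard codon assignments in NCBI order (bases T, C, A, G); '*' marks a stop.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(dna: str) -> str:
    """Translate a coding sequence, stopping at the first stop codon."""
    dna = dna.replace(" ", "").replace("\n", "").upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

# First seven codons of the gene sequence above.
print(translate("ATGAAATGCA GATATATGGA T"))  # MKCRYMD
```

The result agrees with the first seven residues of the protein sequence. Pasting the full gene sequence into `translate` should reproduce the full protein, although that has not been checked here.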