Gene Cthe_2147 details

Gene Information

Locus tag: Cthe_2147
Symbol:
ID: 4811195
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Clostridium thermocellum ATCC 27405
Kingdom: Bacteria
Replicon accession: NC_009012
Strand:
Start bp: 2551539
End bp: 2553521
Gene Length: 1983 bp
Protein Length: 660 aa
Translation table: 11
GC content: 43%
IMG OID: 640107551
Product: glycoside hydrolase family protein
Protein accession: YP_001038543
Protein GI: 125974633
COG category: [G] Carbohydrate transport and metabolism
COG ID: [COG2730] Endoglucanase
TIGRFAM ID:
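
The coordinates and lengths in this record can be cross-checked against each other. The following is a minimal sketch, assuming the start and end positions are 1-based and inclusive and that the final codon is a stop codon not counted in the protein length; the function names are illustrative and not part of any database API.

```python
# Values copied from the gene record above.
START_BP = 2551539
END_BP = 2553521
GENE_LENGTH_BP = 1983
PROTEIN_LENGTH_AA = 660


def span_length(start_bp: int, end_bp: int) -> int:
    """Length of a feature, assuming 1-based inclusive coordinates."""
    return end_bp - start_bp + 1


def expected_protein_length(gene_length_bp: int) -> int:
    """Codon count minus one, assuming a single terminal stop codon."""
    if gene_length_bp % 3 != 0:
        raise ValueError("gene length is not a multiple of 3")
    return gene_length_bp // 3 - 1


span = span_length(START_BP, END_BP)
protein = expected_protein_length(GENE_LENGTH_BP)
print("span matches gene length:", span == GENE_LENGTH_BP)
print("codons match protein length:", protein == PROTEIN_LENGTH_AA)
```

Under these assumptions both checks agree with the values listed in the record.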


Plasmid Coverage information

Num covering plasmid clones: 16
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: n/a
Fosmid unclonability p-value: n/a
Fosmid Hitchhiker: n/a
Fosmid clonability: n/a
 

Sequence

Gene sequence
ATGGGTCAAA AACATTTTAA AAGAAGTCTG TTGTCTGTAT TAACAATTTC AGCCCTGATT 
ATCTCCTGCC TTTTCAGTTT TATTTTCGTT AACGCAGATG ACACTTCTGA AGAACCCGCT
TTGGAAGGCC TGTCTATACA CTATATGGAC GGCACACTGG ACGTAAAATA TCAGAGCATG
CGTCCTTACA TAATTATTCA TAACAACAGC GGCATGGATG TGGACATGGC CGACCTTAGG
GTAAGGTATT ACTACGAAAA AGAGGGTGTT ACCGAAGAAG TCCTTACATG CTTTTATACA
GCAATAGGTG CGGACAAAAT ATTTGCCGAA TTTCATCCCG AGCTGGGATA CGCTGAAATC
GGCTTTACCA GTGATGCCGG AATTATAAAA AGCGGTGGCA ACAGCGGGCA GCTGCAGCTG
GTACTGAAAA AAATATCGAA CGGTTACTAC GACCAAAGCA ATGATTATTC TTATGACCCA
AGTTACACTG ATTATGCAGA ATATGATAAA ATAACACTCT ATTACAAAGG TAAACTGGTA
TGGGGAAAAG AAGGACCTCC TCCGCCACCG GAACCGACAC CTCCGCCAAA CAACGACGAC
TGGCTTCATG TGGAGGGAAA TCTAATCAAG GATGCCCAGG GAAATACCGT TTATCTTACA
GGAATCAACT GGTTTGGATT TGAAACTGAC GGAGCAAACG GTTTTTACGG CCTTAACAAA
TGCAACCTTG AGGATTCTCT TGATTTAATG GCAAAATTAG GTTTTAATAT TCTCAGAATC
CCCATCAGTG CTGAAATTAT TCTGCAATGG AAAAACGGCG AACGTGTAGA AACTTCCTTT
GTAAACACCT ATGAAAACCC GCGTCTTGAC GGCCTCAGCA GTCTTGAAAT ACTGGACTAT
ACAATAAATC ACATGAAGAA AAACGGTATG AAAGCCATGC TTGACATGCA CAGTTCAACC
AAGGACTCAT ACCAGGAAAA CCTCTGGTAT AACAAGGATA TAACCATGGA AGAATTTATC
GAAGCCTGGA AATGGATTGT CGAAAGATAC AAAGACGATG ATACGGTCAT TGCAGTGGAT
CTCAAAAATG AACCTCATGG AAAGTACTCC GGTCCGAATA TCGCCAAATG GGATGATTCG
GATGATCCAA ACAACTGGAA AAGGGCGGCG GAAATCATTG CCGAAGAAAT TCTTGCAATC
AATCCAAATC TTTTAATCGT CGTAGAGGGT GTTGAGGCAT ACCCGATGGA AGGGTATGAT
TACACCAACT GCGGTGAGTT TACCACATAC TGTAACTGGT GGGGCGGAAA TTTAAGAGGA
GTTGCCGACC ATCCTGTTGT CATATCCGCT CCGGACAAGC TCGTATATTC CGTACATGAT
TACGGACCGG ACATCTATAT GCAGCCGTGG TTTAAAAAAG ATTTCGACAT TAACACCCTT
TATGAGGAAT GCTGGTACCC AAACTGGTAC TACATTGTCG AGCAAAATAT TGCGCCTATG
TTAATCGGCG AATGGGGAGG CAAGCTTATC AATGAAAACA ACCGGAAGTG GCTTGAATGT
TTGGCTACCT TTATTTCAGA AAAGAAACTG CATCATACCT TCTGGGCTTT TAATCCCAAC
TCAGCCGACA CCGGCGGTCT AATGCTTGAG GATTGGAAAA CCGTTGATGA GGAAAAATAT
GCAATAATTG AGCCCACATT GTGGAAGAAA GGTCTGGATC ATGTAATACC GCTGGGAGGA
ATTACGGAGG ATACCTTTAA ATATGGTGAC GTTAACGGTG ATTTTGCCGT AAACTCCAAC
GACCTTACAT TGATAAAACG CTACGTCCTT AAAAATATTG ACGAATTCCC CTCTTCTCAT
GGATTGAAAG CTGCCGACGT GGACGGAGAT GAAAAAATAA CCTCCAGTGA TGCTGCTCTT
GTAAAAAGGT ACGTTCTAAG AGCCATAACA TCATTCCCGG TGGAAGAAAA CCAAAATGAA
TAA
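
The sequence above is laid out in space-separated blocks of ten bases, so it needs to be joined before any length or composition check. The sketch below shows one way to do that and to compute GC content; the helper names are hypothetical, and only the first line of the gene sequence is used as sample input to keep the example short.

```python
def clean_sequence(text: str) -> str:
    """Remove all whitespace (block spaces and line breaks) and upper-case."""
    return "".join(text.split()).upper()


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in an already cleaned sequence."""
    if not seq:
        raise ValueError("empty sequence")
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


# First line of the gene sequence, as formatted in the record.
first_line = "ATGGGTCAAA AACATTTTAA AAGAAGTCTG TTGTCTGTAT TAACAATTTC AGCCCTGATT "

seq = clean_sequence(first_line)
print(len(seq), "bases,", round(gc_percent(seq), 1), "% GC")
```

The value for this 60-base excerpt differs from the 43% reported for the whole gene, which is expected for such a short window. Passing the full sequence text to the same two functions would give the gene-wide figure.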
 
Protein sequence
MGQKHFKRSL LSVLTISALI ISCLFSFIFV NADDTSEEPA LEGLSIHYMD GTLDVKYQSM 
RPYIIIHNNS GMDVDMADLR VRYYYEKEGV TEEVLTCFYT AIGADKIFAE FHPELGYAEI
GFTSDAGIIK SGGNSGQLQL VLKKISNGYY DQSNDYSYDP SYTDYAEYDK ITLYYKGKLV
WGKEGPPPPP EPTPPPNNDD WLHVEGNLIK DAQGNTVYLT GINWFGFETD GANGFYGLNK
CNLEDSLDLM AKLGFNILRI PISAEIILQW KNGERVETSF VNTYENPRLD GLSSLEILDY
TINHMKKNGM KAMLDMHSST KDSYQENLWY NKDITMEEFI EAWKWIVERY KDDDTVIAVD
LKNEPHGKYS GPNIAKWDDS DDPNNWKRAA EIIAEEILAI NPNLLIVVEG VEAYPMEGYD
YTNCGEFTTY CNWWGGNLRG VADHPVVISA PDKLVYSVHD YGPDIYMQPW FKKDFDINTL
YEECWYPNWY YIVEQNIAPM LIGEWGGKLI NENNRKWLEC LATFISEKKL HHTFWAFNPN
SADTGGLMLE DWKTVDEEKY AIIEPTLWKK GLDHVIPLGG ITEDTFKYGD VNGDFAVNSN
DLTLIKRYVL KNIDEFPSSH GLKAADVDGD EKITSSDAAL VKRYVLRAIT SFPVEENQNE
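
The protein sequence can be reproduced from the gene sequence by codon-wise translation. The sketch below assumes the standard codon-to-amino-acid assignments, which translation table 11 shares with the standard code (the two differ in which codons may act as start codons, which does not matter here because the gene begins with ATG). The `translate` helper is illustrative, and only the first 60 bases are used as input.

```python
from itertools import product

# Codon table in the conventional T, C, A, G ordering; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate from the first base, stopping at the first stop codon."""
    seq = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First line of the gene sequence from the record (60 bases, 20 codons).
first_line = "ATGGGTCAAA AACATTTTAA AAGAAGTCTG TTGTCTGTAT TAACAATTTC AGCCCTGATT"
print(translate(first_line))
```

The result for this excerpt matches the first 20 residues of the protein sequence listed above (MGQKHFKRSL LSVLTISALI). The gene sequence ends in TAA, a stop codon, so translating the full sequence with this helper would end there.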