Gene Cthe_3012 details


Gene Information

Locus tag: Cthe_3012
Symbol:
ID: 4811160
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Clostridium thermocellum ATCC 27405
Kingdom: Bacteria
Replicon accession: NC_009012
Strand:
Start bp: 3535136
End bp: 3537028
Gene Length: 1893 bp
Protein Length: 630 aa
Translation table: 11
GC content: 43%
IMG OID: 640108433
Product: carbohydrate-binding family 6 protein
Protein accession: YP_001039401
Protein GI: 125975491
COG category: [M] Cell wall/membrane/envelope biogenesis
COG ID: [COG5520] O-Glycosyl hydrolase
TIGRFAM ID:
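
The coordinate and length fields above are mutually consistent, which can be checked with a short sketch. It assumes 1-based, inclusive coordinates and a gene length that includes the stop codon; both assumptions match the values in this record but are not stated by the source. The function names are hypothetical.

```python
def gene_length(start_bp: int, end_bp: int) -> int:
    """Length of a feature given 1-based, inclusive coordinates (assumed)."""
    return end_bp - start_bp + 1


def protein_length(gene_len_bp: int) -> int:
    """Residue count for an unspliced CDS whose length includes the stop codon (assumed)."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_len_bp // 3 - 1  # subtract the stop codon


length = gene_length(3535136, 3537028)
print(length)                  # 1893
print(protein_length(length))  # 630
```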


Plasmid Coverage information

Num covering plasmid clones: 33
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: n/a
Fosmid unclonability p-value: n/a
Fosmid Hitchhiker: n/a
Fosmid clonability: n/a
 

Sequence

Gene sequence
ATGCGTAAGT TAAGAAAATT GCTGCTATTT TCTACTGTTT TGTTTGTCGT TTTTACTCAA 
CTATTTGGCT TTATAATCAC GGTAGATGCG GCAGAAACGG CAACAATCAA CTTGTCGGCG
GAAAAACAGG TAATCCGCGG ATTTGGAGGA ATGAACCATC CCGTTTGGAT TTCCGACCTG
ACGCCGCAGC AAAGGGATAC GGCTTTTGGT AATGGAGAGG GACAGCTGGG CTTTACTATT
TTGAGAATTC ATGTCGATGA GAACAGAAAC AATTGGTCAA AAGAAGTGGC AACCGCCAGA
AGAGCTATTG AGCTTGGAGC AATAGTTTTT GCTTCTCCGT GGAATCCTCC AAGTAATATG
GTGGAGACCT TCACTCGTAA CGGTGTGCCA AATCAAAAGA GACTCAGATA TGATAAATAC
GGAGATTATG TACAGCATCT TAACGACTTT GTTGCGTACA TGAAAAGTAA CGGAGTGGAT
TTGTATGCCA TTTCAGTTCA GAACGAACCA GACTATGCCC ATGAGTGGAC ATGGTGGACT
CCTCAGGAAA TGCTTCGCTT TATGAGGGAC TATGCCGGCC AAATCAACTG CAGGGTTATG
GCGCCGGAGT CATTCCAATA TCTGAAAAAT ATGTCCGACC CGATTTTGAA TGACCCTCAG
GCTCTTGCGA ATTTGGATAT ACTTGGTGCC CACTTTTACG GCACTACTGT AAACAATATG
CCCTATCCTT TGTTTGAGCA AAAAGGAGCG GGAAAAGAGC TGTGGATGAC AGAGGTTTAT
GTTCCAAACA GCGACAGCAA TTCGGCAGAC CGCTGGCCTG AGGCACTAGA GGTTGCGCAT
AATATGCACA ATGCTTTGGT AGAGGGAAAT TTCCAGGCAT ATGTTTGGTG GTATATCCGC
AGGTCATACG GACCTATGAA AGAAGACGGT ACTATAAGCA AGCGCGGATA TATGATGGCA
CATTACTCAA AGTTTGTCCG CCCGGGATAT GTAAGGGTTG ATGCAACGAA AAATCCTACA
TACAATGTAT ATTTATCTGC TTACAAAAAC AAAAAAGATA ACAGCGTTGT GGCAGTGGTT
ATAAATAAAA GTACCGAGGC GAAGACAATT AATATATCCG TTCCGGGAAC AAGTATCAGA
AAGTGGGAAA GATATGTTAC TACAGGGTCA AAAAATCTTA GGAAAGAATC AGACATAAAT
GCAAGTGGAA CCACTTTCCA GGTTACTTTG GAGCCTCAAA GCGTTACAAC TTTTGTAGGC
GGTGGATCCA GTGAACCGCA AATACCGGTT GAAAGAAATG CTTTCTCAAA GATAGAATGC
GAAGAATATA ACGCTACCAA TTCTTCCACT GTACAAGTAG TGGGTACCGG CACAGGAAGC
GGTCTCGGAT ATATCGAAAA CGGCAACTAT TTTGCTTACA AAAATATTAA TTTCGGTAAC
GGTGCAAATT CATTCAAAAT CAGGGCTGCA ACTACCGGTA CTCCAAAAAT AGAAATCCGA
CTGGGCAGTC CGACAGGCAC TCTTGCAGGT ACATTGCAAG TGGCTGCAAC CGGAGGCTTT
AATGCCTATG AGGAGCAGAG CTGCAGTATT AATAAAATTA CGGGTGTCCA GGACGTCTAT
TTGGTATTCG GAGGAGCTGT AAATGTTGAC TGGTTTACCT TTGAGTCAAA ACAGGAGCCG
ACTTTCAAGT ACGGCGACCT CAACGGTGAC GGCAATGTTA ACTCCACTGA TTCCACGCTT
ATGTCAAGAT ATCTTTTAGG TATAATCACC ACTTTGCCGG CCGGTGAAAA GGCTGCGGAT
TTGAATGGGG ACGGAAAGGT AAATTCTACA GACTACAATA TTTTAAAGAG ATATTTGCTT
AAATATATTG ATAAATTTCC TGTAGAATCA TAA
 
Protein sequence
MRKLRKLLLF STVLFVVFTQ LFGFIITVDA AETATINLSA EKQVIRGFGG MNHPVWISDL 
TPQQRDTAFG NGEGQLGFTI LRIHVDENRN NWSKEVATAR RAIELGAIVF ASPWNPPSNM
VETFTRNGVP NQKRLRYDKY GDYVQHLNDF VAYMKSNGVD LYAISVQNEP DYAHEWTWWT
PQEMLRFMRD YAGQINCRVM APESFQYLKN MSDPILNDPQ ALANLDILGA HFYGTTVNNM
PYPLFEQKGA GKELWMTEVY VPNSDSNSAD RWPEALEVAH NMHNALVEGN FQAYVWWYIR
RSYGPMKEDG TISKRGYMMA HYSKFVRPGY VRVDATKNPT YNVYLSAYKN KKDNSVVAVV
INKSTEAKTI NISVPGTSIR KWERYVTTGS KNLRKESDIN ASGTTFQVTL EPQSVTTFVG
GGSSEPQIPV ERNAFSKIEC EEYNATNSST VQVVGTGTGS GLGYIENGNY FAYKNINFGN
GANSFKIRAA TTGTPKIEIR LGSPTGTLAG TLQVAATGGF NAYEEQSCSI NKITGVQDVY
LVFGGAVNVD WFTFESKQEP TFKYGDLNGD GNVNSTDSTL MSRYLLGIIT TLPAGEKAAD
LNGDGKVNST DYNILKRYLL KYIDKFPVES
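
The sequences above are printed in space-separated blocks of ten, so they need cleaning before use. The following is a minimal sketch of removing the block spacing, computing GC content, and translating the start of the gene. The helper names are hypothetical. The codon table is built from the standard amino-acid assignments, which translation table 11 shares with the standard code for internal codons; alternative start codons are not handled here.

```python
from itertools import product

# Codons in TCAG order paired with the standard amino-acid assignments.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def clean_sequence(raw: str) -> str:
    """Remove block spacing and line breaks, and upper-case the result."""
    return "".join(raw.split()).upper()


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in an already cleaned sequence."""
    return sum(base in "GC" for base in seq) / len(seq)


def translate(seq: str) -> str:
    """Translate complete codons in frame 0, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 21 bases of the gene sequence listed above.
prefix = clean_sequence("ATGCGTAAGT TAAGAAAATT G")
print(translate(prefix))  # MRKLRKL
```

The translated prefix matches the first seven residues of the protein sequence listed above. Passing the full gene sequence through `clean_sequence` and `gc_content` would allow a comparison with the GC content field.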