Gene Cthe_0640 details

Gene Information

Locus tag: Cthe_0640
Symbol:
ID: 4808169
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Clostridium thermocellum ATCC 27405
Kingdom: Bacteria
Replicon accession: NC_009012
Strand:
Start bp: 791133
End bp: 792881
Gene Length: 1749 bp
Protein Length: 582 aa
Translation table: 11
GC content: 43%
IMG OID: 640106054
Product: cellulosome enzyme, dockerin type I
Protein accession: YP_001037068
Protein GI: 125973158
COG category: [M] Cell wall/membrane/envelope biogenesis
COG ID: [COG5434] Endopolygalacturonase
TIGRFAM ID:
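
The coordinate and length fields above can be cross-checked with a little arithmetic. The following is a minimal sketch, assuming the start and end coordinates are 1-based and inclusive, and assuming the gene length counts a terminal stop codon that is not part of the protein; the variable names are illustrative and do not come from the database.

```python
# Values copied from the Gene Information fields of this record.
start_bp = 791133
end_bp = 792881
gene_length_bp = 1749
protein_length_aa = 582

# Assumption: coordinates are 1-based and inclusive.
span = end_bp - start_bp + 1

# Assumption: the last codon of the CDS is a stop codon,
# so it is excluded from the protein length.
codons = gene_length_bp // 3
coding_codons = codons - 1

print(span == gene_length_bp)              # True
print(gene_length_bp % 3 == 0)             # True
print(coding_codons == protein_length_aa)  # True
```

Under these assumptions the three fields agree: 1749 bp is 583 codons, one of which is the stop codon.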


Plasmid Coverage information

Num covering plasmid clones: 24
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: n/a
Fosmid unclonability p-value: n/a
Fosmid Hitchhiker: n/a
Fosmid clonability: n/a
 

Sequence

Gene sequence
ATGCCAAAAA AAGTATTATG CGTTGTAATG ATTTTTGTGC TGTTTATGAG TTTTACTATA 
AGTACAGCGG CTTACAACGC GGAAATAAAC GGTGAAGTTA TTGTGTGGAA CCCGGGAATA
AAAGGAGGAA TCCCTACAAA GCCTGTTGTG GCCAATGTTA AGGATTTTGG AGCAAAGGGT
GATGGTCTGA CCGATGATTC AAATGCTTTT AAAAAAGCCG TTGAATCTGT GAAAGACGGA
GGAGCCGTTC TGATTCCCTC GGGAGAATAT CTTATAAAAT CCAAAATTAC ATTGGACAAG
CCTGTTGTTT TAAGAGGGGA AGGACCCGGA AAAACCATCC TTTTGATAGA TCACTCATCC
GACGCCTTTG AGGTCATAAC TTACAAAAGG GGAAACTGGG TAAGTCTTGT CGGAGGATAT
ACCCGCGGCT CGACGGAACT GGTGGTATCC GATCCGACAG GCTTTGAAGC CGGCAAATAT
GTGGAAATAC AGCAGGATAA CGACCCTGAC GTAATGTATA CTTTGCCGGA ATGGAACCAA
GGGTGGGCGG CAGGCAGCGT AGGACAGATA ACAAAGGTGG TTTCGATCTG GGGCAATAAA
ATAACCATTG AAGAGCCTCT CAGAATCACA TACAGATCGG AATTAAATCC TGTAATCAGA
ACTCAAGGAT TTGCGGAATA TATCGGGTTT GAAGATTTTA CGGTGAAGCG TATTGATACC
AGCGACACAA ATATGTTTTT CTTTAAGAAT GCGGCCAACT GCTGGATAAA AAATATCCAC
AGCATTAAGC CTGCAAAAGC CCATGTCAGT GTCACCACAG GATACAGGAT TGAAGTTAGG
GACAGCTTTT TCGACGATGC GACAAATTGG GGCGGAGGCG GGCACGGATA TGGTGTGGAA
CTTGGATTTC ATGTATCGGA CTGCCTGATT GAGAACAACA TTTTCAAGCA TTTGAGACAT
TCGATGATGG TGCACCTTGG CGCCAATGGA AATGTTTTTG GCTACAATTA TTCAACACAG
CCCTATCAGA GTGAAGGAGG CAACTGGACA CCGGCCGATA TTTCAGTGCA TGGACACTAT
GCCTATTCCA ATTTGTTTGA AGGAAATATA GTGCAGGAAA TAACTGTTTC CGATTATTGG
GGACCTTCCG GTCCGTACAA CACATTCCTT AGAAACAGAA TAGAGTCAGA AAGTGTATGC
CTTGAGGATT CATCAAACTA TCAGAACTTT ATCGGCAATG AAATTGTAAA CGGCAATATC
CTTTGGGATA CCGACAATAG ATATCCGCAT AAAATAGATC CTTCCACGCT GTTTTTACAT
GGCAATCTCA TAAATGGTTC AATTCAGTGG AATCAACAAA CTCAGGACCG TACAATACCA
AATTCTTATT ATCTTGACTC AAAACCGGCT TTCTTTGGAG GTATTAACTG GCCGTCAACG
GGAAGTGACC GAACAGATGG CACCATCCCT GCAAAAGAAA GATATTATGG AAATACAATT
CCGCCTGCTT CACCGACACC CGACAAGCCC GTTAAATATG GTGACTTAAA CGGCGATAAC
AATGTCAACT CAACGGATTT GACACTGCTG AAGAGATACC TGACAAGGGT CATTAATGAT
TTCCCTCATC CGGACGGCAG TGTAAATGCT GACGTAAACG GAGACGGAAA AATAAACTCC
ACTGATTATT CAGCAATGAT AAGGTATATT TTGAGAATAA TTGATAAGTT TCCTGCCGAA
AAGAGTTAA
 
Protein sequence
MPKKVLCVVM IFVLFMSFTI STAAYNAEIN GEVIVWNPGI KGGIPTKPVV ANVKDFGAKG 
DGLTDDSNAF KKAVESVKDG GAVLIPSGEY LIKSKITLDK PVVLRGEGPG KTILLIDHSS
DAFEVITYKR GNWVSLVGGY TRGSTELVVS DPTGFEAGKY VEIQQDNDPD VMYTLPEWNQ
GWAAGSVGQI TKVVSIWGNK ITIEEPLRIT YRSELNPVIR TQGFAEYIGF EDFTVKRIDT
SDTNMFFFKN AANCWIKNIH SIKPAKAHVS VTTGYRIEVR DSFFDDATNW GGGGHGYGVE
LGFHVSDCLI ENNIFKHLRH SMMVHLGANG NVFGYNYSTQ PYQSEGGNWT PADISVHGHY
AYSNLFEGNI VQEITVSDYW GPSGPYNTFL RNRIESESVC LEDSSNYQNF IGNEIVNGNI
LWDTDNRYPH KIDPSTLFLH GNLINGSIQW NQQTQDRTIP NSYYLDSKPA FFGGINWPST
GSDRTDGTIP AKERYYGNTI PPASPTPDKP VKYGDLNGDN NVNSTDLTLL KRYLTRVIND
FPHPDGSVNA DVNGDGKINS TDYSAMIRYI LRIIDKFPAE KS
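
The gene and protein sequences above are printed in space-separated groups of ten. As a rough sketch of how the two blocks relate, the Python below strips that grouping, computes GC content, and translates a DNA string codon by codon. It assumes the codon-to-amino-acid assignments of translation table 11, which match the standard genetic code (the tables differ only in permitted start codons); the helper names `clean`, `gc_percent`, and `translate` are hypothetical and not part of any database tool.

```python
from itertools import product

# Codon table in NCBI order (bases T, C, A, G; third position varies fastest).
# Assumption: table 11 uses the same amino acid assignments as the standard code.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def clean(seq_block: str) -> str:
    """Remove spaces and line breaks from a grouped sequence block."""
    return "".join(seq_block.split()).upper()


def gc_percent(dna: str) -> float:
    """Return the percentage of G and C bases in a DNA string."""
    return 100.0 * sum(base in "GC" for base in dna) / len(dna)


def translate(dna: str) -> str:
    """Translate DNA in frame 1, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First line of the gene sequence in this record.
first_line = "ATGCCAAAAA AAGTATTATG CGTTGTAATG ATTTTTGTGC TGTTTATGAG TTTTACTATA"
print(translate(clean(first_line)))  # MPKKVLCVVMIFVLFMSFTI
```

The translated fragment matches the first twenty residues of the protein sequence listed above. Passing the full gene sequence block through `clean` and `gc_percent` is one way to compare against the reported GC content field.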