Gene Cthe_0044 details

Gene Information

Locus tag: Cthe_0044
Symbol:
ID: 4808809
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Clostridium thermocellum ATCC 27405
Kingdom: Bacteria
Replicon accession: NC_009012
Strand:
Start bp: 56725
End bp: 58326
Gene Length: 1602 bp
Protein Length: 533 aa
Translation table: 11
GC content: 44%
IMG OID: 640105453
Product: cellulosome enzyme, dockerin type I
Protein accession: YP_001036478
Protein GI: 125972568
COG category: [M] Cell wall/membrane/envelope biogenesis
COG ID: [COG5337] Spore coat assembly protein
TIGRFAM ID:
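
The coordinate and length fields above can be cross-checked with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes the start and end positions are 1-based and inclusive and that the final codon of the CDS is a stop codon, which is consistent with the reported 1602 bp and 533 aa.

```python
def gene_length(start_bp: int, end_bp: int) -> int:
    """Length of a feature given 1-based, inclusive coordinates (assumed)."""
    return end_bp - start_bp + 1


def protein_length(gene_len_bp: int) -> int:
    """Residue count for an unspliced CDS whose last codon is a stop codon."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_len_bp // 3 - 1


length = gene_length(56725, 58326)
print(length, protein_length(length))  # 1602 533
```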


Plasmid Coverage information

Num covering plasmid clones: 18
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: n/a
Fosmid unclonability p-value: n/a
Fosmid Hitchhiker: n/a
Fosmid clonability: n/a
 

Sequence

Gene sequence
ATGCGTAAAT TTTTTAAGTT GACGGCGCTG ACAATAAGCA TGATGCTTCT TGTTGTGTGT 
CTGACCGGAC CACAAGTTTC GGCTGCATCC AGACCTGAGG GCTGGACGGA AGAAACCCAC
GGCAAAAAAG CAACGCCCAA CTATTCAGTG GTATTTCCCG AAGACAAAGT AAACCGTATC
GATATCATCA TCAGTCCCGA AAATTTTCAG CGCATGGAAA ATGACGTCTT CAAAGTATTT
ATGATGAGCA ACGAAGATCC TATTTATGTG TCTGCAACGG TAAAATTCAA CAATCACACC
TGGTGGCACG TGGGAATCAG ATACAAGGGA CAGAGTACAT TGACCGGGGC CATGATGAGC
ATGAGCCACA AATATCCTTT CCGCTTAAAC TTTGACAAGT TCGAAGATGA CTATCCTGAA
ATAGACAACC AAAGATTTTA CGGCTTTGAC GAACTCATAT TCAACAACAA CTGGTATGAC
CCTTCATTCC TGAGGGATAA ACTCACCAGT GACATTTTCC GCGATGCCGG CATTCCCGCG
CCTCGCTGTG CTTTCTACAG AGTGTACGTT GATACGGGAA ACGGCCCTGT TTATTGGGGA
TTATACACAG TGTTCGAGGA TCCTTCCGAC AAGATGCTCG AATATCAATT CGAAAACCCT
AACGGAAATC TCTATAAAGG TCAGCAAGCA CCCGGAGGAG ACCTGACAAT ATTTGATAAA
CGCGGATATG AAAAGAAAAC CAATGAAAAA GCTGACGACT GGAGTGATCT TCAAGCTCTT
GTCGCCGCAT TGAACGCTCC CAAAACCGAC CCTGCCAAAT GGAGAGCGGA CCTTGAAGCG
GTTTTCAATA CCGATTCTTT CTTAAAATGG CTGGCAATCA ATACTACTAT CGTAAATTTT
GACACTTATG GCTGGGTCAC AAAAAACCAT TATCTTTACC AAGATTTGGC CGATAACGGT
CGTTTGGTAT TTATCCCATG GGATTATAAC CTCTCCCTTT CCTCCACCAA TCCATGGGGC
ATAAAACCGC CCAGCTTCTC TTTGGATGAA ATTGGCAGAA ACTGGCCGCT GATTCGCAAT
TTAATAGATG ATCCGGTCTA CAAGCACATC TACCACACCG AAATAGAAAA CACACTGAAC
ATATACTTCA GGGAATTTAA TGTAATCGAA AAAGCTCGCA GGCTTCACGA GCTAATCCGC
CCATACACCG TGGGAAGCGA GGGAGAAATA AAAGGCTACA CTTATCTCAC CAACGGTGAA
GCACAATTTA ACCAGGCTCT CACCCAACTT ATAGAGCATA TCAGCACAAG GCACAGAGAA
GCCAGGTCTT ATCTGTCCTC GGTTAATTAT TATACACCTA TTCCTGAACG AACTCCGACA
CCCTTCCCAA GTCCAACACC AAAAAAGCCA AAGGGCGATA TCAATCTCGA CGGCAAGATA
AACTCGACAG ATTTGTCCGC ACTTAAAAGG CATATTCTCA GAATAACGAC TCTCTCCGGC
AAACAACTTG AAAACGCCGA TGTAAATAAT GACGGTTCGG TAAACTCTAC TGATGCTTCA
ATATTAAAGA AATATATTGC AAAAGCCATT CCATCCTTAT AA
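
The gene sequence is displayed in blocks of 10 bases, so the spaces and line breaks have to be removed before any computation. As a minimal sketch (the helper names are hypothetical, not part of this record), the following joins the blocks and computes a GC percentage of the kind reported in the GC content field; how the database rounds that value is not stated here.

```python
def clean_sequence(text: str) -> str:
    """Remove the spaces and line breaks used for display and upper-case the bases."""
    return "".join(text.split()).upper()


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in an already cleaned sequence."""
    if not seq:
        raise ValueError("empty sequence")
    gc_count = seq.count("G") + seq.count("C")
    return 100.0 * gc_count / len(seq)


# Usage: paste the full blocked sequence in place of this first row.
first_row = "ATGCGTAAAT TTTTTAAGTT GACGGCGCTG ACAATAAGCA TGATGCTTCT TGTTGTGTGT"
print(len(clean_sequence(first_row)))  # 60
```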
 
Protein sequence
MRKFFKLTAL TISMMLLVVC LTGPQVSAAS RPEGWTEETH GKKATPNYSV VFPEDKVNRI 
DIIISPENFQ RMENDVFKVF MMSNEDPIYV SATVKFNNHT WWHVGIRYKG QSTLTGAMMS
MSHKYPFRLN FDKFEDDYPE IDNQRFYGFD ELIFNNNWYD PSFLRDKLTS DIFRDAGIPA
PRCAFYRVYV DTGNGPVYWG LYTVFEDPSD KMLEYQFENP NGNLYKGQQA PGGDLTIFDK
RGYEKKTNEK ADDWSDLQAL VAALNAPKTD PAKWRADLEA VFNTDSFLKW LAINTTIVNF
DTYGWVTKNH YLYQDLADNG RLVFIPWDYN LSLSSTNPWG IKPPSFSLDE IGRNWPLIRN
LIDDPVYKHI YHTEIENTLN IYFREFNVIE KARRLHELIR PYTVGSEGEI KGYTYLTNGE
AQFNQALTQL IEHISTRHRE ARSYLSSVNY YTPIPERTPT PFPSPTPKKP KGDINLDGKI
NSTDLSALKR HILRITTLSG KQLENADVNN DGSVNSTDAS ILKKYIAKAI PSL
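
The protein sequence can be reproduced from the gene sequence by codon-wise translation. The sketch below is an illustration, not the database's own method. It assumes that translation table 11 uses the same codon-to-amino-acid assignments as the standard code (the two differ in their permitted start codons), it does not special-case alternative start codons, and it raises `KeyError` on ambiguous bases such as N. The name `translate` is hypothetical.

```python
BASES = "TCAG"
# Amino acids for all 64 codons in TCAG order (first, second, third base).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA string codon by codon, stopping at the first stop codon."""
    seq = "".join(dna.split()).upper()  # drop display spaces and line breaks
    protein = []
    for pos in range(0, len(seq) - len(seq) % 3, 3):
        amino_acid = CODON_TABLE[seq[pos:pos + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# First 30 bases of the gene sequence above.
print(translate("ATGCGTAAAT TTTTTAAGTT GACGGCGCTG"))  # MRKFFKLTAL
```

Applied to the full 1602 bp gene sequence, the same function should yield the 533-residue protein listed above if the stated assumptions hold.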