Gene Cthe_2038 details

Gene Information

Locus tag: Cthe_2038
Symbol:
ID: 4811008
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Clostridium thermocellum ATCC 27405
Kingdom: Bacteria
Replicon accession: NC_009012
Strand:
Start bp: 2421511
End bp: 2423934
Gene Length: 2424 bp
Protein Length: 807 aa
Translation table: 11
GC content: 36%
IMG OID: 640107445
Product: cellulosome enzyme, dockerin type I
Protein accession: YP_001038440
Protein GI: 125974530
COG category:
COG ID:
TIGRFAM ID:
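
The coordinate and length fields above can be cross-checked with a little arithmetic. The following is a minimal Python sketch. It assumes that the start and end coordinates are 1-based and inclusive, and that the final codon is a stop codon not counted in the protein length. Neither convention is stated in this record. The function names are illustrative only.

```python
def gene_length_bp(start_bp: int, end_bp: int) -> int:
    """Length of a feature, assuming 1-based inclusive coordinates."""
    return end_bp - start_bp + 1


def protein_length_aa(gene_len_bp: int) -> int:
    """Residues encoded by an unspliced CDS, assuming one terminal stop codon."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_len_bp // 3 - 1


length = gene_length_bp(2421511, 2423934)
print(length)                     # 2424
print(protein_length_aa(length))  # 807
```

Under those assumptions both values agree with the Gene Length and Protein Length fields listed in the record.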


Plasmid Coverage information

Num covering plasmid clones: 14
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: n/a
Fosmid unclonability p-value: n/a
Fosmid Hitchhiker: n/a
Fosmid clonability: n/a
 

Sequence

Gene sequence
ATGAAAAAGG CCATCTCTCT CATTCTTACT TTATTGCTAA TCTTCAATTT TCTTCCGTTG 
AATTTTATCA TTGAGGCTTT TGCGGAAGAT GGTGGACAGC TTATTATCTA TCCCGAATAT
GATGAAAGAA TTCCACGCTG TTATGATTAT AGTGTAACAG TACATCAAGG CAATCAATCT
AAAACGATTC CCGTATATAA TCGCAATGCA AACGGAGAAC AGATGGCTTA CAGATGCTTA
TCACCTGACT TTAACAGAAG ATTCTGCGAA TTTGCTTTCA CAGGAGAGGT TCGGGTTGAT
ATTACCGTTT ATCGTGATTT TGAAACATAC AGTGTTTTGC CAAGTGCAAA GAGATATCGA
AACGAATTTC ATGATGGTGT TATCTCTGTG TGGCTTAATG AGAATGATAC AAAATTTATG
ATTCGCCTTG ATGACGATGA TGATACCATT CTTTCGGTTT TTGCAGATGC TCCCGAAGAT
TACGATATAG ACCTGAATGA CGAATCTGTT CTGTATGTGG ATGAGCCATG GTTTGACCCT
GATGAAAACA GTGCATATTA TACGCTTGAT GAAAAAATCA GAACGATATA TATTGCACCG
GGATGCGTGT TTTACAGTCG TCTGATTATT AAATCGAATA ATGTTACCAT TTGCGGACAT
GGAATCTTGC TTGACCCATT TAGTGACTTG CATGACGCAT CCGTGACAGA GGATAGAACG
AACATCTATA TGGACGTGAA CGGTAACAAT TTTACAATTA AAGATGTAAA AATTATAGAT
TCTCAAGGAT ACCATATGTA TCTTCTAGGT ACTAACCATC TCATCAAAAA TGTTAAGTGT
CTGACAGCAA GAATCCGTAC TGACGGTGTT GCAGTCGGTG CTGGAAATGT AACCATTACA
AATTGCTTTT GGTATGTCAG CGACAATGGA TTTACATATA GCGGCGGTTA TGGGTATCAT
CGCATCAGCA ACTGTATCAT GGGAACAACA TGTGCCGCAT TTTTTCCACA GCATACGCTT
CCCTATGATG TAGAATTTAC AGATATCTAT GTATTCCGTG CTGATGAGGG AATTATAAAT
AACTGGTATA ACGGAGCAAA GATACAATCT GTTGTAAAGA ACGTTACATT CAACAACCTG
GACTGTGTTG ATGTTATCAA TACGCCATGG ATTTTCAGTG GCAAGAATAT GGGCGATGCA
GTCAAGAATT TTTATTTTAA CAATTGTCGT TTCAACGCAA TCAGGGGTTC ATCCATTGTG
ACAGAATGGA ATACTAAAGC AGGACAAGCT ATTCATATAA TCAATAATGA CACACTTTTA
CATACATCAA ATTATAAACT GAATTTTAAA AACTGCTATA TTGACGGTAA ACTGATTACA
TCAGAAAGCG ACTTCAAGCC TCAGTATAAG GATAGTAATG AACTCACCAT TTCAATCAAT
AATGACGGTA CAAAACCGGA ATATCCTCTT TACTGTGTGC GAAACAAAGT GAACTATACA
TATAACAAAA AGGTATATAT CAACCATAAT TTGCAGAAAT TACAGCATCA GCCCATTGGC
GACGGAAGTG AAATATTATT ACCTGAAACT GAAATCTGTA ATTTACTGGA TATCAAAATA
AATGCAAACA CAAAAGGAAC AACGCAAAAC GGTATCAAAT ATATTTCTCT TGATGAAATA
AATCGTTATT ATACAAAAGC TGTATATGAC AGCGAAAAAT CGGCTGTGAT TCTCTCTCCT
GTAGTGGATA GCAGTAAAAA TCTATTGAAA GACTATTCCT TTGCATGTCG ATATAATCCT
TATAGCAATC CGGGTGCAAC ACTCAAACCA TATGTTGATA ATGGTGAAGT TGTATTGCGT
TGTATTGTTA CCAGTAATAT TTACAATCAA GGATTATACA CTGTTGTAAC AGATGAATTG
GAAAAATATG GTGCGGGGGT ATATACTATC AGCTTTGAAG CAAGAAGTTA TAACGGAAAC
AAAACTACGG TAGAGGTACG ACCACACTAT GTAAGATACG AAGGGTATTC TCTCATCGAA
AAAAACCCAA AAAAATCGGT AACAGTGAAT GGACAATGGC AGCGATATGA AGTAACCTTT
GATATCAGCG ACTGGGATTT AAGTGTCGGT GCCTCAGCGA TTATCCGTAT TTGTAGCGAC
AATACCCCAG GATACGACGT TTTATTCAAG AATATTGCAT TGACAAAAGT TGTTCCTGAA
GTTAAAAAAG GAGATATAGT TTTAGATGGA AATATAAATT CGCTTGACAT GATGAAACTT
AAAAAGTATT TGATTCGAGA AACTCAGTTT AATTATGATG AACTTTTAAG AGCTGATGTT
AATTCCGACG GTGAGGTAAA TTCTACTGAT TACGCATATT TAAAAAGATA TATACTTAGA
ATTATTGATG CTTTTCCGCA ATAA
 
Protein sequence
MKKAISLILT LLLIFNFLPL NFIIEAFAED GGQLIIYPEY DERIPRCYDY SVTVHQGNQS 
KTIPVYNRNA NGEQMAYRCL SPDFNRRFCE FAFTGEVRVD ITVYRDFETY SVLPSAKRYR
NEFHDGVISV WLNENDTKFM IRLDDDDDTI LSVFADAPED YDIDLNDESV LYVDEPWFDP
DENSAYYTLD EKIRTIYIAP GCVFYSRLII KSNNVTICGH GILLDPFSDL HDASVTEDRT
NIYMDVNGNN FTIKDVKIID SQGYHMYLLG TNHLIKNVKC LTARIRTDGV AVGAGNVTIT
NCFWYVSDNG FTYSGGYGYH RISNCIMGTT CAAFFPQHTL PYDVEFTDIY VFRADEGIIN
NWYNGAKIQS VVKNVTFNNL DCVDVINTPW IFSGKNMGDA VKNFYFNNCR FNAIRGSSIV
TEWNTKAGQA IHIINNDTLL HTSNYKLNFK NCYIDGKLIT SESDFKPQYK DSNELTISIN
NDGTKPEYPL YCVRNKVNYT YNKKVYINHN LQKLQHQPIG DGSEILLPET EICNLLDIKI
NANTKGTTQN GIKYISLDEI NRYYTKAVYD SEKSAVILSP VVDSSKNLLK DYSFACRYNP
YSNPGATLKP YVDNGEVVLR CIVTSNIYNQ GLYTVVTDEL EKYGAGVYTI SFEARSYNGN
KTTVEVRPHY VRYEGYSLIE KNPKKSVTVN GQWQRYEVTF DISDWDLSVG ASAIIRICSD
NTPGYDVLFK NIALTKVVPE VKKGDIVLDG NINSLDMMKL KKYLIRETQF NYDELLRADV
NSDGEVNSTD YAYLKRYILR IIDAFPQ
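
The sequences above are printed in space-separated blocks of ten characters, so the whitespace has to be stripped before any downstream analysis. The following is a minimal Python sketch of that cleanup plus a simple GC-fraction calculation. The helper names `clean_sequence` and `gc_fraction` are hypothetical and not part of any database tooling. To keep the example short it uses only the first and last lines of the gene sequence, so the printed values describe that fragment rather than the full gene.

```python
def clean_sequence(blocked: str) -> str:
    """Remove all whitespace from a block-formatted sequence and uppercase it."""
    return "".join(blocked.split()).upper()


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in an already-cleaned nucleotide sequence."""
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


# First and last lines of the gene sequence; the middle lines are omitted
# for brevity, so this is NOT the complete gene.
fragment = """
ATGAAAAAGG CCATCTCTCT CATTCTTACT TTATTGCTAA TCTTCAATTT TCTTCCGTTG
ATTATTGATG CTTTTCCGCA ATAA
"""

seq = clean_sequence(fragment)
print(len(seq))                                     # number of bases in the fragment
print(seq.startswith("ATG"), seq.endswith("TAA"))   # start and stop codon check
print(f"GC fraction of fragment: {gc_fraction(seq):.2f}")
```

Pasting the full gene sequence into `fragment` lets the same functions be used to compare the result with the Gene Length and GC content fields of the record.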