Gene Cthe_0239 details


Gene Information

Locus tag: Cthe_0239
Symbol:
ID: 4808587
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Clostridium thermocellum ATCC 27405
Kingdom: Bacteria
Replicon accession: NC_009012
Strand:
Start bp: 287571
End bp: 290726
Gene Length: 3156 bp
Protein Length: 1051 aa
Translation table: 11
GC content: 44%
IMG OID: 640105651
Product: cellulosome enzyme, dockerin type I
Protein accession: YP_001036671
Protein GI: 125972761
COG category:
COG ID:
TIGRFAM ID: [TIGR02543] Listeria/Bacterioides repeat
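
The start, end, gene length, and protein length fields in this record agree with one another, and a short sketch can check that. It assumes that Start bp and End bp are 1-based inclusive positions and that the gene length includes the stop codon. The function names are hypothetical and are not part of any database API.

```python
def gene_length(start_bp: int, end_bp: int) -> int:
    """Length in bp of a feature given 1-based inclusive coordinates (assumed)."""
    return end_bp - start_bp + 1


def protein_length(gene_len_bp: int) -> int:
    """Residue count for an unspliced CDS, assuming the final codon is a stop."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_len_bp // 3 - 1


# Values copied from the record above.
length_bp = gene_length(287571, 290726)
print(length_bp, protein_length(length_bp))
```

With the values from this record, the computed lengths match the listed 3156 bp and 1051 aa.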


Plasmid Coverage information

Num covering plasmid clones: 14
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: n/a
Fosmid unclonability p-value: n/a
Fosmid Hitchhiker: n/a
Fosmid clonability: n/a
 

Sequence

Gene sequence
ATGAAAAAAG TCATATCAGG TGCAGTCGTT ATTGCACTGC TTTTTGCAAT AATGATACCT 
GCGCGTTACT TTGGAGCTTT TGCCGAGGCG AGCCAGACCC TTTTTATTAA TGAGGTCATG
TCGTCAAACG TGTTCACAAT TCGGGACGGA GATGTTACAG ACCCTAAACA TGGAAGTAAG
GGCGGAGCAT ATTCGGACTG GATTGAAATC TATAATGCCG GCCCTTATGA TGTGGATTTG
ACAGGATACA TTCTTGCCGA CTCTTCGGCA GAGTGGGTAT TCCCGCAGGG GATTGTGCCG
GCAGGAGGGT ATCTTTTGGT TTGGGCGTCT GACAAGAATA TGGTGGCTCA AGACGGCCAG
TTGCACACAA ACTTCAAGCT AAGTGCATCC GGCGAGAACA TAACATTAAA AAAGCCTGAC
GGAACGATTG TTGATTCAGT TGATATTATC GGCCTTGGGG ATGACCAGAG CTATGGAAGA
AAAAGTGACG GTGCGTCAGA GTTTGTTGTT TTCATAAATC CAACCCCCGG TGCCGCAAAT
GTTTATAACG CTCCTGGTAC ATCCACCCTC TTTATCAATG AGGTTATGGC TTCCAATACA
CGTACTATAA GGGACGGAGA TGTGGATGAC CCTAAAGACG GAAGCAAGGG CGGAGCATAT
TCGGATTGGA TTGAAATTTA TAATGCCGGC CCTTATGATG TGGACTTGAC AGGATACATT
CTTGCCGACT CTTCGGCAGA GTGGGTATTC CCGCAGGGGA TTGTGCCGGC AGGAGGGTAC
CTCTTGGTTT GGGCATCTGA CAAGAACAAG GTTGCTACTG ACGGCGGGCT GCATACCAAT
TTCAAAATAA GCTCTTCCGG AGAAGCATTG ACCTTAAAGG ACCCTGACGG CAATGTTATT
GATATTTTGG TGACTATTAA TCTTTTGGAT GACCAAGCCT ATGGAAGAAA AACCGACGGT
TCCTCTGAAC TGGTGTTGTT AAAACCTACA CCCGGAACTG CAAATATTTA TGATCCAAGT
CTTATACCTG TTTCAGAACC TGTTTTTTCT CATCAGGGAG GTTTTTATAC CGGGGCGTTT
AAGCTGGAGC TTACCACAAA TGAGCCCGGA GTTAAGATTT ACTATACCAC GGACGGTTCT
GACCCTGTGC CGGGGAAATC GGGAACCATT GAGTATACTT CCGGCATTAA TATAAAGAGC
AGGAAAGGAG AAGCGAATGT ACTTTCCATG ATTCAGGATA TATCCAACGA CCAGTGGAAC
AGATGGAGAG CACCAAACGG GGAAGTTTTC AAATGTACCA CCATCAAGGC CGTTGCCATA
AGAGACGACG GGGCACGGAG TAAAGTAGTA ACCCACTCAT ATTTTGTAGA TCCTCAAATG
AATACCAGAT ATACACTGCC CGTAATTTCT ATTGTAACCG ACTATGACAA CTTCTTTGAC
AAGTCTACCG GGATTTATGT AAATGGGAAT TATGAGAACA GGGGAAAAGA GTGGGAAAGA
CCGGTACATA TCGAGTACTT TGAAACAGAC GGGAAACTTG GATTCTCCAT GGATATGGGA
CTTAGGATAC ATGGGGGATA CACAAGAAAG TATCCTCAGA AGTCTTTCCG TTTGTATGCT
GACCACAATA ATGATATTGG CGAGATCAAA TATGAGATTT TCCCGGGACT TAGAGGAACG
GGAACAGGTA AAAAAATAAA GAGTTTTGAA CGCCTGATTT TGAGGAATGC GGGAAATGAC
TGGACAGGTG CTCTTTTCAG GGATGAAATG ATGCAAAGTC TCGTTTCCCA CCTGAAAATA
GACACCCAGG CTTTCAGACC GTGTATTGTA TTCCTGAATG GAGAGTATTG GGGAATATAT
CATATTCGTG AACGTTATGA CGATAAATAT CTTAAATCAC ATTATGGCCT TGATGATGAC
AAAGTTGCAA TACTTGACGT TTACCAGACT CCCGAAGTCC AGGAAGGCGA TTCCTCGGAT
GTCCTGGCAT ATACAAATGA TGTAATAAAT TATTTAAAAA CCCATTCCAT AACTGAAAAA
AGCACATATG ATTATATTAA AACAAAGATC GACATAGAAA ACTATATCGA TTATTATGTA
GCTCAGATAT TCTTCGGAAA TACGGACTGG CCCGGAAACA ATGTAAGCAT ATGGAGATAC
AAGACGGATG ACGGCCAATA TCATCCGGAA GCTCCTTACG GTCAGGACGG AAGATGGAGG
TGGATGCTGA AAGATACTGA TTTTGGCTTT GGTTTGTACG GAAAAAGTCC GTCACACAAC
ACCCTTGCTT TTGCAGCCGG TGATATACGC GAAGGTCAAG CCAATGAGGA GTGGGCGGTA
TTTCTTTTCA AGACTCTTCT TAAAAATGAA GAGTTCAGGA ATGAGTTTAT AAACCGTTTT
GCCGACCAGT TGAATACTTC GTTTGTACCG TCAAGAGTGA TTTCAATTAT TGATGATATT
GTTGCAACTT TGGAACCGGA GATGAAAGAG CATACGGACC GTTGGCCGTT TATTAAATTG
ACAGCCACCA GCCCGTGGGA TACAACCTGG AGCCAGGAAG TAAACAGAAT CAGAAATTAT
GCCAACAGCC GTCCGTCATA TGTAAGGCAG CATATATTGA GCAAGTTCCG CAATAATGGT
GTAACGGGTA CTGCTCTTGT TACTTTGAAC ACTGATTCGA CCCGGGGACA CATAAGGATT
AATTCCATTG ACATAGTATC AGATACTCCG GCAGTTACAA ATCCTAATCG CTGGAGCGGT
ACTTACTTCA AGGGAGTTCC CATAACGCTA AAGGCGATAC CAAAAGAAGG CTATGTGTTT
GACCACTGGG AGGGTATAAA CGGATCCGTT GAGGCATCAT CGGATACGAT AACAGTCAAC
CTTTCGAATG ATTTGAACGT TACAGCGGTA TTCAGGCCTG AAAATGAAAC TCCCGATCCT
GAAATTTTGT ATGGTGACTA TAATGGGGAT GGAGCGGTTA ACTCCACAGA CTTGTTGGCA
TGTAAAAGGT ATCTGCTTTA TGCTTTGAAA CCGGAGCAGA TAAATGTTAT TGCAGGGGAT
CTTGACGGCA ATGGAAAGAT TAACTCGACT GACTATGCAT ATCTTAAGAG ATATTTGTTA
AAGCAGATAG ATAAGTTCCC GGTACAGTTG AAGTAA
 
Protein sequence
MKKVISGAVV IALLFAIMIP ARYFGAFAEA SQTLFINEVM SSNVFTIRDG DVTDPKHGSK 
GGAYSDWIEI YNAGPYDVDL TGYILADSSA EWVFPQGIVP AGGYLLVWAS DKNMVAQDGQ
LHTNFKLSAS GENITLKKPD GTIVDSVDII GLGDDQSYGR KSDGASEFVV FINPTPGAAN
VYNAPGTSTL FINEVMASNT RTIRDGDVDD PKDGSKGGAY SDWIEIYNAG PYDVDLTGYI
LADSSAEWVF PQGIVPAGGY LLVWASDKNK VATDGGLHTN FKISSSGEAL TLKDPDGNVI
DILVTINLLD DQAYGRKTDG SSELVLLKPT PGTANIYDPS LIPVSEPVFS HQGGFYTGAF
KLELTTNEPG VKIYYTTDGS DPVPGKSGTI EYTSGINIKS RKGEANVLSM IQDISNDQWN
RWRAPNGEVF KCTTIKAVAI RDDGARSKVV THSYFVDPQM NTRYTLPVIS IVTDYDNFFD
KSTGIYVNGN YENRGKEWER PVHIEYFETD GKLGFSMDMG LRIHGGYTRK YPQKSFRLYA
DHNNDIGEIK YEIFPGLRGT GTGKKIKSFE RLILRNAGND WTGALFRDEM MQSLVSHLKI
DTQAFRPCIV FLNGEYWGIY HIRERYDDKY LKSHYGLDDD KVAILDVYQT PEVQEGDSSD
VLAYTNDVIN YLKTHSITEK STYDYIKTKI DIENYIDYYV AQIFFGNTDW PGNNVSIWRY
KTDDGQYHPE APYGQDGRWR WMLKDTDFGF GLYGKSPSHN TLAFAAGDIR EGQANEEWAV
FLFKTLLKNE EFRNEFINRF ADQLNTSFVP SRVISIIDDI VATLEPEMKE HTDRWPFIKL
TATSPWDTTW SQEVNRIRNY ANSRPSYVRQ HILSKFRNNG VTGTALVTLN TDSTRGHIRI
NSIDIVSDTP AVTNPNRWSG TYFKGVPITL KAIPKEGYVF DHWEGINGSV EASSDTITVN
LSNDLNVTAV FRPENETPDP EILYGDYNGD GAVNSTDLLA CKRYLLYALK PEQINVIAGD
LDGNGKINST DYAYLKRYLL KQIDKFPVQL K
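
Both sequences are printed in blocks of 10 characters separated by spaces, so the spacing has to be removed before any downstream use. The sketch below joins the blocks and translates the nucleotide sequence, as one possible way to compare it against the listed protein. It assumes that translation table 11 assigns the same amino acids to codons as the standard genetic code and differs only in the permitted start codons. The helper names `join_blocks` and `translate` are hypothetical.

```python
from itertools import product

# Standard codon assignments in TCAG order; table 11 is assumed to share them.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def join_blocks(text: str) -> str:
    """Remove the block spacing and line breaks from a printed sequence."""
    return "".join(text.split()).upper()


def translate(dna: str) -> str:
    """Translate codon by codon, stopping at the first stop codon."""
    residues = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        residues.append(aa)
    return "".join(residues)


# First line of the gene sequence and first two protein blocks from this record.
first_line = "ATGAAAAAAG TCATATCAGG TGCAGTCGTT ATTGCACTGC TTTTTGCAAT AATGATACCT"
protein_start = join_blocks("MKKVISGAVV IALLFAIMIP")
print(translate(join_blocks(first_line)) == protein_start)
```

Applying the same two helpers to the complete gene sequence would allow the full translation to be compared with the protein sequence listed above.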