Gene Cthe_3079 details

Gene Information

Locus tag: Cthe_3079
Symbol:
ID: 4809953
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Clostridium thermocellum ATCC 27405
Kingdom: Bacteria
Replicon accession: NC_009012
Strand:
Start bp: 3632842
End bp: 3634908
Gene Length: 2067 bp
Protein Length: 688 aa
Translation table: 11
GC content: 42%
IMG OID: 640108503
Product: cellulosome anchoring protein, cohesin region
Protein accession: YP_001039468
Protein GI: 125975558
COG category:
COG ID:
TIGRFAM ID:
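
The coordinates and lengths listed above are mutually consistent if one assumes 1-based inclusive coordinates and a protein length that excludes the terminal stop codon. Both conventions are assumptions here, since the record does not state them. A minimal sketch of that check follows; the function names are illustrative and not part of any database API.

```python
def gene_length(start_bp: int, end_bp: int) -> int:
    """Feature length from 1-based inclusive coordinates (assumed convention)."""
    return end_bp - start_bp + 1


def protein_length(gene_len_bp: int) -> int:
    """Residues encoded by an unspliced CDS, assuming one terminal stop codon."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_len_bp // 3 - 1


# Values taken from the record: Start bp 3632842, End bp 3634908.
length = gene_length(3632842, 3634908)
print(length)                  # 2067
print(protein_length(length))  # 688
```

Both results match the Gene Length and Protein Length fields of the record.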


Plasmid Coverage information

Num covering plasmid clones: 27
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: n/a
Fosmid unclonability p-value: n/a
Fosmid Hitchhiker: n/a
Fosmid clonability: n/a
 

Sequence

Gene sequence
ATGAAAAAAA ACAATGTATT AACAATAGCA GCTATGATAG CGCTTCTTCT AACCAGCTTA 
CTTACAAGTA TAACTTTTGG GGAGACTTCG AGTATACCTT CAAGAATATC TATGGAGCTT
GACAAGACAA AAGCAAACAT AGGCGACATA ATTATAGCCA CAATAAGAAT TGACAATATC
AATAACTTTA GCGGATATCA ATTAAATATA AAGTATGATC CGTCATACCT CCAGGCAGTT
AATCCTTTGA CAGGAGAACC GATAAAAAAG AGAACAATGC CGGCAGTGAA CGGCACGGTG
TTGTTAAAGG GAGATCAGTA CAGTATTACT GAGGTTGTAG AAAATAACGT CGATGAAGGG
ATTTTAAATT TTGGCAAGGG ATATGCAAAT TTAACTGAAT ACAGGAAAAG CGGAAAACCT
GAAACAACCG GAATTATTGG CAAGATAGGA TTTAAAGCCT TAAAGCTTGG CAAGACGGAG
ATCAAATTTG AGAACACACC CGTCATGCCT GGGGCAAAAG AAGGAACACT GCTGTTTGAC
TGGGATGCAG AAACTATAAC GGAATATAAT GTAATTCAGC CTAAAGAACT TGCAATAACG
TTACCGGACG ATGCACACAT TGCTTTGGAA CTTGACAAGA CAAAAGTGAA AGTGGGAGAT
GTAATTGTTG CGACAGTAAA AGCAAAGAAT ATGACTAGTA TGGCGGGAAT TCAGGTAAAT
ATTAAATATG ACCCTGAAGT ATTGCAGGCG ATTGATCCTG CGACGGGAAA ACCGTTTACA
AAAGAAACAT TACTTGTGGA CCCGGAACTG TTATCAAACA GAGAATATAA TCCGTTGTTA
ACAGCAGTTA ATGACATAAA TTCCGGCATT ATAAATTATG CATCTTGTTA TGTATATTGG
GATTCCTACA GAGAATCAGG AGTATCTGAA AGCACCGGAA TAATTGGAAA GGTTGGCTTT
AAAGTGCTGA AAGCTGCCAA CACCACAGTA AAACTGGAAG AAACAAGATT TACACCAAAT
TCGATAGACG GTACTTTGGT AATTGATTGG TATGGCCAAC AGATAGTTGG TTATAAAGTA
ATACAGCCCG ACAAAATTAC TGTGATTTCA GAGCCTGAGG TACCAACACA AACACCTACA
CAGACACCGC CAACAACAAC AGCACCATCG CAAACACCTA CGCAGACACC GCCAACAACA
ACAGCACCAT CACAGACACC TACACAGACA CCGGCAGTAA CGCCGACGCA AAGTGCAACT
CCGTCGGATC CTGGCGGAGG TGGAGGAGGC CTCCCGGGTG GTGGAGGCGG CGCTGTTAAT
CCTTCAGCTT CACCGACACC AACACCGACA TCCAAACCTA CTCCTACTGC CACTAAAAAA
CCGGAGCCAA CGGAAATAGA AGAACCCGAA CCTGAAATAC CGGGCACTGT TGGAATACAT
TATTCATACC TGACAGGTTA TCCGGACAAA ATGTTCAGAC CTGAAAAGAG TATTACAAGA
GCTGAAGCAG CCGTGATTTT TGCAAAACTT TTGGGAGCAA ACGAAAATAC AAAGATAAAC
TATAATGTTT CATACACCGA TGTTGACAGC TCCCATTGGG CAAGTTGGGC AATCAAATTT
GTATCATACA AGAAACTGTT TACCGGATAT CCTGATGGCT CGTTCAAGCC TAATCAGAAT
ATAACGAGAG CCGAATTTTC AACGGTTGTG TTTAAGCTTC TTGTATCTGA GAAAGGTCTA
AAAGAAGAAA AGATTGAAAA GTCCAAGTTT GGTGATACAA AGGGCCACTG GGCACAACAG
TTTATTGAAC AGCTGTCAGA CCTTGGATAC ATCAACGGAT ATCCTGATGG TACATTCAAG
CCCAACAACA ATATCAAACG ATCAGAAAGT GTTGCCCTGA TAAACAGAGC TATGGGAAGA
GGGCCTTTGC ATGGCGCACC GCAGGTATTC GAGGATGTTC CTCAGACACA CTGGGCTTTC
AAAGATATTG CAGAGGGCGT GCTCAATCAC AGATACAAAC TGGACAATGA GGGCAAAGAA
CAATTGCTGG AGATAATTGA TAACTAA
 
Protein sequence
MKKNNVLTIA AMIALLLTSL LTSITFGETS SIPSRISMEL DKTKANIGDI IIATIRIDNI 
NNFSGYQLNI KYDPSYLQAV NPLTGEPIKK RTMPAVNGTV LLKGDQYSIT EVVENNVDEG
ILNFGKGYAN LTEYRKSGKP ETTGIIGKIG FKALKLGKTE IKFENTPVMP GAKEGTLLFD
WDAETITEYN VIQPKELAIT LPDDAHIALE LDKTKVKVGD VIVATVKAKN MTSMAGIQVN
IKYDPEVLQA IDPATGKPFT KETLLVDPEL LSNREYNPLL TAVNDINSGI INYASCYVYW
DSYRESGVSE STGIIGKVGF KVLKAANTTV KLEETRFTPN SIDGTLVIDW YGQQIVGYKV
IQPDKITVIS EPEVPTQTPT QTPPTTTAPS QTPTQTPPTT TAPSQTPTQT PAVTPTQSAT
PSDPGGGGGG LPGGGGGAVN PSASPTPTPT SKPTPTATKK PEPTEIEEPE PEIPGTVGIH
YSYLTGYPDK MFRPEKSITR AEAAVIFAKL LGANENTKIN YNVSYTDVDS SHWASWAIKF
VSYKKLFTGY PDGSFKPNQN ITRAEFSTVV FKLLVSEKGL KEEKIEKSKF GDTKGHWAQQ
FIEQLSDLGY INGYPDGTFK PNNNIKRSES VALINRAMGR GPLHGAPQVF EDVPQTHWAF
KDIAEGVLNH RYKLDNEGKE QLLEIIDN
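
The sequences above are printed in space-separated blocks of 10 characters, so they need to be joined before any length or composition check. The sketch below shows one way to do that and to compute a GC percentage that can be compared with the GC content field. The helper names `clean_sequence` and `gc_percent` are hypothetical, and only the first line of the gene sequence is used as sample input.

```python
def clean_sequence(blocked: str) -> str:
    """Join a sequence printed in whitespace-separated blocks into one string."""
    return "".join(blocked.split()).upper()


def gc_percent(dna: str) -> float:
    """Percentage of G and C bases in a DNA string (whitespace is ignored)."""
    seq = clean_sequence(dna)
    if not seq:
        raise ValueError("empty sequence")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / len(seq)


# First line of the gene sequence as printed in the record.
first_line = "ATGAAAAAAA ACAATGTATT AACAATAGCA GCTATGATAG CGCTTCTTCT AACCAGCTTA"
print(len(clean_sequence(first_line)))  # 60
print(gc_percent("ATGC"))               # 50.0
```

Passing the full gene sequence to `clean_sequence` should give a string whose length equals the Gene Length field, and `gc_percent` on the same input can be compared with the reported GC content.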