Gene Cthe_2801 details

Gene Information

Locus tag: Cthe_2801
Symbol:
ID: 4810118
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Clostridium thermocellum ATCC 27405
Kingdom: Bacteria
Replicon accession: NC_009012
Strand:
Start bp: 3301150
End bp: 3303162
Gene Length: 2013 bp
Protein Length: 670 aa
Translation table: 11
GC content: 47%
IMG OID: 640108221
Product: carbon-monoxide dehydrogenase, catalytic subunit
Protein accession: YP_001039193
Protein GI: 125975283
COG category: [C] Energy production and conversion
COG ID: [COG1151] 6Fe-6S prismane cluster-containing protein
TIGRFAM ID: [TIGR01702] carbon-monoxide dehydrogenase, catalytic subunit
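
The coordinate and length fields above are mutually consistent, which can be checked with a short sketch. It assumes the start and end positions are 1-based and inclusive, and that the final codon of the CDS is a stop codon that is not counted in the protein length. The function names are illustrative and not part of the record.

```python
# Values copied from the record above; helper names are illustrative.
START_BP = 3301150
END_BP = 3303162


def gene_length(start_bp, end_bp):
    """Feature length, assuming 1-based inclusive coordinates."""
    return end_bp - start_bp + 1


def protein_length(gene_len_bp):
    """Residue count for an unspliced CDS whose last codon is a stop."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_len_bp // 3 - 1


print(gene_length(START_BP, END_BP))                  # 2013
print(protein_length(gene_length(START_BP, END_BP)))  # 670
```

Both results match the Gene Length and Protein Length fields.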


Plasmid Coverage information

Num covering plasmid clones:
Plasmid unclonability p-value: 0.00000137875
Plasmid hitchhiking: No
Plasmid clonability: unclonable
 

Fosmid Coverage information

Num covering fosmid clones: n/a
Fosmid unclonability p-value: n/a
Fosmid Hitchhiker: n/a
Fosmid clonability: n/a
 

Sequence

Gene sequence
ATGAATTTTT ATAAAGATAA GGAAATAAGA TACCACCACA GGCATTTTCA TAACGAAGGT 
GGGCACCACC ATCATCAGGA TTCCTTTGAT GATTACAACA ATGCGGTTAA TGAATACAAG
AAAAGCTTTC CGTCAAAAGC AAATGTTATA GAGAACACTC CGGACCCTGC GGTAAGAAAA
ATGCTTGTAC ATATGGAAAA GCAGGGCTGT GAGACATGTT TTGACCGCTT TGACAGTCAG
AAACCCCACT GTAATTTCGG GCTTGCGGGT GTCTGCTGCA AAAACTGCAA CATGGGGCCG
TGCAGAATAA CGAAGAAAAG CCCCAGAGGA GTGTGCGGAG CGGATGCCGA CCTCATTGTT
GCAAGAAATC TTCTAAGGTG GGTGGCGGCA GGTGTTGCAG CCCATGGAGC AAGGGGCCGC
GAAATAATGC TGGCACTGAA AGCGGCCGGG GAAGGAATAC TTGACATGCC TGTTGCAGGA
GAAGCAAAGC TTAGAAAATC CGCCGCCCAA CTTGGCATAT CCACCGAGGG CAAGACCAGG
GAAGAGTTGG CGGTGGAAGT TGCAGACATT CTTCTTGAGG ATTTGTCAAG GACGGTTCCG
GGAGAGCACA AGACATTAAA TGCTTTTGCA ACCAAAGAAA GAATTGAAAA GTGGCGCGAG
CTAGACATTC TTCCCATAGG GGCCTATCAT GAAGTGTTTG AAGCCCTCCA TCGGACCAGT
ACGGGAACGG ACGGAGATTG GAAAAACATT ATGAAGCAAT TTTTAAGATG CGGGCTGGCT
TTTGCCTGGA GCAGCGTTTT AGGCTCTTCA ATAGCCATGG ACAGTTTGTT TGGTTTGCCC
GTAAGAAGCA CCGTTAAAGC AAATTTGGGT GCCCTTAGGG AAGGTTATGT TAATATTGCC
GTTCACGGTC ATTCCCCTCT TTTGGTCAGT GAAATAGTAA AGCAGGGAAG AAGCCGGGAA
TTTATACAAA TGGCAAAAGA AAAAGGAGCC TTGGGAATAC AGTTCTATGG AATATGCTGC
TCGGGACTTT CGGCAATGTA CCGTTATGGG GGAGTTATTC CTCTTTCCAA CGCAATTGGT
GCGGAGCTGG TTCTTGGCAC CGGGGCCATT GATTTGTGGG TGGCGGATGT CCAGGATGTA
TTCCCGTCAA TAATGGATGT TGCTAAATGC TTTAAAACCA CGGTTGTTAC AACCAGCGAC
TCTGCAAGAC TTCCCGGAGC GGAGCATTAC GCCTATGACC ATCACCATTC AAACCTGGCC
CAGACGCAGG AATTGGCAAA AACCATAGTT AAGAGGGCTA TTGAAAGCTT TGAGGCAAGA
AGGGACGTTC CGGTCTTTAT TCCAAATTAT GAGGTGGATG CGGAGATCGG TTTTTCCGTA
GAGTATGCCA CAAGCCGTTT TGGAAGCATG GATGTGATTG CGAAGGCTCT GCAGGAAGGC
AAAATCCGCG GTGTTGTAAA CCTTGTGGGC TGCAACAATC CGAGAGTTAT GTATGAAAAA
GCAATAGCAG ATGTGGCAAG AAAGCTTATT GAAAACAACA TTCTTGTGCT TACCAACGGT
TGTGCGTCCT TTCCCCTTTT GAAGCTTGGC TATTGCAATG TTAAAGCATT GGAATGGACA
GGTAAGGAGC TTAGGGAATT TTTGGAGCCG GATTTGCCTC CGGTGTGGCA TATGGGCGAA
TGTCTTGACA ATGCAAGGGC ATCAGCCTTT TTCAGAGCAT TGGCGGACAG CCTGAAGAAA
GATATAAAAG ACATGCCTTT TGCGTTTGCA AGTCCCGAAT GGTCCAATGA AAAGGGTGTC
GGGGCGGCCC TTGGATTCAG GCTTTTAGGT ATAAACTCCT ATCATTCGGT TTATCCGCCT
GTTCAAGGTT CTAAAAATGT AATGAAATAT CTGTTTGAAG ATACGGAAAA AACCCTGGGA
GCTGTCATGA TAGTGGAAGT GGATCCGCTG AAGCTCGCAG ACAGAATAAT TTCAGACATC
GATGAAAAGA GAAAGGCTTT GATGTGGAAA TGA
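
The gene sequence is printed in blocks of 10 bases, 60 per line, so the spaces and line breaks have to be removed before any computation. The sketch below shows one way to do that and to recompute a GC percentage. The record does not say how its GC content figure was computed or rounded, so the function names and the plain (G + C) / length definition used here are assumptions.

```python
def clean_sequence(text):
    """Remove the spaces and line breaks used for display and upper-case the bases."""
    return "".join(text.split()).upper()


def gc_percent(seq):
    """Percentage of G and C bases in an already cleaned sequence."""
    if not seq:
        raise ValueError("empty sequence")
    gc_count = sum(1 for base in seq if base in "GC")
    return 100.0 * gc_count / len(seq)


# First display line of the gene sequence above.
first_line = "ATGAATTTTT ATAAAGATAA GGAAATAAGA TACCACCACA GGCATTTTCA TAACGAAGGT"
seq = clean_sequence(first_line)
print(len(seq))  # 60
print(gc_percent(seq))
```

Applying `clean_sequence` to the full gene sequence and passing the result to `gc_percent` gives a value that can be compared with the GC content field.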
 
Protein sequence
MNFYKDKEIR YHHRHFHNEG GHHHHQDSFD DYNNAVNEYK KSFPSKANVI ENTPDPAVRK 
MLVHMEKQGC ETCFDRFDSQ KPHCNFGLAG VCCKNCNMGP CRITKKSPRG VCGADADLIV
ARNLLRWVAA GVAAHGARGR EIMLALKAAG EGILDMPVAG EAKLRKSAAQ LGISTEGKTR
EELAVEVADI LLEDLSRTVP GEHKTLNAFA TKERIEKWRE LDILPIGAYH EVFEALHRTS
TGTDGDWKNI MKQFLRCGLA FAWSSVLGSS IAMDSLFGLP VRSTVKANLG ALREGYVNIA
VHGHSPLLVS EIVKQGRSRE FIQMAKEKGA LGIQFYGICC SGLSAMYRYG GVIPLSNAIG
AELVLGTGAI DLWVADVQDV FPSIMDVAKC FKTTVVTTSD SARLPGAEHY AYDHHHSNLA
QTQELAKTIV KRAIESFEAR RDVPVFIPNY EVDAEIGFSV EYATSRFGSM DVIAKALQEG
KIRGVVNLVG CNNPRVMYEK AIADVARKLI ENNILVLTNG CASFPLLKLG YCNVKALEWT
GKELREFLEP DLPPVWHMGE CLDNARASAF FRALADSLKK DIKDMPFAFA SPEWSNEKGV
GAALGFRLLG INSYHSVYPP VQGSKNVMKY LFEDTEKTLG AVMIVEVDPL KLADRIISDI
DEKRKALMWK
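
The protein sequence corresponds to the gene sequence read codon by codon. Below is a minimal sketch of that translation, assuming translation table 11 assigns amino acids to codons in the same way as the standard code (the tables differ in which codons may act as start codons, which does not matter here because the gene begins with ATG). The names `CODON_TABLE` and `translate` are illustrative.

```python
BASES = "TCAG"
# Amino acids for the 64 codons in TCAG order (first base varies slowest).
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA string in frame 0, stopping at the first stop codon.

    Display whitespace is removed first. Bases other than A, C, G, T
    raise KeyError in this sketch.
    """
    seq = "".join(dna.split()).upper()
    protein = []
    for pos in range(0, len(seq) - 2, 3):
        amino_acid = CODON_TABLE[seq[pos:pos + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# First 30 bases and last 33 bases of the gene sequence above.
print(translate("ATGAATTTTT ATAAAGATAA GGAAATAAGA"))       # MNFYKDKEIR
print(translate("GATGAAAAGA GAAAGGCTTT GATGTGGAAA TGA"))   # DEKRKALMWK
```

The two outputs match the first and last ten residues of the protein sequence. The final 33 bases are in frame because the preceding 1980 bases are a multiple of three.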