Gene Cthe_3003 details


Gene Information

Locus tag: Cthe_3003
Symbol:
ID: 4811151
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Clostridium thermocellum ATCC 27405
Kingdom: Bacteria
Replicon accession: NC_009012
Strand:
Start bp: 3525027
End bp: 3526961
Gene Length: 1935 bp
Protein Length: 644 aa
Translation table: 11
GC content: 46%
IMG OID: 640108424
Product: hydrogenase, Fe-only
Protein accession: YP_001039392
Protein GI: 125975482
COG category: [R] General function prediction only
COG ID: [COG4624] Iron only hydrogenase large subunit, C-terminal domain
TIGRFAM ID: [TIGR02512] hydrogenases, Fe-only
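
The coordinate and length fields above can be cross-checked with simple arithmetic. The following is a minimal sketch, assuming the start and end positions are 1-based and inclusive and that the CDS ends with a single stop codon; the function and variable names are illustrative and not part of the record.

```python
# Values copied from the record above; names are illustrative assumptions.
start_bp = 3525027
end_bp = 3526961
gene_length_bp = 1935
protein_length_aa = 644


def span_length(start: int, end: int) -> int:
    """Length of a coordinate interval, assuming 1-based inclusive positions."""
    return end - start + 1


def expected_protein_length(cds_length_bp: int) -> int:
    """Residues encoded by an unspliced CDS whose final codon is a stop codon."""
    if cds_length_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return cds_length_bp // 3 - 1


print(span_length(start_bp, end_bp))            # 1935
print(expected_protein_length(gene_length_bp))  # 644
```

Under these assumptions, both results agree with the Gene Length and Protein Length fields: 1935 bp is 645 codons, of which the last is the stop codon.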


Plasmid Coverage information

Num covering plasmid clones: 14
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: n/a
Fosmid unclonability p-value: n/a
Fosmid Hitchhiker: n/a
Fosmid clonability: n/a
 

Sequence

Gene sequence
ATGGATAGTT TCCTGATGAA AGGTTATATC AAAGAGGCAA ACATAGATTA CAGCTGCAGC 
CGTGGTTCCA TGGAAGACTT ACCAAAGTGG GAATTCCGCG AAATACCGAA AGTTCCTCGT
GCGGTAATGC CTTCCCTTTC TCTGGAAGAG CGTAAAAACA ATTTTAACGA GGTTGAATTG
GGTCTTTCTG AAGAAGTTGC AAGAAAAGAA GCACGCCGCT GTTTAAAATG CGGATGCAGC
GCCCGTTTCA CCTGCGATTT GAGAAAAGAA GCAAGCAATC ATGGCATTGT ATATGAAGAA
CCAATTCACG ACCGCCCATA TATCCCGAAA GTTGATGACC ATCCTTTCAT CGTAAGGGAT
CACAACAAGT GTATCTCCTG CGGCCGTTGT ATTGCCGCAT GTGCTGAAAT TGAAGGCCCG
GGCGTTCTCA CCTTCTATAT GAAAAACGGT CGCCAGCTTG TAGGTACAAA AAGCGGTCTT
CCTCTCAGGG ACACAGACTG CGTAAGCTGC GGTCAGTGTG TTACTGCATG TCCTTGCGCC
GCTCTCGATT ATCGCCGTGA GAGAGGAAAG GTGGTAAGGG CAATCAACGA TCCCAAAAAG
ACTGTTGTCG GATTTGTCGC ACCGGCAGTT CGCAGTCTTA TATCCAACAC CTTCGGTGTC
TCATATGAGG AAGCATCTCC GTTTATGGCA GGCCTGCTTA AAAAGCTTGG CTTTGACAAG
GTGTTTGACT TTACCTTTGC CGCAGACCTG ACTATTGTGG AAGAAACCAC GGAGTTCCTC
TCAAGGATAC AAAACAAGGG TGTAATGCCT CAATTTACAT CCTGTTGTCC GGGATGGATC
AATTTCGTTG AAAAGAGATA CCCCGAAATA ATCCCTCACC TGTCCACCTG CAAATCACCG
CAGATGATGA TGGGTGCAAC AGTTAAAAAT CACTATGCAA AACTTATGGG AATAAATAAA
GAGGATCTTT TCGTGGTATC CATAGTTCCA TGTCTTGCGA AGAAATATGA AGCTGCCCGT
CCGGAGTTTA TCCACGACGG CATCCGCGAT GTGGACGCAG TTCTTACAAC TACGGAAATG
CTTGAAATGA TGGAACTTGC CGATATCAAA CCTTCGGAAG TGGTTCCTCA GGAATTTGAC
GAGCCTTACA AACAGGTTTC CGGTGCCGGA ATACTGTTTG GTGCTTCCGG CGGTGTGGCT
GAAGCCGCCC TTCGTATGGC CGTTGAGAAA CTTACAGGAA AAGTGCTCAC CGACCACCTT
GAATTTGAAG AAATTCGCGG CTTTGAAGGT GTGAAAGAAT CCACCATAGA CGTAAACGGC
ACAAAAGTGC GCGTGGCAGT TGTCAGCGGC CTTAAGAATG CTGAACCTAT CATTGAAAAA
ATACTCAACG GGGTTGACGT GGGATACGAC CTCATAGAAG TAATGGCATG CCCCGGTGGA
TGTATCTGCG GAGCCGGACA TCCGGTCCCC GAAAAAATAG ATTCTCTTGA AAAGCGCCAG
CAAGTTCTTG TAAATATAGA CAAAGTTTCG AAATACAGAA AATCCCAGGA GAATCCGGAT
ATCTTAAGGC TGTACAATGA ATTTTACGGT GAACCGAATT CTCCTCTGGC TCACGAACTT
TTGCACACAC ATTATACTCC AAAGCACGGT GACAGTACCT GCAGCCCTGA ACGTAAAAAA
GGAACGGCAG CATTTGATGT GCAAGAGTTT ACAATCTGCA TGTGCGAATC CTGCATGGAG
AAGGGCGCTG AAAATCTCTA CAACGATTTA AGCTCAAAAA TTAGACTGTT CAAAATGGAT
CCGTTTGTGC AAATAAAGAG AATTCGATTA AAAGAAACCC ATCCGGGCAA AGGCGTGTAT
ATCGCCCTTA ACGGAAAACA AATTGAGGAG CCTATGCTCA GCGGTAATAT CCCGGACGAA
TCAGAATCAG AATAA
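
The gene sequence is laid out in space-separated blocks of 10 bases, so it needs to be cleaned before its length or GC content can be computed. The following is a minimal sketch of that step; `clean_sequence` and `gc_percent` are hypothetical helper names, and only the first line of the sequence is used here to keep the example short.

```python
def clean_sequence(text: str) -> str:
    """Remove spaces and line breaks from a block-formatted sequence."""
    return "".join(text.split()).upper()


def gc_percent(text: str) -> float:
    """Percentage of G and C bases in a (possibly block-formatted) sequence."""
    seq = clean_sequence(text)
    if not seq:
        raise ValueError("empty sequence")
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


# First line of the gene sequence above, copied verbatim.
first_line = "ATGGATAGTT TCCTGATGAA AGGTTATATC AAAGAGGCAA ACATAGATTA CAGCTGCAGC"

print(len(clean_sequence(first_line)))  # 60
print(round(gc_percent(first_line), 1))
```

Pasting the full sequence into `clean_sequence` and `gc_percent` in the same way allows the Gene Length and GC content fields of the record to be checked.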
 
Protein sequence
MDSFLMKGYI KEANIDYSCS RGSMEDLPKW EFREIPKVPR AVMPSLSLEE RKNNFNEVEL 
GLSEEVARKE ARRCLKCGCS ARFTCDLRKE ASNHGIVYEE PIHDRPYIPK VDDHPFIVRD
HNKCISCGRC IAACAEIEGP GVLTFYMKNG RQLVGTKSGL PLRDTDCVSC GQCVTACPCA
ALDYRRERGK VVRAINDPKK TVVGFVAPAV RSLISNTFGV SYEEASPFMA GLLKKLGFDK
VFDFTFAADL TIVEETTEFL SRIQNKGVMP QFTSCCPGWI NFVEKRYPEI IPHLSTCKSP
QMMMGATVKN HYAKLMGINK EDLFVVSIVP CLAKKYEAAR PEFIHDGIRD VDAVLTTTEM
LEMMELADIK PSEVVPQEFD EPYKQVSGAG ILFGASGGVA EAALRMAVEK LTGKVLTDHL
EFEEIRGFEG VKESTIDVNG TKVRVAVVSG LKNAEPIIEK ILNGVDVGYD LIEVMACPGG
CICGAGHPVP EKIDSLEKRQ QVLVNIDKVS KYRKSQENPD ILRLYNEFYG EPNSPLAHEL
LHTHYTPKHG DSTCSPERKK GTAAFDVQEF TICMCESCME KGAENLYNDL SSKIRLFKMD
PFVQIKRIRL KETHPGKGVY IALNGKQIEE PMLSGNIPDE SESE
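
The protein sequence corresponds to the gene sequence read codon by codon. The following is a minimal sketch of that translation, assuming the codon-to-amino-acid assignments of translation table 11, which are the same as those of the standard code (table 11 differs in the start codons it permits, which does not matter here because this gene starts with ATG). The `translate` helper is illustrative only.

```python
from itertools import product

# Codon table built in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(text: str) -> str:
    """Translate a DNA sequence in frame 1, stopping at the first stop codon."""
    seq = "".join(text.split()).upper()
    protein = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")  # "X" marks an unrecognized codon
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 30 and last 15 bases of the gene sequence above.
print(translate("ATGGATAGTT TCCTGATGAA AGGTTATATC"))  # MDSFLMKGYI
print(translate("TCAGAATCAG AATAA"))                  # SESE
```

The two outputs match the first block and the final four residues of the protein sequence listed above.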