Gene Cthe_3020 details

Gene Information

Locus tag: Cthe_3020
Symbol:
ID: 4811168
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Clostridium thermocellum ATCC 27405
Kingdom: Bacteria
Replicon accession: NC_009012
Strand:
Start bp: 3543236
End bp: 3544315
Gene Length: 1080 bp
Protein Length: 359 aa
Translation table: 11
GC content: 44%
IMG OID: 640108441
Product: ech hydrogenase subunit E
Protein accession: YP_001039409
Protein GI: 125975499
COG category: [C] Energy production and conversion
COG ID: [COG3261] Ni,Fe-hydrogenase III large subunit
TIGRFAM ID:
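
The gene and protein lengths listed above follow from the start and end coordinates. The sketch below shows that arithmetic. It assumes the coordinates are 1-based and inclusive, and that the final codon of this unspliced CDS is a stop codon. The function names are illustrative and not part of any database API.

```python
def gene_length_bp(start_bp: int, end_bp: int) -> int:
    """Feature length, assuming 1-based inclusive start/end coordinates."""
    return end_bp - start_bp + 1


def protein_length_aa(gene_len_bp: int) -> int:
    """Residue count for an unspliced CDS whose last codon is a stop codon."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_len_bp // 3 - 1  # drop the terminal stop codon


length = gene_length_bp(3543236, 3544315)
print(length, protein_length_aa(length))  # prints: 1080 359
```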


Plasmid Coverage information

Num covering plasmid clones: 24
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: n/a
Fosmid unclonability p-value: n/a
Fosmid Hitchhiker: n/a
Fosmid clonability: n/a
 

Sequence

Gene sequence
ATGGGTAAGA AAACAGTAAT CCCCTTCGGC CCTCAACATC CGGTTTTACC GGAGCCCATA 
CATTTAGATC TTGTGCTTGA GGATGAAACA GTGGTAGAGG CAATACCTTC TATTGGATAT
ATACACCGCG GTCTGGAAAA ACTTGTGGAA AAAAAGGACT ATCAGCAGTT TGTTTATGTA
GCGGAAAGAA TTTGCGGCAT TTGCTCTTTC ATGCACGGCA TGGGTTACTG CATGTCGATT
GAAAACATAA TGGGAGTGCA AATTCCTGAA AGAGCAGAGT TTTTAAGAAC CATCTGGGCA
GAGCTGTCAC GCATACACAG CCACATGCTT TGGTTGGGGC TTTTAGCCGA CGCCCTTGGA
TTTGAAAGCC TGTTTATGCA TTCTTGGAGG CTAAGAGAGC AGATTCTTGA CATATTCGAA
GAAACCACCG GAGGAAGAGT AATATTCTCC GTCTGCGATA TTGGCGGTGT AAGAAGAGAT
ATAGATTCTG AAATGCTGAA AAAAATAAAC TCAATATTGG ATGGTTTTGA AAAAGAATTT
TCAGAAATCA CAAAAGTATT TTTGAATGAT TCTTCCGTAA AACTTCGTAC CCAAGGCCTT
GGTGTGCTTT CCCGTGAAGA GGCTTTTGAA CTGGGAGCAG TCGGGCCTAT GGCGAGAGCC
AGCGGTATCG ATATTGACAT GAGAAAAAGC GGCTATGCCG CATACGGAAA ATTAAAGATA
GAACCCGTTG TTGAAACCGC CGGAGATTGC TATGCCAGAA CATCGGTAAG AATCAGAGAA
GTTTTTCAAT CCATTGACCT GATTCGCCAG TGCATATCCC TCATTCCTGA CGGTGAAATC
AAGGTAAAGA TTGTGGGAAA TCCAAGCGGT GAATACTTTA CCCGCCTGGA GCAGCCCCGC
GGAGAAGTTT TATATTATGT AAAGGCAAAC GGAACAAAGT TTCTGGAAAG ATTCAGGGTT
CGTACTCCAA CCTTTGCAAA TATTCCGGCT CTGCTTCACA CGCTGAAAGG ATGTCAGCTT
GCAGACGTCC CGGTATTGAT TCTGACCATT GACCCTTGCA TAAGCTGTAC CGAAAGATAA
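
A GC content figure such as the one in the gene information is conventionally the percentage of G and C bases in the nucleotide sequence. The sketch below computes it for a sequence pasted in the 10-base blocks used above. How the source database rounds the value is not stated, so any rounding is an assumption, and `gc_percent` is an illustrative name.

```python
def gc_percent(dna: str) -> float:
    """Percentage of G and C bases; whitespace between blocks is ignored."""
    seq = "".join(dna.split()).upper()
    if not seq:
        raise ValueError("empty sequence")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / len(seq)


# First 20 bases of the gene sequence, kept in the blocked layout.
print(round(gc_percent("ATGGGTAAGA AAACAGTAAT")))  # prints: 30
```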
 
Protein sequence
MGKKTVIPFG PQHPVLPEPI HLDLVLEDET VVEAIPSIGY IHRGLEKLVE KKDYQQFVYV 
AERICGICSF MHGMGYCMSI ENIMGVQIPE RAEFLRTIWA ELSRIHSHML WLGLLADALG
FESLFMHSWR LREQILDIFE ETTGGRVIFS VCDIGGVRRD IDSEMLKKIN SILDGFEKEF
SEITKVFLND SSVKLRTQGL GVLSREEAFE LGAVGPMARA SGIDIDMRKS GYAAYGKLKI
EPVVETAGDC YARTSVRIRE VFQSIDLIRQ CISLIPDGEI KVKIVGNPSG EYFTRLEQPR
GEVLYYVKAN GTKFLERFRV RTPTFANIPA LLHTLKGCQL ADVPVLILTI DPCISCTER
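
The protein sequence corresponds to the gene sequence translated with translation table 11. A minimal sketch of that translation follows. It builds the codon table from the NCBI amino-acid string for table 11 and stops at the first stop codon. It does not handle the alternative start codons that table 11 permits, which is enough here because this gene begins with ATG. `translate` is an illustrative helper, not a library function.

```python
from itertools import product

# NCBI codon ordering: bases T, C, A, G, with the first base varying slowest.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a CDS up to (not including) the first stop codon."""
    seq = "".join(dna.split()).upper()  # remove the spaces between blocks
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")  # X for unrecognized codons
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 30 bases of the gene sequence give the first 10 residues.
print(translate("ATGGGTAAGA AAACAGTAAT CCCCTTCGGC"))  # prints: MGKKTVIPFG
```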