Gene Cthe_3056 details


Gene Information

Locus tag: Cthe_3056
Symbol:
ID: 4811128
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Clostridium thermocellum ATCC 27405
Kingdom: Bacteria
Replicon accession: NC_009012
Strand:
Start bp: 3591687
End bp: 3593117
Gene Length: 1431 bp
Protein Length: 476 aa
Translation table: 11
GC content: 33%
IMG OID: 640108477
Product: transposase, IS204/IS1001/IS1096/IS1165
Protein accession: YP_001039445
Protein GI: 125975535
COG category: [L] Replication, recombination and repair
COG ID: [COG3464] Transposase and inactivated derivatives
TIGRFAM ID:
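
The coordinate and length fields above are consistent with one another, which a short check can confirm. This is a minimal sketch, assuming the start and end positions are 1-based and inclusive and that the final codon is a stop codon that is not counted in the protein length; the variable names are illustrative only.

```python
# Values copied from the gene record.
start_bp = 3591687
end_bp = 3593117

# Assumption: coordinates are 1-based and inclusive, so both end points count.
gene_length = end_bp - start_bp + 1

# Assumption: the last codon is a stop codon and adds no residue.
protein_length = gene_length // 3 - 1

print(gene_length, protein_length)  # 1431 476
```

Both results match the listed Gene Length (1431 bp) and Protein Length (476 aa).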


Plasmid Coverage information

Num covering plasmid clones: 17
Plasmid unclonability p-value: 0.905986
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: n/a
Fosmid unclonability p-value: n/a
Fosmid Hitchhiker: n/a
Fosmid clonability: n/a
 

Sequence

Gene sequence
ATGGATAAGT TTATTAAACA ATTAGATCCA AACTTAGACT ATATTAATCA TGAAATAAAT 
GATGGCAAAT GCTATATAAC AGTAGCTTCC AACCGCAAAG AAGTAACATG TCCATTTTGC
GGCCAGTCAT CATCCAGAAT ACATTCCACC TACAACAGAA TTTTTCAGGA CCTTCCAATA
CAAGGTAATA AGGTATTTAT TATTATACGT AACAGAAAAA TGTTTTGTGA TAATCATGAT
TGTAGTCATA CTACTTTTGC AGAAAGATTT GATTTTATCT CCTATAAAGC GAAGAAAACC
CGTCGTCTTG AGGATGAAAT TGTACGACTG TCAATAAATT GCAGTTCCGT TGCAGCATCA
AAAGCTCTAA AGGAAAATGT TGTGGATATC GGTAAAAGTA CAGTTTGCAA TCTCTTAAAA
AAAGAAACAC TGGTTGTTGA CAAAAAGACA GTAACAGTTG TTTGCATTGA TGATTTTGCT
ATTAAAAAGC GAAAAAGCTA TGGGACAATT ATGATAGATA TTTTTACGCA TCAAATACTT
GATATGATTG ACTCAAGGGA TTATGAGACT GTTTGCGAGT GGTTAAAAAC ATATCCAAAT
CTTAGTGTGA TATCAAGAGA TGGATCTGTT ATCTATAATA ATGCAATTGC AAATGCACAC
CCGGAAGCTT TACAAATAAG TGACCGTTTT CATTTACTGA AGAATCTGAC TTCCTATATA
ACAGAGTATC TAAAAAAGAG ATTAAAGCCG CAAGTTTTAA TACAAGCTGT CAGTCAGGAA
ACTAAAGAGA TAAAAACAAT AAGACAGGCA GATGAAAACA GAAAACTTAC ATTGAAAGAA
AAATATGAAA AGATAAAACA ACTCCTATTA GAAGGAAAAT GTAAAACAGA AATTTGCCGA
AGCTTAAATA TGGATATACG AGCTTATGAT AAGCTAATGG CAATGACGGC TGAAAAAAGG
GAAAAGTCAT TCCAGACAAA AAAGATGATC ATACATGAAG AAAGAGTAAA GCAAAAAATG
GAACGTGTAA ATGAGGTGCG GGAGTTAAAG GGAATAGGTT TGAGTAATAG AGAGATATCC
AGGCGTACTG GACTTAATAG AAAAACAGTT AGTAGATATC TTGATGAAAA CTTTAATCCG
GTCCATGCTG CCTATGGCAA AAAGAGAAAT GGGAAGCTGA CACCATATAT AAAAGCGATT
GACGAATACC TTGAGAAAGG GATTATGGGT TCATATATTG AGGAAAAGAT ACGCGAAATG
GGATATGAGG GTTCATCATC AACTGTGCGG GATTATATAA CAGACTGGAA GAAGCGGAGA
AAAAAATATT ACGATAAAAG TAGGGAAGAT GGGACAAAAA CAGAAACAAT AAAAAGAGAA
AATATATTAA AGCTATTGTA CCAACCAATA GAAAAAAGTA AAAATAATTA G
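
The record lists a GC content of 33% for this gene. One possible way to compute such a figure from a sequence laid out in space-separated blocks like the one above is sketched here. The helper name `gc_fraction` is hypothetical and not part of the source database, and the example runs on the first row of the gene sequence only, so its result need not equal the whole-gene value.

```python
def gc_fraction(dna: str) -> float:
    """Return the fraction of G and C bases in a DNA string.

    Whitespace (block separators, line breaks) is ignored.
    """
    seq = "".join(dna.split()).upper()
    if not seq:
        raise ValueError("empty sequence")
    gc_count = sum(1 for base in seq if base in "GC")
    return gc_count / len(seq)


# First row of the gene sequence, copied from the record.
first_row = "ATGGATAAGT TTATTAAACA ATTAGATCCA AACTTAGACT ATATTAATCA TGAAATAAAT"
print(f"{gc_fraction(first_row):.0%}")
```

Passing the full 1431 bp sequence instead of `first_row` would give the gene-wide figure.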
 
Protein sequence
MDKFIKQLDP NLDYINHEIN DGKCYITVAS NRKEVTCPFC GQSSSRIHST YNRIFQDLPI 
QGNKVFIIIR NRKMFCDNHD CSHTTFAERF DFISYKAKKT RRLEDEIVRL SINCSSVAAS
KALKENVVDI GKSTVCNLLK KETLVVDKKT VTVVCIDDFA IKKRKSYGTI MIDIFTHQIL
DMIDSRDYET VCEWLKTYPN LSVISRDGSV IYNNAIANAH PEALQISDRF HLLKNLTSYI
TEYLKKRLKP QVLIQAVSQE TKEIKTIRQA DENRKLTLKE KYEKIKQLLL EGKCKTEICR
SLNMDIRAYD KLMAMTAEKR EKSFQTKKMI IHEERVKQKM ERVNEVRELK GIGLSNREIS
RRTGLNRKTV SRYLDENFNP VHAAYGKKRN GKLTPYIKAI DEYLEKGIMG SYIEEKIREM
GYEGSSSTVR DYITDWKKRR KKYYDKSRED GTKTETIKRE NILKLLYQPI EKSKNN