Gene Cthe_2024 details

Gene Information

Locus tag: Cthe_2024
Symbol:
ID: 4810994
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Clostridium thermocellum ATCC 27405
Kingdom: Bacteria
Replicon accession: NC_009012
Strand:
Start bp: 2402349
End bp: 2403539
Gene Length: 1191 bp
Protein Length: 396 aa
Translation table: 11
GC content: 36%
IMG OID: 640107433
Product: transposase IS116/IS110/IS902
Protein accession: YP_001038428
Protein GI: 125974518
COG category: [L] Replication, recombination and repair
COG ID: [COG3547] Transposase and inactivated derivatives
TIGRFAM ID:
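
The coordinates and lengths listed above agree with one another if Start bp and End bp are read as 1-based, inclusive positions and the terminal stop codon is counted in the gene length but not in the protein length. The short Python sketch below checks that arithmetic. The function names are illustrative only and the coordinate convention is an assumption, not something stated in this record.

```python
def gene_length_bp(start_bp: int, end_bp: int) -> int:
    """Feature length, assuming 1-based inclusive coordinates."""
    return end_bp - start_bp + 1


def protein_length_aa(gene_len_bp: int) -> int:
    """Residue count for an unspliced CDS, assuming one terminal stop codon."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_len_bp // 3 - 1


length = gene_length_bp(2402349, 2403539)
print(length)                     # 1191
print(protein_length_aa(length))  # 396
```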


Plasmid Coverage information

Num covering plasmid clones:
Plasmid unclonability p-value: 0.0000259984
Plasmid hitchhiking: Yes
Plasmid clonability: hitchhiker
 

Fosmid Coverage information

Num covering fosmid clones: n/a
Fosmid unclonability p-value: n/a
Fosmid Hitchhiker: n/a
Fosmid clonability: n/a
 

Sequence

Gene sequence
ATGCACAAGG AAAAGCATTG TGCAGTTGTG ATTGATTGCT GGATGGAAAA AATCGGAGAA 
GTCAACTTTG AAAACAAACC ATCCAAGTTT CCAGCATTTG TTGAGGAGAT TAGGAAAATA
TGCGGTACAA AGGACTTTGT ATTTGGGCTT GAGGATACCA GAGGTTTTGG CCGTAATCTG
GCTTCATATC TTACAGGAAG AAAGTTTGAA GTCAAACATG TGAATCCAGT ATATACCAGT
GCAATAAGAC TTTCAAATCC TATTATATAC AAGGATGATT CTTATGATGC CTATTGTGTT
GCAAGGGTAT TAAGGGATAT GGTGGATACA TTACAGGATG CAAAACATGA GGATATATAT
TGGGCAATCA GACAGTTGGT AAAAAGAAGA GAAATAATAG TCAAATATAA TGTTATGAAC
AAAAATCAAT TACATAGTCA GTTGTCTTAT GGTTACCCAT CCTATAAGAA GTTCTTTTCA
CAAATTGATG GAAAAAGTGC ATTGTGTTTC TGGGAGAACT ATCCTTCACC AGAGCATATT
TGGTGTACTA CACCAGAACA AATTTATGAA ACAATAAAAG CAGTACATCA GGCATTCAAG
ATAGAGCGTG TTCATGCAAT TATTGATATG ATTAAAAAGG ATGGAAATAC ACAGAAGGGG
TATCAGGAAG AAAGAGATAC AATAGTAAGA AACATCGTAA AAGATATTAA GAACAATCAA
GAGCTATTAA AAGACATAGA AAAGCAGTTA AGAAAATTAT TGCCACAGAC AGGGTATAAG
CTGCAAACTA TGCCAGGGAT AGACCTTATC ACAGAAGCAA AGATTGTGTC TGAAATTGGT
GATATTAACA GATTTCCAAA TTCAGATAAG TTAGCTCGTT TTATGGGGTT AGCGCCTGTA
CATTTTAGTT CAGCAGGTAA AGGTAAGGAA GAAAGGTGTA GAAATGGAAA CAGAGAGTTA
AATGCTATAT TTCATTTTTT GGCTATACAA ATGGTAGCTG TATCACCTTC AGGAAAGCCA
AGGCATCCAG TATTCAGAGA GTATTTTGAA CAGAAGGTCA AAGAGGGTAA AAACAAGCCA
CAGGCTCTTA TATGTATATC AAGAACGCTT GTAAGATTAA TCTACGGTAT GATGAAGACA
AAGACAGAGT ATAGACCGTA TGAGAAGAAA GAAGAAGAAG GAAATAATTA G
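
The gene sequence is printed in space-separated groups of ten bases, so the grouping has to be removed before any statistic such as GC content is computed. A minimal Python sketch of that step follows. The helper names are hypothetical, and the GC definition used here (G plus C over total bases) is the common one, assumed rather than taken from this record.

```python
def clean_sequence(text: str) -> str:
    """Remove spaces and line breaks from a grouped listing and uppercase it."""
    return "".join(text.split()).upper()


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = clean_sequence(seq)
    if not seq:
        raise ValueError("empty sequence")
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


# First line of the gene sequence (six groups of ten bases).
first_line = "ATGCACAAGG AAAAGCATTG TGCAGTTGTG ATTGATTGCT GGATGGAAAA AATCGGAGAA"
print(len(clean_sequence(first_line)))  # 60
print(gc_percent(first_line))           # 40.0
```

The 40% value applies only to the first 60 bases. Passing the full 1191 bp listing to `gc_percent` gives the figure to compare against the GC content field of the record.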
 
Protein sequence
MHKEKHCAVV IDCWMEKIGE VNFENKPSKF PAFVEEIRKI CGTKDFVFGL EDTRGFGRNL 
ASYLTGRKFE VKHVNPVYTS AIRLSNPIIY KDDSYDAYCV ARVLRDMVDT LQDAKHEDIY
WAIRQLVKRR EIIVKYNVMN KNQLHSQLSY GYPSYKKFFS QIDGKSALCF WENYPSPEHI
WCTTPEQIYE TIKAVHQAFK IERVHAIIDM IKKDGNTQKG YQEERDTIVR NIVKDIKNNQ
ELLKDIEKQL RKLLPQTGYK LQTMPGIDLI TEAKIVSEIG DINRFPNSDK LARFMGLAPV
HFSSAGKGKE ERCRNGNREL NAIFHFLAIQ MVAVSPSGKP RHPVFREYFE QKVKEGKNKP
QALICISRTL VRLIYGMMKT KTEYRPYEKK EEEGNN
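
The protein sequence corresponds to the gene sequence translated with translation table 11 (the bacterial, archaeal and plant plastid code). The Python sketch below is one way to reproduce that translation without external libraries. It is an illustration under stated assumptions: only the amino acid assignments of table 11 are modelled, alternative start codons are not handled, and the names `CODON_TABLE` and `translate` are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Amino acid assignments of NCBI translation table 11, in TCAG codon order
# (TTT, TTC, TTA, TTG, TCT, ...). '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq: str) -> str:
    """Translate a DNA coding sequence, stopping at the first stop codon.

    Whitespace in the grouped listing is ignored. A codon containing any
    character other than A, C, G or T raises KeyError.
    """
    seq = "".join(seq.split()).upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 30 bases and last 12 bases of the gene sequence.
print(translate("ATGCACAAGG AAAAGCATTG TGCAGTTGTG"))  # MHKEKHCAVV
print(translate("GGAAATAATTAG"))                      # GNN
```

Both outputs match the start and the end of the protein sequence listed above.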