Gene Cthe_2171 details

Gene Information

Locus tag: Cthe_2171
Symbol:
ID: 4810884
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Clostridium thermocellum ATCC 27405
Kingdom: Bacteria
Replicon accession: NC_009012
Strand:
Start bp: 2582903
End bp: 2585872
Gene Length: 2970 bp
Protein Length: 989 aa
Translation table: 11
GC content: 43%
IMG OID: 640107574
Product: type III restriction enzyme, res subunit
Protein accession: YP_001038566
Protein GI: 125974656
COG category: [K] Transcription; [L] Replication, recombination and repair
COG ID: [COG1061] DNA or RNA helicases of superfamily II
TIGRFAM ID:
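
The coordinates and lengths listed above are mutually consistent, which can be checked with a little arithmetic. The sketch below assumes the start and end positions are 1-based and inclusive, and that the final codon is a stop codon that is not counted in the protein length; the variable names are illustrative only.

```python
# Values copied from the Gene Information fields above.
start_bp = 2582903
end_bp = 2585872

# Assumption: coordinates are 1-based and inclusive, so add 1.
gene_length = end_bp - start_bp + 1

# A CDS is read in codons of 3 bases; the last codon is assumed to be the stop.
codons = gene_length // 3
protein_length = codons - 1

print(gene_length, "bp")     # 2970 bp
print(protein_length, "aa")  # 989 aa
```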


Plasmid Coverage information

Num covering plasmid clones:
Plasmid unclonability p-value: 0.0533781
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: n/a
Fosmid unclonability p-value: n/a
Fosmid Hitchhiker: n/a
Fosmid clonability: n/a
 

Sequence

Gene sequence
ATGAGCAATG TCCAATTTGA CTCGTTAAAA TTCAGATATC CGTTCAGAAA ACACCAGGAG 
ATGATTCTTC AAAGCTTTAA AAACGAGCAG CTTAAAAGTA AAGGCGGCCC TTTGCATTTT
CATGTGGTGG CGCCTCCGGG AGCGGGCAAG ACCATTGTTG GACTGGAATG TGTCATAAGA
CTTCAGGTTC CGGCGGTGAT TATATGTCCC AATACAGCCA TTCAGGGACA GTGGATTGAC
AAATTTGATC TTTTCATACC GGAAGACTCA AACATTAACA AGGATGATAT AATAGGTTCG
AATCCCAATT CATTAAAACC CATTAATGTA TTTACATATC AGGTTCTCAG CATACCGGAC
AATGATACGG ATTCATACAG AAGTGTGTCC GAGAATATGT GGGCCGAATC CATAAGCGAG
TCTTTGGGGA TAGCCAAAGA AGAGGCTTTG GACAGGATAC ACAAGATGAG GGAAAAAAAT
CTTGCCGAAT ACAACAAGGA GCTTTCAAAG TACACAAAGA AGCTTAGAAA CAGCGTTTTT
GAAGGGTCTG GCGGAGATTT TCTCAAGATT CTTCATCCCA ACACAAGGGA GCTTATAAAA
AAGCTTAAGG AAATGGGCGC CAAGACCGTT GTGTTTGATG AATGCCATCA CCTCAAAAAT
TATTGGGCCG TTGTGATGAG AGAAATAATA AAAGAAATAG ATGCAAAGAA CATTATCGGG
CTTACTGCAA CTCCGCCTTC GAGCGACGAA GGGGAAAGTT ACGAATGCTA CACCGCACTC
CTTGGAGAGA TAGATTTTCA GCTGCCAACT CCTGCTGTTG TAAAGGACGG CATGCTCTCA
CCCTATCAGG ATTTGGTTTA TTTCTGCATA CCGACACAGG AAGAGCTTAA ATTTATAGAA
GAGACCCATG AGAGGTATAA AAAGCTGATT GAGAAATTTG ATAAAAAGGA TTGTGATTTC
TACAACTGGA TAGAAAAAAG GATTGTTGAA AGAAAACTTG TATCCGGCGA AAAGCAGGCT
TGGACAAAGT TTATCAATTC AAGACCGGGA TTTGCCGTGG CAGGTGTCAA ATACCTTATT
AAAAATGGCT GCAAGCTTCC CTGGGACATA ACCGTAACGG AAGACATGTA TAATGAAATG
TCGGTGGAAG ACTGGTGCTA TCTTATTGAG GATTACGCGC TTCACAAGCT TAAGGTGAGC
GACAGCGAGG AAGACAAGGT GATGTATGAA GACATAAAGC TGGCGTTAAG GAGCCTTGGC
TTTATTCTTA CAGAAAAAGG AATACGAAAT CACAATTCTC CCGTTGACAG GGTTTTGGCT
TACAGCAGAA GCAAGCTTCT GGCTGTAAAG GATATTATAA AAGAAGAAAT GCTCAGTATG
GGGGACAAAC TCAGAGTGGC CGTTATTACG GATTTTGAGA TTTCCAATGC CTTGTCGCTC
AAAAAGGTGA ACAACGTATT GGATGAGGAG AGCGGAGGAG CCGTAAGTGT GTTAAAAGAG
CTTGTTGCCG ACCCTGAAAT CGACAAGCTC AATCCAATAA TGGTAACTGC GAAGAATTTG
CTTTGTGATG ACGATATTGC CGAAAAGTAT GTGGAGATCG GAAATAAATG GGCAAAAGAG
AATTCCCTTG ATATAAAGCT TGAAGTACAG CCGGGGGTTG AAGGATTGTT CTGCGCTATT
GCCGGTTCCG GAAGGGACTG GAATTCAAAG ACGGCTGTGT TGTTTACCAC ATACCTTCTT
GAAGAAGGTG TCACAAAGTG CCTGATAGGT ACAAGAGGAC TTTTCAGCGA GGGCTGGGAC
AGTATTGCGC TAAATACCTT AATTGATCTG TCCACGGCCA CAACCTTTGC CTCAGTCAAC
CAACTCAGGG GAAGAAGTAT CCGAAAGAAC GAGAAAGAGC CCCGGAAACT TGCAAACAAC
TGGGACGTGG TATGCATTGC CCCGGGATTG GAAAGAGGAT ACAACGACCT TGAAAGGCTT
TTGAAAAAGC ACAAACAGTT CTACGGAATA TGTGAGGACG GAAGGATTCA GCAGGGAATT
GACCATGTGG ATGCAGCCCT GTCCTTTGGT GAAACCAAAA TAATGCAGGA AGGAATACAG
AGCATCAACG CAAGAATGTT AAAAAAATTG AGGCAACGGG AAGAAGTCTA CAATGCATGG
AAGATAGGGG AACCATTTTT AAACATTGAA GTGGGCTGCT GTGAATTAAG AATGTCGAAG
CCATGTAAAA TAAAGACTGC GGGCCTTATG AAAAAAGAGT TCGGCACTTT GGGAAGAAAG
CTTAAATTGG GAGTGGCGTG CGGTATAGGA AGCATCGGAG CGATTATGAT GGCGGCGGCA
GGCATGACTT TTGGGCCTTT TGGCACTCTC CCTGTTCTTG CGGCAGGGGT GCTTATGGGA
GTTAAATCCG TTACGGATAT AAGAGGATTC TGGAAATACG GAAATGACCT TTTTATGGGA
CGGCCTGCCA TTGACACTAT AACCGACATA TCCAAATGTT TGTTTTATGC CTTAAAGGAA
TGCGGTTTTA TAAGCCGTGA TTTGTATGAA AGAAAAATAA CTGTTACGGA AAGAGCCGAC
GGAAGCCTGA GGGTTTATCT TGAGGCATCG AAGGAAGACT CGCAGACATT TGCCGCATCT
TTGGCAGAAA TTCTGGCTCC CATAGAGGAC CAGAGATATG CTGTGCAAAG GTATGAGGAA
AAAATGCCCG AGGGTACTTT TGAGAGATTG AATTGCATGA TAGGGTGGGG GTTAAACAAG
TCAAATCCTC AGCTGGTTTG CTATCATCCT CTTCCGTCCC TGTTTAATCA CAAGGAGAAA
GCATTGGTAT TCAAAAAGCA CTGGAACAGG TATGTAAGCC CCGGCGATAT TGTGTATCTT
AAAGGTGAAG AGGGCAAAAA AATTGTGGAG AATTACGGAA GAGTAAACTT CTTCGGTGCA
AAGAAACAGC TGAGTAATAT ATGGATGTAG
 
Protein sequence
MSNVQFDSLK FRYPFRKHQE MILQSFKNEQ LKSKGGPLHF HVVAPPGAGK TIVGLECVIR 
LQVPAVIICP NTAIQGQWID KFDLFIPEDS NINKDDIIGS NPNSLKPINV FTYQVLSIPD
NDTDSYRSVS ENMWAESISE SLGIAKEEAL DRIHKMREKN LAEYNKELSK YTKKLRNSVF
EGSGGDFLKI LHPNTRELIK KLKEMGAKTV VFDECHHLKN YWAVVMREII KEIDAKNIIG
LTATPPSSDE GESYECYTAL LGEIDFQLPT PAVVKDGMLS PYQDLVYFCI PTQEELKFIE
ETHERYKKLI EKFDKKDCDF YNWIEKRIVE RKLVSGEKQA WTKFINSRPG FAVAGVKYLI
KNGCKLPWDI TVTEDMYNEM SVEDWCYLIE DYALHKLKVS DSEEDKVMYE DIKLALRSLG
FILTEKGIRN HNSPVDRVLA YSRSKLLAVK DIIKEEMLSM GDKLRVAVIT DFEISNALSL
KKVNNVLDEE SGGAVSVLKE LVADPEIDKL NPIMVTAKNL LCDDDIAEKY VEIGNKWAKE
NSLDIKLEVQ PGVEGLFCAI AGSGRDWNSK TAVLFTTYLL EEGVTKCLIG TRGLFSEGWD
SIALNTLIDL STATTFASVN QLRGRSIRKN EKEPRKLANN WDVVCIAPGL ERGYNDLERL
LKKHKQFYGI CEDGRIQQGI DHVDAALSFG ETKIMQEGIQ SINARMLKKL RQREEVYNAW
KIGEPFLNIE VGCCELRMSK PCKIKTAGLM KKEFGTLGRK LKLGVACGIG SIGAIMMAAA
GMTFGPFGTL PVLAAGVLMG VKSVTDIRGF WKYGNDLFMG RPAIDTITDI SKCLFYALKE
CGFISRDLYE RKITVTERAD GSLRVYLEAS KEDSQTFAAS LAEILAPIED QRYAVQRYEE
KMPEGTFERL NCMIGWGLNK SNPQLVCYHP LPSLFNHKEK ALVFKKHWNR YVSPGDIVYL
KGEEGKKIVE NYGRVNFFGA KKQLSNIWM
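
The gene and protein sequences above are related through translation table 11, and the GC content is a simple base count. The following is a minimal sketch of both calculations, not the method used to generate this record. It assumes the sequence is pasted as shown, in space-separated blocks of 10, and the helper names (`clean`, `gc_fraction`, `translate`) are hypothetical. Table 11 shares the standard codon assignments; its alternative start codons are not handled here, which does not matter for this gene because it starts with ATG.

```python
# Codon-to-amino-acid assignments shared by the standard code and
# NCBI translation table 11, with bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def clean(seq):
    """Remove spaces and line breaks and upper-case the sequence."""
    return "".join(seq.split()).upper()


def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    seq = clean(seq)
    return (seq.count("G") + seq.count("C")) / len(seq)


def translate(seq):
    """Translate codon by codon, stopping at the first stop codon."""
    seq = clean(seq)
    protein = []
    for pos in range(0, len(seq) - 2, 3):
        amino_acid = CODON_TABLE[seq[pos:pos + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# First three blocks of the gene sequence listed above.
first_blocks = "ATGAGCAATG TCCAATTTGA CTCGTTAAAA"
print(translate(first_blocks))  # MSNVQFDSLK
```

Passing the full gene sequence to `translate` and `gc_fraction` in the same way allows the result to be compared with the listed protein sequence and GC content.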