Gene Cthe_2471 details

Gene Information

Locus tag: Cthe_2471
Symbol:
ID: 4809851
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Clostridium thermocellum ATCC 27405
Kingdom: Bacteria
Replicon accession: NC_009012
Strand:
Start bp: 2942646
End bp: 2943926
Gene Length: 1281 bp
Protein Length: 426 aa
Translation table: 11
GC content: 30%
IMG OID: 640107886
Product: hypothetical protein
Protein accession: YP_001038866
Protein GI: 125974956
COG category:
COG ID:
TIGRFAM ID:
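
The coordinate and length fields above are mutually consistent, which can be checked with a short sketch. It assumes that Start bp and End bp are 1-based, inclusive positions and that the final codon of the CDS is a stop codon; the function names are illustrative and not part of any database API.

```python
def gene_length(start_bp: int, end_bp: int) -> int:
    """Length in bp of a feature, assuming 1-based inclusive coordinates."""
    return end_bp - start_bp + 1


def protein_length(gene_len_bp: int) -> int:
    """Residues encoded by an unspliced CDS, assuming the last codon is a stop."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_len_bp // 3 - 1  # drop the stop codon


length = gene_length(2942646, 2943926)
print(length, protein_length(length))  # prints: 1281 426
```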


Plasmid Coverage information

Num covering plasmid clones:
Plasmid unclonability p-value: 0.000000072504
Plasmid hitchhiking: No
Plasmid clonability: decreased coverage
 

Fosmid Coverage information

Num covering fosmid clones: n/a
Fosmid unclonability p-value: n/a
Fosmid Hitchhiker: n/a
Fosmid clonability: n/a
 

Sequence

Gene sequence
ATGAGTGAGA CAAAGAATGA TATTGCATGG GAACGAATTT TTAAGAAATA TAGAATATTA 
GAGAAAATAA AGAAAAATGG GGCTTTTGAA ATAACGTCAG GGCAAATAAA TGAGTTTAGA
GAAGCAAGGT TAATGACAAA ATTTGATCAC CGAAAAAATT TACCGAAGAT TTTTGAAGAA
AATAATTTTT CTATTCTTCC TATTACTAGA GGTAGTTATT TAATTGCGCA GTTTAAGGCT
TATCATAGGC TTGAGGAAAA AGAAACAGAA ATAATCAAGA TTCCATTTCC TACTTATATT
GAAAGTATTG ATTATGAAAA CATAACAAGC GAGGCTGCGG CTTTAAACTG TGCGTATGTT
TCAGGTATAT TGGCTGATTT TATTGAGGAT GAAGAAATGG TTCCAACAGT TACAGGTCGA
ATGAGTTCTG ATGCGTTTTG TTTTTATATT AATACTTATT CGGGGTCTAA GTTTAAAGTT
AATGTTACTA ATGCTCAAAT TGAGATAGAT GGTGGATATG AAGGGCTGGG AACCTTTTCT
TTAATTGAAG CGAAAAACTC GTTATCAGAT GATTTTATAA TACGACAAAT ATATTACCCT
TATAGGTTAT GGCATGATAA AATTAACAAA AAAGTTAAGC CAATATTTAT GACTTACTCT
AACGGTATTT TTACTTTTTA TGAGTATGAG TTTCAAGACC CTGAAGATTA TAATTCTCTT
ACTTTAGTAA AACAAAAAAA ATATAGCATA GAGGAAACAG AGATTGGGCT TGATGACATA
ATAGAGATCT ACAAAAGGAC AAAAATTATA AATGAACCAG AAGTTCCATT TCCACAAGCA
GATTCATTTG AAAGGATAAT TAATCTTTGC GAGCTTTTAA ATGAATCAGA GTTGACTAGA
GATGAAATAA CAACAAACTA TGATTTTGAC TCTAGGCAAA CGAATTATTA TACAGATGCA
GCTAGATACT TGGGATTAGT ACATAAGCGT AAAGAAGGTA GAGAGGTAAT ATTTTCGTTG
ACAGAAGAGG GGGAAAAATT ATTTAAACTG AAATATAAGC CAAGACAATT AAAATTTGTT
GAATTAATTT TGTCCCACAA AGTTTTTAGA GAAGTTTTTG AATTGTGTCT GAAAAATGGA
AAAATGCCAG ATAAACATGA AGTAGTGAAG ATTATGAGAT ACAGCAATTT GTATAAAATA
GAATCCGAGA AAACATTTTA TAGGCGTGCT CAAACTATAA TGAGTTGGAT TAAATGGATA
TTAGAATTAA CTAGATTGTA G
 
Protein sequence
MSETKNDIAW ERIFKKYRIL EKIKKNGAFE ITSGQINEFR EARLMTKFDH RKNLPKIFEE 
NNFSILPITR GSYLIAQFKA YHRLEEKETE IIKIPFPTYI ESIDYENITS EAAALNCAYV
SGILADFIED EEMVPTVTGR MSSDAFCFYI NTYSGSKFKV NVTNAQIEID GGYEGLGTFS
LIEAKNSLSD DFIIRQIYYP YRLWHDKINK KVKPIFMTYS NGIFTFYEYE FQDPEDYNSL
TLVKQKKYSI EETEIGLDDI IEIYKRTKII NEPEVPFPQA DSFERIINLC ELLNESELTR
DEITTNYDFD SRQTNYYTDA ARYLGLVHKR KEGREVIFSL TEEGEKLFKL KYKPRQLKFV
ELILSHKVFR EVFELCLKNG KMPDKHEVVK IMRYSNLYKI ESEKTFYRRA QTIMSWIKWI
LELTRL
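
The sequences above are printed in space-separated blocks of 10, so they need to be joined before any computation. The following is a minimal sketch, not the database's own tooling, of how the gene sequence could be cleaned, its GC content computed, and its translation checked against the protein sequence. The helper names are hypothetical. The codon table uses the standard amino-acid assignments, which translation table 11 shares with the standard code (table 11 differs in the start codons it permits).

```python
BASES = "TCAG"
# Amino acids for all 64 codons in TCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def clean_sequence(text: str) -> str:
    """Remove the spaces and line breaks used for display; upper-case the result."""
    return "".join(text.split()).upper()


def gc_content(dna: str) -> float:
    """Percentage of G and C bases in a cleaned DNA string."""
    return 100.0 * sum(base in "GC" for base in dna) / len(dna)


def translate(dna: str) -> str:
    """Translate codon by codon from position 0, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# First row of the gene sequence as displayed above.
first_row = clean_sequence(
    "ATGAGTGAGA CAAAGAATGA TATTGCATGG GAACGAATTT TTAAGAAATA TAGAATATTA"
)
print(translate(first_row))  # prints: MSETKNDIAWERIFKKYRIL
```

The printed residues match the first 20 residues of the protein sequence. Passing the full gene sequence through clean_sequence and gc_content would allow the listed GC content to be checked in the same way.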