Gene Cthe_0678 details

Gene Information

Locus tag: Cthe_0678
Symbol:
ID: 4810296
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Clostridium thermocellum ATCC 27405
Kingdom: Bacteria
Replicon accession: NC_009012
Strand:
Start bp: 833900
End bp: 835201
Gene Length: 1302 bp
Protein Length: 433 aa
Translation table: 11
GC content: 44%
IMG OID: 640106095
Product: thymidine phosphorylase
Protein accession: YP_001037106
Protein GI: 125973196
COG category: [F] Nucleotide transport and metabolism
COG ID: [COG0213] Thymidine phosphorylase
TIGRFAM ID: [TIGR02644] pyrimidine-nucleoside phosphorylase
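
The coordinate and length fields above can be cross-checked with a little arithmetic. The sketch below is illustrative only: it assumes the start and end positions are 1-based and inclusive, and that the gene length includes a terminal stop codon. The function names are hypothetical and not part of any database API.

```python
# Values copied from the Gene Information fields above.
START_BP = 833900
END_BP = 835201
GENE_LENGTH_BP = 1302
PROTEIN_LENGTH_AA = 433


def span_length(start: int, end: int) -> int:
    """Feature length, assuming 1-based inclusive coordinates."""
    return end - start + 1


def expected_protein_length(gene_length_bp: int) -> int:
    """Residue count for an unspliced CDS whose final codon is a stop codon."""
    if gene_length_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    # One codon per residue, minus the stop codon, which encodes no residue.
    return gene_length_bp // 3 - 1


if __name__ == "__main__":
    print(span_length(START_BP, END_BP) == GENE_LENGTH_BP)                 # True
    print(expected_protein_length(GENE_LENGTH_BP) == PROTEIN_LENGTH_AA)    # True
```

Under those assumptions the listed values agree: 835201 - 833900 + 1 = 1302 bp, and 1302 / 3 - 1 = 433 aa.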


Plasmid Coverage information

Num covering plasmid clones: 22
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal

Fosmid Coverage information

Num covering fosmid clones: n/a
Fosmid unclonability p-value: n/a
Fosmid Hitchhiker: n/a
Fosmid clonability: n/a

Sequence

Gene sequence
ATGAGAATGG TTGATCTTAT AAACAAAAAA AAGCGGGGAG AAGCTCTGTC CGCCGCTGAA 
ATTGATTATA TTGTCCAAGG CTATACAAAG GGCGAAATAC CGGACTATCA GATGTCGGCA
TTTTTGATGG CTGTATATTT CAAAGGAATG AACAGAGAAG AGACGGCAAA CCTTACTTTA
TCCATGGTCA ATTCCGGTGA AACGGTTGAC CTTTCGATGA TTGAAGGAAT AAAGGTTGAC
AAGCATTCTT CCGGTGGCGT TGGAGACAAA ATCAGTCTTG TAATAGTTCC GCTTTGTGCC
TGTGTTGGGA TACCGGTTGC AAAAATGTCC GGAAGAGGGC TTGGGCACAC CGGCGGAACA
ATTGATAAGC TGGAATCCAT AGAAGGATTT AGAACCGAGC TTACGAAAGA GGAGTTTGTG
AACAACGTAA ACAAATATAA AATGGCCATA GTAGGCCAAT CGCCAAATCT CACTCCTGCG
GACAAAAAAA TATATGCTCT GAGAGATGTT ACCGGTACGG TGGACAGCAT ACCGCTTATA
GCAAGCTCAA TAATGAGTAA AAAAATCGCC TCCGGGTGCG ATTGCATTGT CCTGGATGTC
AAGGTGGGAT CCGGAGCCTT CATGAAGTCC GTGGACGAAG CCGTGATTTT GGCAAAAACC
ATGGTGGAAA TAGGCAAAGC TTTGGGAAGA AGAACTGTTG CGGTTGTAAC AGACATGAGC
CAGCCTTTGG GATATGAAGT GGGAAACGCC AACGAAGTTA AAGAAGCAAT AGAAATATTG
AAGGGCCACG GTGCCGAGGA CGAGACAACG GTGGCACTCA CAATTGCATC CCATATGGCG
GTATTGGGCG GTGCTTTTTC AGATTATGAA TCGGCTTACA ACCATATGCG CAAATTGATA
GAATCCGGCA AGGCAGTGGA AAAATTAAAG GAATTAATCA GAATACAGGG GGGAAATACC
GATGTGGTGG ACAACCCAAA TCTTTTACCC CAGGCCGAAA AACACATAGA AGTTAAATCC
TCAACGGCAG GTTATATAAA TTCTGTCAAT GCCGAGGACA TAGGAGTTTC GGCAATGCTT
CTTGGAGCAG GAAGAAAGAC CAAAAACGAC AGCATAGATT TTTCAGCGGG CATCACAATG
GTAAAAAAGA TTGGGGATTG GGTTGATGAA GGTGATACTT TGTGCATACT TCACACAAAC
AAGTCCGACT TTCAAGAGGC AGAAAGGCTT TCCAAAAATG CTTTTGTCAT AAAAAACACG
AAACCTGAAC CGATTAAATA TGTTCACTGT GTTATTGACT GA
 
Protein sequence
MRMVDLINKK KRGEALSAAE IDYIVQGYTK GEIPDYQMSA FLMAVYFKGM NREETANLTL 
SMVNSGETVD LSMIEGIKVD KHSSGGVGDK ISLVIVPLCA CVGIPVAKMS GRGLGHTGGT
IDKLESIEGF RTELTKEEFV NNVNKYKMAI VGQSPNLTPA DKKIYALRDV TGTVDSIPLI
ASSIMSKKIA SGCDCIVLDV KVGSGAFMKS VDEAVILAKT MVEIGKALGR RTVAVVTDMS
QPLGYEVGNA NEVKEAIEIL KGHGAEDETT VALTIASHMA VLGGAFSDYE SAYNHMRKLI
ESGKAVEKLK ELIRIQGGNT DVVDNPNLLP QAEKHIEVKS STAGYINSVN AEDIGVSAML
LGAGRKTKND SIDFSAGITM VKKIGDWVDE GDTLCILHTN KSDFQEAERL SKNAFVIKNT
KPEPIKYVHC VID
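
The relationship between the gene sequence and the protein sequence above can be reproduced with a small translation routine. The following is a minimal sketch, not the method used by the database: the codon table is the standard amino acid assignment, which translation table 11 shares for internal codons. The alternative start codons that table 11 permits are not handled, which does not matter here because this gene begins with ATG. The helper names (`clean`, `translate`, `gc_percent`) are hypothetical.

```python
# Standard codon assignments in TCAG order (same residues as table 11).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))


def clean(seq: str) -> str:
    """Remove the spaces and line breaks used to group bases in blocks of 10."""
    return "".join(seq.split()).upper()


def translate(seq: str) -> str:
    """Translate frame 1 of a DNA sequence, stopping at the first stop codon."""
    seq = clean(seq)
    residues = []
    for i in range(0, len(seq) - 2, 3):
        amino_acid = CODON_TABLE[seq[i:i + 3]]
        if amino_acid == "*":
            break
        residues.append(amino_acid)
    return "".join(residues)


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in the sequence."""
    seq = clean(seq)
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    # First line of the gene sequence listed above.
    first_line = ("ATGAGAATGG TTGATCTTAT AAACAAAAAA "
                  "AAGCGGGGAG AAGCTCTGTC CGCCGCTGAA")
    print(translate(first_line))  # MRMVDLINKKKRGEALSAAE
```

Passing the complete gene sequence to `translate` and `gc_percent` gives values that can be compared with the protein sequence and the GC content field listed in this record.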