Gene Cthe_2334 details

Gene Information

Locus tag: Cthe_2334
Symbol:
ID: 4809262
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Clostridium thermocellum ATCC 27405
Kingdom: Bacteria
Replicon accession: NC_009012
Strand:
Start bp: 2781498
End bp: 2783342
Gene Length: 1845 bp
Protein Length: 614 aa
Translation table: 11
GC content: 41%
IMG OID: 640107741
Product: polysaccharide biosynthesis protein CapD
Protein accession: YP_001038729
Protein GI: 125974819
COG category: [G] Carbohydrate transport and metabolism; [M] Cell wall/membrane/envelope biogenesis
COG ID: [COG1086] Predicted nucleoside-diphosphate sugar epimerases
TIGRFAM ID:
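
The coordinates and lengths in the record above can be cross-checked with simple arithmetic. The sketch below is illustrative only: it assumes the start and end positions are 1-based and inclusive, and that the coding sequence ends in a single stop codon that is not counted in the protein length. The helper names `span_length` and `coding_codons` are hypothetical and not part of the source database.

```python
# Values copied from the gene record above.
start_bp = 2781498
end_bp = 2783342
gene_length_bp = 1845
protein_length_aa = 614


def span_length(start, end):
    """Length of a coordinate range, assuming 1-based inclusive positions."""
    return end - start + 1


def coding_codons(length_bp):
    """Number of amino-acid codons, assuming one trailing stop codon."""
    if length_bp % 3 != 0:
        raise ValueError("length is not a multiple of 3")
    return length_bp // 3 - 1


print(span_length(start_bp, end_bp))   # 1845
print(coding_codons(gene_length_bp))   # 614
```

Under these assumptions both values agree with the listed gene length (1845 bp) and protein length (614 aa).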


Plasmid Coverage information

Num covering plasmid clones:
Plasmid unclonability p-value: 0.00000027
Plasmid hitchhiking: No
Plasmid clonability: unclonable
 

Fosmid Coverage information

Num covering fosmid clones: n/a
Fosmid unclonability p-value: n/a
Fosmid Hitchhiker: n/a
Fosmid clonability: n/a
 

Sequence

Gene sequence
ATGAAACAAA AGATTAGAGC AAATGGATTA ATTGCAATTG ATGTGTTTTT AGTAAATATA 
TCGCTTGTTA TTGCATATCT TTTGAGGTTT GACTTAAACT ATGCAAATAT TCCGGAACGT
TTTATCGAAC CCTTGGTAAA GCTCGCCGTA ATTTCAACAG TTGTTAAACT TATAACTTAT
GCTGTGATGA AGCTTTACAA CAGTTTGTGG AAGTATGCGG GAATATATGA AGTGGTTATG
GTTATAGCTG CTTCATTTAT CAGCAATTCA ATAATGATAA GCTATGTGTT TTTATCCCAA
ACGCCCGTAC CTCGAAGTAT ATTCCTGATA TGCATCTTGA CGGATATTGC CCTGATTGGG
GGAGTACGGT TTGCTTACAG GGTGTTTAGA AGAATTGTCA AAGGGGAAAT GATTCGTTTT
ACAAATTCAA AGAGGGTGCT GATTGCAGGC GCCGGGGACG CCGGCGCGAT AATTGTGAAG
GAAATGAAGT CCCATCCTAA ACTTAAAAGC ACTCCGGTGG CGTTTGTAGA TGACGATAAA
TACAAAATAG GCAAAAAAAT CAGCGGAGTG CCTGTGGTGG GTGAGACAAA GGATGTTTCA
GAAATAATAC AGAAGCTGCA GATAGACGAA GTCATTATTG CCATGCCGTC TGCCAGCCAC
AAAAAAATAA ATGATATATA CACCGGCTGT GCCCAAACAA ACTGCAAGGT AAAGATACTT
CCGTCGGTTT CACAGCTGAT TGATGAATCT GTTGTCATGC AGAAAATAAG GGATGTGAAT
ATAGAAGACC TTTTGGGAAG GGAACCTGTA CATCTTGATA TTGAAGAGAT AGGTTCGTAT
CTGAAGGACC GGGTGGTAAT GGTTACCGGG GGAGGAGGCT CAATCGGTTC GGAGCTGTGC
AGGCAGATAG CGAATTTTTT GCCGAAGAGG CTTGTTATTT TGGATAATTA TGAAAACAAT
GCATATGACA TACAGAATGA ACTTTTGTAC AAGCATCCGG ATTTAAAGCT TGACACAGTG
ATTGCCAATA TCAGAGAAAA ACAGAGAATG GAAAACATAT TTAAAAAATA CAGGCCAGAC
GTTGTATTTC ATGCTGCGGC CCATAAACAT GTCCCTCTTA TGGAGGACAA TCCTACGGAG
GCGATAAAAA ATAATGTGTT TGGCACCTTA AATGTGGCTG AATGTGCGGA CAAGTATGGA
ACTAAAAGGT TTGTACTTAT ATCCACGGAC AAGGCCGTCA ACCCTACGAA TATTATGGGT
GCGACAAAAA GAATTGCAGA AATGATTATC CAGGCGATGA ATGAAAACAG CAAAACCGAG
TTTGTTGCGG TAAGGTTTGG AAACGTGCTT GGAAGCAACG GAAGTGTTGT ACCACTGTTT
AAAAAGCAGA TAGAAAGGGG AGGACCGGTA ACGGTCACCC ATCCGGAAAT CACCAGATTC
TTTATGACAA TTCCGGAGGC TGTGCAGCTG GTACTACAAG CTGGGGCAAT GGCAAAGGGC
GGAGAAATAT TTGTTCTTGA TATGGGTGAA CCTGTGAAAA TAAGTGAGCT TGCCCGAAAC
TTGATAAGAC TTTCGGGTTT TGAGCCGGAT GTCGATATAA AGATTGAGTA TATAGGTTTG
AGACCCGGCG AAAAGCTTTA TGAGGAACTT ATGCTCAGTG AGGAAGGGCT TTTGGCAACC
AAAAACGACA AAATATATGT TACAAAGCCT GTTCATATTG ATTTTAAGGT GCTCCAGAGG
GAGCTGGATT GCCTTAAGGA TATTATGGTT ACAAATTCTG AAGAGATTTC AGAATATATT
AAACTTATTG TGCCTACGTA CAAGAAAGCG GGAAACGGAA ATTAA
 
Protein sequence
MKQKIRANGL IAIDVFLVNI SLVIAYLLRF DLNYANIPER FIEPLVKLAV ISTVVKLITY 
AVMKLYNSLW KYAGIYEVVM VIAASFISNS IMISYVFLSQ TPVPRSIFLI CILTDIALIG
GVRFAYRVFR RIVKGEMIRF TNSKRVLIAG AGDAGAIIVK EMKSHPKLKS TPVAFVDDDK
YKIGKKISGV PVVGETKDVS EIIQKLQIDE VIIAMPSASH KKINDIYTGC AQTNCKVKIL
PSVSQLIDES VVMQKIRDVN IEDLLGREPV HLDIEEIGSY LKDRVVMVTG GGGSIGSELC
RQIANFLPKR LVILDNYENN AYDIQNELLY KHPDLKLDTV IANIREKQRM ENIFKKYRPD
VVFHAAAHKH VPLMEDNPTE AIKNNVFGTL NVAECADKYG TKRFVLISTD KAVNPTNIMG
ATKRIAEMII QAMNENSKTE FVAVRFGNVL GSNGSVVPLF KKQIERGGPV TVTHPEITRF
FMTIPEAVQL VLQAGAMAKG GEIFVLDMGE PVKISELARN LIRLSGFEPD VDIKIEYIGL
RPGEKLYEEL MLSEEGLLAT KNDKIYVTKP VHIDFKVLQR ELDCLKDIMV TNSEEISEYI
KLIVPTYKKA GNGN
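
The relationship between the gene sequence and the protein sequence above can be sketched in a few lines of Python. This is a minimal illustration, not the database's own method. It assumes that translation table 11 assigns the same amino acids to codons as the standard genetic code (the two tables differ in which start codons are permitted), and that the input contains only unambiguous A, C, G, T bases. The function names `clean`, `translate`, and `gc_percent` are hypothetical.

```python
BASES = "TCAG"
# Amino acids for all 64 codons in TCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def clean(seq):
    """Drop the spaces and line breaks used to block the sequence."""
    return "".join(seq.split()).upper()


def translate(dna):
    """Translate codon by codon, stopping at the first stop codon."""
    dna = clean(dna)
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[pos:pos + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def gc_percent(dna):
    """Percentage of G and C bases in the sequence."""
    dna = clean(dna)
    return 100.0 * sum(base in "GC" for base in dna) / len(dna)


# First 30 bases and last 15 bases of the gene sequence above.
first_block = "ATGAAACAAA AGATTAGAGC AAATGGATTA"
last_block = "GGAAACGGAA ATTAA"
print(translate(first_block))  # MKQKIRANGL
print(translate(last_block))   # GNGN
```

The two printed fragments match the first ten and last four residues of the protein sequence. Passing the full gene sequence to `translate` and `gc_percent` would allow the same comparison against the complete protein and the listed GC content.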