Gene Cthe_1800 details

Gene Information

Locus tag: Cthe_1800
Symbol:
ID: 4809784
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Clostridium thermocellum ATCC 27405
Kingdom: Bacteria
Replicon accession: NC_009012
Strand:
Start bp: 2126418
End bp: 2127929
Gene Length: 1512 bp
Protein Length: 503 aa
Translation table: 11
GC content: 48%
IMG OID: 640107214
Product: peptidoglycan-binding LysM
Protein accession: YP_001038214
Protein GI: 125974304
COG category: [R] General function prediction only
COG ID: [COG3858] Predicted glycosyl hydrolase
TIGRFAM ID:
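
The Start bp, End bp, Gene Length and Protein Length fields above can be cross-checked with simple arithmetic. The sketch below assumes that the coordinates are 1-based and inclusive and that the gene length counts the terminal stop codon; both assumptions are consistent with the values in this record but are not stated by it. The function names are illustrative only.

```python
# Coordinates copied from the record above.
# Assumption: 1-based, inclusive start and end positions.
START_BP = 2126418
END_BP = 2127929


def gene_length_bp(start: int, end: int) -> int:
    """Length of a feature given 1-based inclusive coordinates."""
    return end - start + 1


def protein_length_aa(gene_len: int) -> int:
    """Residue count for an unspliced CDS.

    Assumption: the CDS length includes one terminal stop codon,
    which does not contribute an amino acid.
    """
    if gene_len % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_len // 3 - 1


gene_len = gene_length_bp(START_BP, END_BP)
print(gene_len, protein_length_aa(gene_len))  # 1512 503
```

Both results agree with the Gene Length (1512 bp) and Protein Length (503 aa) fields.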


Plasmid Coverage information

Num covering plasmid clones: 12
Plasmid unclonability p-value: 0.199953
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: n/a
Fosmid unclonability p-value: n/a
Fosmid Hitchhiker: n/a
Fosmid clonability: n/a
 

Sequence

Gene sequence
ATGTGGTATA CCGTACAGCC GGGAGATTCT CTTTATACCA TTTCCCAAAG ATTTGGAGTA 
ACAATTGCAC AGATAAAAAG TGCCAACCAA CTTACAAGCG ATATTATCTA TGTAGGTCAG
CGTCTATACA TTCCTATAGG AATTCAAGCA CCTGTAGTAT ACACCGTAAG ACCCGGTGAT
ACACTGTATC TGATAGCCCG AAGATACAAC ACCACCGTGG ACAGTCTTAT GGCTCTTAAC
AATCTTAGCA GTACTGAGCT GAGAATCGGC CAGCAGCTAA CCATCCCCCT TTATACCGAA
GCAGTGGTCA ATGTGGGTAC CGCAAACATC CGCAGAGGCC CGGGAACCAA CTTTGGCATA
ATTACTCGCA TGACAAACGG TGCAAGGCTT CCGGTTATCG GTTTTAGCAA CAACTGGTAC
CAGGTACGTC TTTACAACGG AAGAGAAGGC TGGATTTCGG GAAGCATCGT CACCCGCAAT
GTTTACAGCG GACGCAGGCC TATAACAGGA GTACTGGGAT TCTACACCCT TGAGGAAGGT
CCCACCCTCC CAAGCTCTTT TACATCCTTT GCAAACAATA CAGGGCAGCT TTCCTCAACC
GCATTGTTCA TGTTCAGAAT CAGCGCCGCC AACCCAACCA CCATCGAAAA ATTTGGAGAG
TTTACCGACC AGGATGTTCG CAACCTGGTG GCAATTGCCC ACAGGAACAA TGTAAAAATT
ATGCCTGTGG TTCACAACCT GCTGTACAGA CCCGGGGGAA CCACTCTTGC CAAAAACGTT
GTAAAAACTT TGGTCTCAGA CCCAAGAAAC AGAAATGCCT TTGCGCTGAA CCTTGTAAAT
CTCATAGAAA GATACGGCTT TGACGGTGTA AATATTGACA TCGAGGATGT GTTTATAGAA
GACAGCGACA ATCTTTCCCT CCTGTACACC GAGATATCCG AAGTCCTGAG GCCAAGAGGA
TATTTCTTCT CTGCATCGGT TCCTTCAAGG GTAAGCGATG AACCCTTCAA TCCTTTCTCC
GATCCGTTTA ACTACAGTGT GATCGGAAGA GCGGTGGACG AATTTGTCGT AATGCTATAC
AACGAATTCG GATGGCCGGG AAGCCCGCCG GGACCTGCGG TCACCATAGG TTGGATGGAA
CGCGTGCTAA GATACACCAT GAGCAAAATG CCGAGGGATA AGATTATGGC AGCCGTGTCC
GTGTTTGGAT TCGACTTTAA TCTCACCACA GGCCGAAACA CCTATGTGAC TTACCAGTCG
GCGATCAACC TTGCCCGAAG GTACAACAGT GAAATTATTT TCAACGAGGA AAGACAGACG
CCCATGTTTA CCTACAGAGA CGCACAGGGA AATCAGCACG AAGTATGGTT CGAAGATGCC
CGAAGCCTCA GATCCAAAAT TCAGCTGGCC TGGGAACTCG GCATAAAGGG CGTTGCTTTG
TGGAGACTTG GGATGGAAGA CCCAAACATC TGGCCAATGC TTCGAAATGA AGTGGTGGTA
AGGAAGTTTT AA
 
Protein sequence
MWYTVQPGDS LYTISQRFGV TIAQIKSANQ LTSDIIYVGQ RLYIPIGIQA PVVYTVRPGD 
TLYLIARRYN TTVDSLMALN NLSSTELRIG QQLTIPLYTE AVVNVGTANI RRGPGTNFGI
ITRMTNGARL PVIGFSNNWY QVRLYNGREG WISGSIVTRN VYSGRRPITG VLGFYTLEEG
PTLPSSFTSF ANNTGQLSST ALFMFRISAA NPTTIEKFGE FTDQDVRNLV AIAHRNNVKI
MPVVHNLLYR PGGTTLAKNV VKTLVSDPRN RNAFALNLVN LIERYGFDGV NIDIEDVFIE
DSDNLSLLYT EISEVLRPRG YFFSASVPSR VSDEPFNPFS DPFNYSVIGR AVDEFVVMLY
NEFGWPGSPP GPAVTIGWME RVLRYTMSKM PRDKIMAAVS VFGFDFNLTT GRNTYVTYQS
AINLARRYNS EIIFNEERQT PMFTYRDAQG NQHEVWFEDA RSLRSKIQLA WELGIKGVAL
WRLGMEDPNI WPMLRNEVVV RKF
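
The gene and protein sequences above can be checked against each other by translating the nucleotide sequence. The sketch below is a minimal, hand-rolled translator, not the database's own tooling. It relies on the fact that translation table 11 assigns the same amino acids to codons as the standard code (the two differ only in which codons may act as start codons), so a standard codon table is enough for a gene that begins with ATG, as this one does. Only the first and last sequence lines are used here; the names `translate`, `first_codons` and `last_codons` are illustrative.

```python
from itertools import product

# Codon table in the conventional TCAG ordering (TTT, TTC, TTA, TTG, TCT, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 0, stopping at the first stop codon.

    Whitespace (such as the 10-base grouping used in this record) is ignored.
    Any character other than A, C, G or T raises KeyError.
    """
    seq = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":  # stop codon
            break
        protein.append(aa)
    return "".join(protein)


# First 30 bases and final 12 bases of the gene sequence in this record.
first_codons = "ATGTGGTATA CCGTACAGCC GGGAGATTCT"
last_codons = "AGGAAGTTTT AA"

print(translate(first_codons))  # MWYTVQPGDS
print(translate(last_codons))   # RKF
```

The two outputs match the first ten and last three residues of the protein sequence listed above. Passing the full gene sequence to `translate` would allow the whole protein to be compared in the same way.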