Gene Cthe_1930 details


Gene Information

Locus tag: Cthe_1930
Symbol:
ID: 4810788
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Clostridium thermocellum ATCC 27405
Kingdom: Bacteria
Replicon accession: NC_009012
Strand:
Start bp: 2302267
End bp: 2303784
Gene Length: 1518 bp
Protein Length: 505 aa
Translation table: 11
GC content: 42%
IMG OID: 640107346
Product: carboxyl-terminal protease
Protein accession: YP_001038341
Protein GI: 125974431
COG category: [M] Cell wall/membrane/envelope biogenesis
COG ID: [COG0793] Periplasmic protease
TIGRFAM ID: [TIGR00225] C-terminal peptidase (prc)
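
The start, end, and length fields in this record can be cross-checked with simple arithmetic. The sketch below assumes the coordinates are 1-based and inclusive, and that the last codon of the CDS is a stop codon that is not counted in the protein length. Neither assumption is stated in the record, and the variable names are illustrative only.

```python
# Values copied from the Gene Information fields of this record.
start_bp = 2302267
end_bp = 2303784

# Assumption: coordinates are 1-based and inclusive, so both endpoints count.
gene_length = end_bp - start_bp + 1

# Assumption: the CDS ends with one stop codon that encodes no amino acid.
codons = gene_length // 3
protein_length = codons - 1

print(gene_length, protein_length)  # 1518 505
```

Under these assumptions the computed values agree with the listed gene length (1518 bp) and protein length (505 aa).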


Plasmid Coverage information

Num covering plasmid clones: 10
Plasmid unclonability p-value: 0.0674113
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: n/a
Fosmid unclonability p-value: n/a
Fosmid Hitchhiker: n/a
Fosmid clonability: n/a
 

Sequence

Gene sequence
ATGGATGTGA CGGGTAAATC ATTAAAACGG ATACTGTTAT CTTTGGCTGT TTTTTGTATT 
TTAATAACGG GCCCGGGAAT TGCCTGTGCC GAGGAAGCCA CGACGCAAGA AAAGGAAATT
TTGTTTTCCG ACTATTTCAA AAGCATGATG GACATGGCGC AAGACAAATA CAAAGGTGAA
ATAACCGAAA AGCAAATGCT GGAAGGCGCG CTGAAAGGCA TATTCAGCAC AATGGATTCT
TACACCGTCT ATTACACTGT GGAAGAGTCG CAAGACTTTT TTACCGATAT AAACGGCTCT
TACACGGGAA TAGGTGTGGT AATGTCGGAA GTTGACGGTA AAATAGTGAT AGACAAGGTG
TATCCGTCCT CACCGGCGGA GGAAGCAGGG ATAAAAAAAG GCGATGTTAT AGCCCAGGTT
GACGGTAAAA GCGTTGAGAA CCTTTCTTTG GAAGAAGTGG CCGGGCTCAT AAAGGGACCG
TCGGGTACGA AAGTTGTAAT CGGTGTGTTA AGAAACGGAA CAGACGGGGT AATAGAGCTG
GAAGTTACAA GAAGGCAGAT AATTATAAAT CCTGTCACCC ATAAGATAGA AGGCGATATA
GGTTATATAA AGCTGGAATC GTTCAATTCC AATGCAAGCA AGGCTATGGA AGAAGCCTTG
AAACAAATGG ATAAAAACAA TATAAAAAAG ATAATTCTTG ATTTACGGGA CAATCCGGGC
GGAGATGTGG GCCAGGCGGT TTCAATTGCC AGGAAGTTTG TAAAAAAAGG CCTTATTACA
AAGCTGGATT TTAAATCGGA ATCCCAAAAG GATGAAGAGT ATTATTCCTA TTTGGAGGAA
TTAAAATATA AACTTGTGGT ATTGGTGAAC GAAAACAGCG CAAGCGCTTC GGAAATATTG
GCGGGAGCGA TACAGGATAC AGGTTCGGGT ATTTTGGTTG GTACAACAAC CTTTGGAAAG
GGAAAAGTCC AGAATCTTTA TCCCATACTG ACTCCGGAGG CGGTGGAAAA ATATCGGAAA
GAAACCGGAG AAACATTTGT GAACGGATAC GATTTATTGG AAAAACACGG CATCTATCCT
TCCGATGAAG AAATAATCGG ATGGGTGAAA ATAACAACCG GGGAATATTA TACCCCCAAC
GGAAGGATGA TAGACGGAGT AGGTCTTGAG CCCGACGTTT ATGTTGAAAA TGAACCGGAG
GGAAAATATA AAATCCTTGA AGGTGTGGAA AAACTTCGCA AGGTGACAAA ACCTTCTTTA
AATGCTCAAA GTGAGGATGT CCTGAATGCG GAAAAAATTT TATCGGCACT GGGGTATGAT
GTTGACACTC CGGACAATTT AATGGATGAA AAGACCGTCA AGGCGGTGGC AGAGTTTCAG
AGAGACTGCG GATTGTATTC TTACGGAGTT TTGGACTTTG CCACACAGCA GGCGCTGAAT
GACAAATTGG ATGAATTGCT TCTTGTAAAG AACAGAGACA AACAGTATGA AAAGGCGGTG
GAACTGCTTG AAAATTAG
 
Protein sequence
MDVTGKSLKR ILLSLAVFCI LITGPGIACA EEATTQEKEI LFSDYFKSMM DMAQDKYKGE 
ITEKQMLEGA LKGIFSTMDS YTVYYTVEES QDFFTDINGS YTGIGVVMSE VDGKIVIDKV
YPSSPAEEAG IKKGDVIAQV DGKSVENLSL EEVAGLIKGP SGTKVVIGVL RNGTDGVIEL
EVTRRQIIIN PVTHKIEGDI GYIKLESFNS NASKAMEEAL KQMDKNNIKK IILDLRDNPG
GDVGQAVSIA RKFVKKGLIT KLDFKSESQK DEEYYSYLEE LKYKLVVLVN ENSASASEIL
AGAIQDTGSG ILVGTTTFGK GKVQNLYPIL TPEAVEKYRK ETGETFVNGY DLLEKHGIYP
SDEEIIGWVK ITTGEYYTPN GRMIDGVGLE PDVYVENEPE GKYKILEGVE KLRKVTKPSL
NAQSEDVLNA EKILSALGYD VDTPDNLMDE KTVKAVAEFQ RDCGLYSYGV LDFATQQALN
DKLDELLLVK NRDKQYEKAV ELLEN
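
The gene and protein sequences in this record can be checked against each other by translating the DNA, and the GC content can be recomputed from the gene sequence. The following is a minimal sketch, not the database's own method. The helper names `clean`, `translate`, and `gc_percent` are hypothetical. It relies on the fact that translation table 11 assigns the same amino acids to codons as the standard genetic code (the tables differ in which start codons are permitted), and it assumes the sequence begins with ATG, as this one does, so no alternative start codon handling is needed.

```python
# Codon table built from the conventional TCAG ordering of the standard code.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def clean(seq):
    """Remove the spaces and line breaks used in the 10-residue block layout."""
    return "".join(seq.split()).upper()


def translate(dna):
    """Translate DNA in frame 1, stopping at the first stop codon."""
    dna = clean(dna)
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def gc_percent(dna):
    """Return the percentage of G and C bases in the sequence."""
    dna = clean(dna)
    return 100.0 * (dna.count("G") + dna.count("C")) / len(dna)


# First 30 bases and last 18 bases of the gene sequence listed above.
print(translate("ATGGATGTGA CGGGTAAATC ATTAAAACGG"))  # MDVTGKSLKR
print(translate("GAACTGCTTG AAAATTAG"))               # ELLEN
```

The two printed fragments match the beginning and end of the listed protein sequence. Passing the full gene sequence to `translate` and `gc_percent` would allow the whole protein and the GC content field to be compared in the same way.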