Gene Cthe_1010 details


Gene Information

Locus tag: Cthe_1010
Symbol:
ID: 4811304
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Clostridium thermocellum ATCC 27405
Kingdom: Bacteria
Replicon accession: NC_009012
Strand:
Start bp: 1207340
End bp: 1208575
Gene Length: 1236 bp
Protein Length: 411 aa
Translation table: 11
GC content: 43%
IMG OID: 640106428
Product: peptidase U32
Protein accession: YP_001037435
Protein GI: 125973525
COG category: [O] Posttranslational modification, protein turnover, chaperones
COG ID: [COG0826] Collagenase and related proteases
TIGRFAM ID:
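
The coordinates and lengths listed above are mutually consistent, which a little arithmetic confirms. The following is a minimal sketch in Python, assuming the start and end coordinates are 1-based and inclusive and that the gene length counts the stop codon (both are assumptions about this record's conventions; the function names are illustrative and not part of any database API).

```python
def gene_length(start_bp: int, end_bp: int) -> int:
    """Length of a feature given 1-based, inclusive coordinates (assumed)."""
    return end_bp - start_bp + 1


def protein_length(cds_length_bp: int) -> int:
    """Residue count for a CDS whose length includes one stop codon (assumed)."""
    if cds_length_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return cds_length_bp // 3 - 1  # drop the stop codon


length = gene_length(1207340, 1208575)
print(length)                  # 1236
print(protein_length(length))  # 411
```

Both results match the Gene Length and Protein Length fields of the record.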


Plasmid Coverage information

Num covering plasmid clones: 29
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: n/a
Fosmid unclonability p-value: n/a
Fosmid Hitchhiker: n/a
Fosmid clonability: n/a
 

Sequence

Gene sequence
ATGAAAAAAG TTGAACTGCT TGCTCCCGCA GGCAATCTTG AAAAACTTAA AATGGCCGTT 
TTATATGGTG CGGACGCCGT TTACCTGGGC GGTGAGGAAT TCAGCCTCAG AGCTTATGCC
GAGAATTTTA CATTGGATGA GCTGAAAGCA GGAGTGGAAT TTGCCCATAG CAAAGGAAAA
AAAGTATATG TAACCATTAA TATATTCCCT CACAATGATG ATTTGAAGAA AATACCGGAA
TATATAAAAG AAGTTGCAGG GATCGGAGTC GATGCCATAA TCCTCTCAGA CCCCGGCATT
CTCTCCATTG TGAAAGAAAT AGCTCCGGAT ATGGAAATAC ATTTAAGCAC CCAGGCCAAC
AATACTAATT TTATGAGTGC CAGATTTTGG CACAATCACG GTGTAAAACG GATAATACTT
GCAAGAGAGC TTTCCCTTGA GGAAATCCGG GAAATAAGAG AAAAAACTCC TGACTCTCTG
GAGCTTGAAG TTTTTGTCCA CGGTGCCATG TGCATATCCT ATTCCGGAAG GTGTCTTCTC
AGCAATTACA TGGCCGGCAG GGATTCCAAC AGGGGACTGT GCGCACATCC ATGCAGGTGG
AAATATTACT TGATGGAGGA AAAAAGACCC GGTGAATACT ACCCGGTATA TGAAAATGAA
AGAGGCACAT TCATTTTCAA CTCCAGGGAC CTCTGTATGA TTGAGCACAT ACCGGAATTG
GTGGAATCCG GAGTTTCCAG CTTTAAAATT GAAGGCCGCA TGAAAAGCTC TTTCTACGTC
GCAACGGTCG TAAAAGCATA TCGCGAAGCA ATAGATGCCT ATTATGAGGA TAAAGACAAC
TATAAATTCG ATCCCAGGCT TTTGGAGGAA GTCTGCAAAG TCAGTCACAG GGAATTCACC
ACCGGCTTTT TCTTCAACAA GCCCGGCCCG AAAGACCAGA TTTACGCCAC CAGCTCATAT
ATAAGAGAGT ATGACTTTGT AGGGGTTGTT CAAAAATATG ACAAAGCAAC AAAAATAGCA
ACCGTAGAAC AAAGAAACCG CATGTACAAA GGTGAGGAAA TAGAAGTTGT AAATCCCAAA
GGCAATTTTT TTGTTCAGAA AATTGAATGG ATGAAAAATG CCGACGGTGA AGACATAGAC
GTTGCCCCCC ACCCTCAAAT GACGGTATAT ATGCCGATGA AAGAGGATGT GGAAGAATTT
GCAATGCTCA GGCGAAAAAG CAGTCCAAAT AAATAA
 
Protein sequence
MKKVELLAPA GNLEKLKMAV LYGADAVYLG GEEFSLRAYA ENFTLDELKA GVEFAHSKGK 
KVYVTINIFP HNDDLKKIPE YIKEVAGIGV DAIILSDPGI LSIVKEIAPD MEIHLSTQAN
NTNFMSARFW HNHGVKRIIL ARELSLEEIR EIREKTPDSL ELEVFVHGAM CISYSGRCLL
SNYMAGRDSN RGLCAHPCRW KYYLMEEKRP GEYYPVYENE RGTFIFNSRD LCMIEHIPEL
VESGVSSFKI EGRMKSSFYV ATVVKAYREA IDAYYEDKDN YKFDPRLLEE VCKVSHREFT
TGFFFNKPGP KDQIYATSSY IREYDFVGVV QKYDKATKIA TVEQRNRMYK GEEIEVVNPK
GNFFVQKIEW MKNADGEDID VAPHPQMTVY MPMKEDVEEF AMLRRKSSPN K
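
The protein sequence is the translation of the gene sequence. As a rough illustration, the sketch below translates the first and last 10-base blocks of the gene sequence and compares them with the ends of the protein sequence. It assumes the sense codons of translation table 11 match the standard genetic code (table 11 differs only in its permitted start codons); the `translate` helper is illustrative, not a library function.

```python
from itertools import product

# Standard codon table in NCBI order (bases T, C, A, G); '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string, ignoring the spaces between blocks."""
    seq = "".join(dna.split()).upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


# First three blocks of the gene sequence (bases 1-30).
print(translate("ATGAAAAAAG TTGAACTGCT TGCTCCCGCA"))         # MKKVELLAPA
# Last line of the gene sequence (bases 1201-1236, in frame).
print(translate("GCAATGCTCA GGCGAAAAAG CAGTCCAAAT AAATAA"))  # AMLRRKSSPNK*
```

The first result matches the start of the protein sequence, and the second matches its end, followed by the TAA stop codon.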