Gene Cthe_1020 details


Gene Information

Locus tag: Cthe_1020
Symbol:
ID: 4811314
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Clostridium thermocellum ATCC 27405
Kingdom: Bacteria
Replicon accession: NC_009012
Strand:
Start bp: 1220262
End bp: 1221641
Gene Length: 1380 bp
Protein Length: 459 aa
Translation table: 11
GC content: 42%
IMG OID: 640106438
Product: extracellular solute-binding protein
Protein accession: YP_001037445
Protein GI: 125973535
COG category: [G] Carbohydrate transport and metabolism
COG ID: [COG1653] ABC-type sugar transport system, periplasmic component
TIGRFAM ID:
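
The coordinate and length fields above can be cross-checked with simple arithmetic. The following is a minimal sketch, assuming the start and end positions are 1-based and inclusive and that the final codon of the CDS is a stop codon; the function names are illustrative and not part of any database API.

```python
# Values copied from the Gene Information fields above.
START_BP = 1220262
END_BP = 1221641
GENE_LENGTH_BP = 1380
PROTEIN_LENGTH_AA = 459


def span_length(start: int, end: int) -> int:
    """Length of a feature, assuming 1-based inclusive coordinates."""
    return end - start + 1


def expected_protein_length(cds_length_bp: int) -> int:
    """Residues encoded by a CDS, assuming the last codon is a stop codon."""
    if cds_length_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return cds_length_bp // 3 - 1


if __name__ == "__main__":
    print(span_length(START_BP, END_BP) == GENE_LENGTH_BP)
    print(expected_protein_length(GENE_LENGTH_BP) == PROTEIN_LENGTH_AA)
```

Under these assumptions both comparisons hold: the coordinate span equals the listed gene length, and 1380 bp corresponds to 460 codons, one of which is the stop codon.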


Plasmid Coverage information

Num covering plasmid clones:
Plasmid unclonability p-value: 0.0000000124434
Plasmid hitchhiking: No
Plasmid clonability: unclonable
 

Fosmid Coverage information

Num covering fosmid clones: n/a
Fosmid unclonability p-value: n/a
Fosmid Hitchhiker: n/a
Fosmid clonability: n/a
 

Sequence

Gene sequence
ATGCTCAAGA AGGTAATCGC ATTAATGTTG GTTGCTGTTA TGGCTTTAAG TCTGGCAGCA 
TGTGGTGGTG GAGGAGGAAA TACTACGACT TCACCGCAAC CAAACGATTC CCAAAATTCG
CCTGATTCAG GAACAAAGAA GGACCCAGTA AAATTGACCA TGTGGATCAT GCCTAACAGT
GACACACCGG ACCAGGATCT TTTGAAAGTT GTTAAGCCAT TCACAGATGC TAATCCTCAT
ATCACAGTTG AACCTACAGT TGTTGACTGG AGTGCAGCTT TGACAAAGAT CACAGCTGCT
GCTACAAGTG GTGAAGCTCC TGACATTACA CAGGTTGGTT CCACTTGGAC AGCTGCTATC
GGTGCAATGG AAGGTGCATT GGTTGAGCTT ACCGGAAAAA TCGATACAAG TGCTTTCGTT
GAATCAACTC TGCAGTCAGC TTATATCAAA GGCACAGACA AGATGTTCGG TATGCCTTGG
TTTACTGAAA CAAGAGCTCT CTTCTACAGA AAAGACGCTT GCGAAAAAGC AGGTGTAAAT
CCTGAAACAG ATTTCGCAAC TTGGGACAAA TTCAAAGATG CTCTCAAGAA ACTCAACGGT
ATTGAAGTTG ACGGCAAGAA ACTGGTTGCA CTGGGTATGC CGGGTAAGAA CGACTGGAAC
GTTGTTCATA ACTTCTCATG GTGGATTTAC GGTGCCGGCG GAGACTTTGT AAACGAAGAA
GGTACACAAG CTACTTTCTC AAGCGAAAAT GCTCTTAAAG GTATCAAATT CTATTCAGAA
CTTGCTGTTG AAGGTTTGAT GGATGAGCCT TCACTTGAAA AGAATACAAG TGACATTGAG
TCCGCATTTG GTGACGGTGC ATACGCTACT GCATTCATGG GTCCTTGGGT TATTTCATCT
TACACAAAGA ATAAAGAAGA AAACGGTAAC GACCTTATCG ACAAAATTGG TGTTACTATG
GTTCCTGAAG GACCTGCAGG AAGATATGCA TTCATGGGTG GAAGTAACCT TGTAATATTC
AACTCATCAA AGAACAAGGA TGAAGCCGTT GAACTTCTCA AGTTCTTTGC TAGCAAAGAA
GCTCAGGTTG AATACTCAAA GGTTAGCAAG ATGCTTCCGG TTGTTAAAGC GGCTTACGAA
GATCCATACT TTGAAGATTC ATTGATGAAA GTATTCAAAG AACAGGTAGA CAAATATGGT
AAACACTATG CATCAGTTCC TGGTTGGGCT TCTGCAGAAG TTATCTTCTC AGAAGGTCTC
AGCAAGATCT GGGATAACGT TATGGAAGTT GATGGTGCAT ACAGCTACGA CAAGACTGTA
CAAATCGTAA AAGATGTTGA AAGTCAAATC AACCAAATAT TGCAAGAAAC AAGCAAATAA
 
Protein sequence
MLKKVIALML VAVMALSLAA CGGGGGNTTT SPQPNDSQNS PDSGTKKDPV KLTMWIMPNS 
DTPDQDLLKV VKPFTDANPH ITVEPTVVDW SAALTKITAA ATSGEAPDIT QVGSTWTAAI
GAMEGALVEL TGKIDTSAFV ESTLQSAYIK GTDKMFGMPW FTETRALFYR KDACEKAGVN
PETDFATWDK FKDALKKLNG IEVDGKKLVA LGMPGKNDWN VVHNFSWWIY GAGGDFVNEE
GTQATFSSEN ALKGIKFYSE LAVEGLMDEP SLEKNTSDIE SAFGDGAYAT AFMGPWVISS
YTKNKEENGN DLIDKIGVTM VPEGPAGRYA FMGGSNLVIF NSSKNKDEAV ELLKFFASKE
AQVEYSKVSK MLPVVKAAYE DPYFEDSLMK VFKEQVDKYG KHYASVPGWA SAEVIFSEGL
SKIWDNVMEV DGAYSYDKTV QIVKDVESQI NQILQETSK
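
The relationship between the gene sequence and the protein sequence can be illustrated with a short translation routine. This is a minimal sketch, not the method used to produce this record: it assumes translation table 11 assigns the same amino acids to codons as the standard code (the tables differ in which codons may act as starts), and it does not handle alternative start codons, which is sufficient here because the gene begins with ATG. The names `CODON_TABLE` and `translate` are hypothetical helpers.

```python
# Codon table built from the conventional TCAG ordering of bases.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1, stopping at the first stop codon.

    Whitespace (such as the 10-base grouping used above) is ignored.
    """
    seq = "".join(dna.split()).upper()
    protein = []
    for pos in range(0, len(seq) - 2, 3):
        residue = CODON_TABLE[seq[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


if __name__ == "__main__":
    # First 30 bases of the gene sequence listed above.
    print(translate("ATGCTCAAGA AGGTAATCGC ATTAATGTTG"))
```

Applied to the first 30 bases of the gene sequence, the routine yields the first ten residues of the listed protein sequence, MLKKVIALML.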