Gene Cthe_2117 details


Gene Information

Locus tag: Cthe_2117
Symbol:
ID: 4810977
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Clostridium thermocellum ATCC 27405
Kingdom: Bacteria
Replicon accession: NC_009012
Strand:
Start bp: 2514578
End bp: 2515606
Gene Length: 1029 bp
Protein Length: 342 aa
Translation table: 11
GC content: 46%
IMG OID: 640107524
Product: putative sulfonate transport system substrate-binding protein
Protein accession: YP_001038517
Protein GI: 125974607
COG category: [P] Inorganic ion transport and metabolism
COG ID: [COG0715] ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components
TIGRFAM ID:
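
The coordinates and lengths reported above can be cross-checked with a short calculation. This is a minimal sketch, not part of the original record: it assumes the start and end positions are 1-based and inclusive (the record does not say so, but the reported gene length fits that convention) and that the final codon is a stop codon that is not counted in the protein length. The variable names are illustrative.

```python
# Values copied from the gene record.
start_bp = 2514578
end_bp = 2515606

# Assumption: coordinates are 1-based and inclusive, so both ends count.
gene_length = end_bp - start_bp + 1

# Assumption: the last codon is a stop codon and is not part of the protein.
codons, remainder = divmod(gene_length, 3)
protein_length = codons - 1

print(gene_length)     # 1029 (bp)
print(remainder)       # 0, the length is a whole number of codons
print(protein_length)  # 342 (aa)
```

Both results agree with the Gene Length and Protein Length fields of the record.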


Plasmid Coverage information

Num covering plasmid clones: 31
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: n/a
Fosmid unclonability p-value: n/a
Fosmid Hitchhiker: n/a
Fosmid clonability: n/a
 

Sequence

Gene sequence
TTGAAAAGAA GAAATGTTTT AACGGCAATA TTCATGGCAA TTATACTTGC CTTTGCCGCT 
ACAAGCTGCG GTATTGAGAG AACTCCGTCA CCGTCTCCGA ACCAGACTTC TTCCGCAGAC
GGAAAACAAA AAATCAGTTT CGTCCTTGAC TGGGTTCCCA ACACAAATCA CACCGGAATC
TATGTTGCAA AAGTAAAGGG ATATTTTGCC GAAGAGGGAC TGGATGTTGA TATAATCCAG
CCCGGTGAGT CATCCGCCGA CCAAATGGTG GCCACCAACA CAGCGCAGTT CGGAATCAGC
TACCAGGAAG GAGTTACTTT CGCCCGTGCT TCAGGAGCGC CCTTGGTATC CCTGGCAGCG
GTTATACAAC ACAATACCTC CGGCTTTGGC TCGCCTAAAG ACAAACAAAT CGTTTCCCCA
AAGGATTTTG AAGGCAAAAA ATACGGAGGA TGGGGTTCGG AAGTTGAAGA GGCCATGGTA
AGGCAAGTTG TGAAAGATGC AGGGGGAGAC CCGGACAAAG TGCAAATCGT TACCATCGGC
ACAGCGGATT TCCTTCAGGC ATGTGAAACC GGCCAAATTG ATTTTGCCTG GATTTTCGAA
GGCTGGGATT TCATCAATGC CGCCAACAAG GGAGTCGAAC TAAACTATAT TCCCTTAAGG
GAACTTTCCG AAACCTTTGA CTACTACACC CCGGTAATTG TTACAAATGA GGATAACATA
AAGAATAATC CCGAACTTGT GAAAAAATTC ATGAGAGCCG TGGAAAAGGG CTATAAATTC
GCCATGGAAA ATCCCGATGA AGCGGCGGAA TGCCTGCTTC AACTGGCACC GGAGCTTGAC
CGGAAGCTTG TGGTAAAAAG CCAGCGCTTT TTGGCCTCAA AGTATCAGGA TGACGCTCCC
TATTGGGGAA TGCAGAAAAA AGAAGTATGG GAAAGGTACA TGAACTGGCT TTACGAAAAT
AAATTCATAG ATGCCCCCAT AGATGTGGAA AAGGCCTTTA CCAACGACTT TTTGCAAAAT
GGACAGTAA
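
The gene sequence is printed in blocks of ten bases, so it needs to be joined before its length or GC content can be computed. The sketch below shows one way to do that in plain Python. The helper names `clean_sequence` and `gc_fraction` are hypothetical, and for brevity only the first row of the sequence is used, so the value it produces is for that row alone and is not expected to equal the GC content reported for the whole gene.

```python
def clean_sequence(text: str) -> str:
    """Join a block-formatted sequence: drop all whitespace, upper-case the rest."""
    return "".join(text.split()).upper()


def gc_fraction(seq: str) -> float:
    """Return the fraction of bases in seq that are G or C."""
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in seq) / len(seq)


# First row of the gene sequence, copied from the record (illustration only).
first_row = "TTGAAAAGAA GAAATGTTTT AACGGCAATA TTCATGGCAA TTATACTTGC CTTTGCCGCT"

seq = clean_sequence(first_row)
print(len(seq))                   # number of bases in this row
print(f"{gc_fraction(seq):.1%}")  # GC content of this row only
```

Passing the full gene sequence text to `clean_sequence` instead of `first_row` would give figures comparable to the Gene Length and GC content fields.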
 
Protein sequence
MKRRNVLTAI FMAIILAFAA TSCGIERTPS PSPNQTSSAD GKQKISFVLD WVPNTNHTGI 
YVAKVKGYFA EEGLDVDIIQ PGESSADQMV ATNTAQFGIS YQEGVTFARA SGAPLVSLAA
VIQHNTSGFG SPKDKQIVSP KDFEGKKYGG WGSEVEEAMV RQVVKDAGGD PDKVQIVTIG
TADFLQACET GQIDFAWIFE GWDFINAANK GVELNYIPLR ELSETFDYYT PVIVTNEDNI
KNNPELVKKF MRAVEKGYKF AMENPDEAAE CLLQLAPELD RKLVVKSQRF LASKYQDDAP
YWGMQKKEVW ERYMNWLYEN KFIDAPIDVE KAFTNDFLQN GQ