Gene Cthe_1985 details


Gene Information

Locus tag: Cthe_1985
Symbol:
ID: 4810917
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Clostridium thermocellum ATCC 27405
Kingdom: Bacteria
Replicon accession: NC_009012
Strand:
Start bp: 2364837
End bp: 2366108
Gene Length: 1272 bp
Protein Length: 423 aa
Translation table: 11
GC content: 46%
IMG OID: 640107401
Product: phage major capsid protein, HK97
Protein accession: YP_001038396
Protein GI: 125974486
COG category: [R] General function prediction only
COG ID: [COG4653] Predicted phage phi-C31 gp36 major capsid-like protein
TIGRFAM ID: [TIGR01554] phage major capsid protein, HK97 family
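
The coordinate and length fields in this record can be cross-checked against each other. The sketch below is illustrative only: it assumes the start and end positions are 1-based and inclusive, and that the unspliced CDS ends in a single stop codon. The function names are hypothetical and not part of any database API.

```python
# Values copied from the record above.
start_bp = 2364837
end_bp = 2366108
gene_length_bp = 1272
protein_length_aa = 423


def span_length(start: int, end: int) -> int:
    """Length of a coordinate range, assuming 1-based inclusive positions."""
    return end - start + 1


def expected_protein_length(cds_length_bp: int) -> int:
    """Residues encoded by an unspliced CDS that ends in one stop codon."""
    if cds_length_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    # One codon per residue, minus the terminal stop codon.
    return cds_length_bp // 3 - 1


coords_ok = span_length(start_bp, end_bp) == gene_length_bp
protein_ok = expected_protein_length(gene_length_bp) == protein_length_aa
print(coords_ok, protein_ok)
```

Under those assumptions both checks agree with the record: 2366108 - 2364837 + 1 = 1272 bp, and 1272 / 3 - 1 = 423 aa.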


Plasmid Coverage information

Num covering plasmid clones: 41
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: n/a
Fosmid unclonability p-value: n/a
Fosmid Hitchhiker: n/a
Fosmid clonability: n/a
 

Sequence

Gene sequence
GTGGAGAAAG CAGAAGCTAT TTTAAACGAA GCGGAAAAAG CGGGTGGAAG TTTGACAAAG 
GAACAGGAGC GACAGTTTAA TCGCTACACA GACAAAATAA AGAGCATTAA TGAAAGCATT
GACGAGGAAT TATTAAATAT CAGAACCTCT GAGCCAATAC TGAATATGCC ACAAAAAGCT
GTATTTCCTA TTGAAGAATC AAAAACTCCT GTAACAAAAG CCGTATCCAA ATCATTCAGA
GGGATGTTCT ATGGCAATGA AACAGTAAAA CTCAGCAACA ATGGATTTCA TTCCATGGAT
GAATTCCTGA GAACACTTCA CTCAGGCAGA GCCGACAACA GGCTAATAAA TGCCAGTATG
GTGGAAGGAA TACCCGAATT CGGCGGATAT TCCGTACCGG AGGAATACGG AGCCTTCCTG
ATGGATAAAT CCCTGGAGAA TGAAATCATC CGTCCCAGAG CAACAGTATG GGCAATGGGA
AGCGAAACAA AGAAAGTACC AGCCTTCGAT GGAGCAGACA GAACCAATCA CCTATTCGGT
GGTATTTCAG GAGAATGGCT GGAGGAAGGT CAGACAGGCA CACGAAAGAC CGCCAAGCTA
AGACTGATCC AATTAAAGGC CAAGAAGCTT GCCTGCTTCT CACAGGCATC CAATGAACTC
ATTGCAGATG GTATGTCCTT TGAAGAAATG CTTGCCGGAG CGCTTATTAA AGGCTTGGGC
TGGTACATGG ACTATGCCTT TATCAATGGA ACCGGTGAAG GTCAGCCTCT TGGTATTATA
AATGACCCGG CGCTGATTAC TGTAGATAAA GAGGACTCTC AAGCAGCAGC CACAATTACC
TATCAAAACG TGGTCAATAT GTTTTCAAGG CTTGCTCCGT CCTGTTTTAC CAATGCGGTA
TGGCTTGCCA ATCCATCGGT AATACCACAA TTGCTTACCA TGACTATCAC CATTGGTACC
GGTGGCGCTC AGATACCGGT GTTCAGGGAA GAGAGCGGGA AATTCACGCT TCTGGGTAAG
GAGGTCTTAT TCACTGAGAA ATGCCCCGCA TTGGGTGCTA AGGGAGATTT AATCCTCGCA
GACCTTAGCC AGTATGCCAT AGGCATGAGG AAAGAGATCG CTCTTGACCG CTCCAATGTC
CCAGGCTGGA TGGAGGATAT GACCGACTAC AGGGTGATAG TGCGTGTAGA TGGTCAGGGA
ACCTGGGATA AACCTATAAC ACCGAAAAAC GGAGCAACGC TCTCATGGGC AGTGGCTTTG
GAGGCAAGAT AG
 
Protein sequence
MEKAEAILNE AEKAGGSLTK EQERQFNRYT DKIKSINESI DEELLNIRTS EPILNMPQKA 
VFPIEESKTP VTKAVSKSFR GMFYGNETVK LSNNGFHSMD EFLRTLHSGR ADNRLINASM
VEGIPEFGGY SVPEEYGAFL MDKSLENEII RPRATVWAMG SETKKVPAFD GADRTNHLFG
GISGEWLEEG QTGTRKTAKL RLIQLKAKKL ACFSQASNEL IADGMSFEEM LAGALIKGLG
WYMDYAFING TGEGQPLGII NDPALITVDK EDSQAAATIT YQNVVNMFSR LAPSCFTNAV
WLANPSVIPQ LLTMTITIGT GGAQIPVFRE ESGKFTLLGK EVLFTEKCPA LGAKGDLILA
DLSQYAIGMR KEIALDRSNV PGWMEDMTDY RVIVRVDGQG TWDKPITPKN GATLSWAVAL
EAR
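
The protein sequence corresponds to the gene sequence read with translation table 11. The following is a minimal sketch of that translation, not the pipeline that produced this record. The helper names are hypothetical, the start-codon set is a non-exhaustive subset of the table 11 initiators, and treating an initiator in first position as M is a simplifying assumption.

```python
from itertools import product

BASES = "TCAG"
# Amino acids in NCBI order (first base varies slowest). Table 11 uses the
# same codon assignments as the standard code and differs in start codons.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
# Assumption: a non-exhaustive subset of the initiators allowed by table 11.
START_CODONS = {"ATG", "GTG", "TTG"}


def clean(seq: str) -> str:
    """Drop the spaces and line breaks used to group bases in blocks of 10."""
    return "".join(seq.split()).upper()


def translate_cds(seq: str) -> str:
    """Translate a CDS from its first base up to the first stop codon.

    Raises KeyError on codons containing anything other than A, C, G, T.
    """
    dna = clean(seq)
    residues = []
    for i in range(0, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        aa = CODON_TABLE[codon]
        if aa == "*":
            break
        # An alternative initiator in first position is read as methionine.
        if i == 0 and codon in START_CODONS:
            aa = "M"
        residues.append(aa)
    return "".join(residues)


# First line of the gene sequence above.
first_block = "GTGGAGAAAG CAGAAGCTAT TTTAAACGAA GCGGAAAAAG CGGGTGGAAG TTTGACAAAG"
print(translate_cds(first_block))  # MEKAEAILNEAEKAGGSLTK
```

The output matches the first 20 residues of the protein sequence. The leading GTG codon would be valine internally but is read as M at the start position.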