Gene Cthe_1719 details

Gene Information

Locus tag: Cthe_1719
Symbol:
ID: 4808894
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Clostridium thermocellum ATCC 27405
Kingdom: Bacteria
Replicon accession: NC_009012
Strand:
Start bp: 2039699
End bp: 2040988
Gene Length: 1290 bp
Protein Length: 429 aa
Translation table: 11
GC content: 37%
IMG OID: 640107132
Product: HK97 family phage major capsid protein
Protein accession: YP_001038133
Protein GI: 125974223
COG category:
COG ID:
TIGRFAM ID: [TIGR01554] phage major capsid protein, HK97 family
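
The coordinate and length fields above can be cross-checked against each other. The following is a minimal sketch, assuming the start and end positions are 1-based and inclusive (the record does not state this, but the numbers are consistent with it) and that the gene length includes the stop codon. The variable names are illustrative and not part of the record.

```python
# Values copied from the Gene Information fields above.
start_bp = 2039699
end_bp = 2040988
gene_length_bp = 1290
protein_length_aa = 429

# Assumption: coordinates are 1-based and inclusive, so length = end - start + 1.
span = end_bp - start_bp + 1

# Assumption: the gene length includes the terminal stop codon,
# which is not counted in the protein length.
codon_count = gene_length_bp // 3
expected_aa = codon_count - 1

print("span matches gene length:", span == gene_length_bp)
print("length is a multiple of 3:", gene_length_bp % 3 == 0)
print("protein length matches:", expected_aa == protein_length_aa)
```

Under these assumptions all three checks print True for this record.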


Plasmid Coverage information

Num covering plasmid clones: 55
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: n/a
Fosmid unclonability p-value: n/a
Fosmid Hitchhiker: n/a
Fosmid clonability: n/a
 

Sequence

Gene sequence
ATGTATATGT CAGAAAAAAT GAAAGAATTA TTAGCTCAGT TATCAAATTT AGAAACAGAA 
TCTAAAAACC TTATAAATAA AGAGGAGGCT ACAGCTGATG AAATTAATGC AAAGCTTTCT
GAAATTAAGG CTTTAAAAGC TAAAATTGAG GCACAAAAAG AAATTGATGC ATTAAATGCA
GAAAGAGAAA AACAGGCTAA GACGCCAGTG AATGAACCAA TATATGCTCA GCCAAAGAAT
CACAATGAAA AGAAGTGGAA GTGCATGGGA GAATTTTTAA GTGCCGTTGC AAAGGCTTCA
TCTCCCGGAG GAAGAATGGA CAACAGATTA ACTTATCAGA ACTCAGCAAC AGGACTTAAT
GAAAGCATAG CTTCAGAAGG AGGATTTTTA CTAGAAAATG AGTTTATAAA TGACTTATTT
GAATCCATGA TGGCACAAAG TCAGGTGGCA AACAGAATAA GAATGATACC AATAGGGGCT
AATACCAATA GACTTAGGGC GCTTGGAATT GATGAAAACA GCAGAGCCAA TGGCTCAAGA
TGGGGAGGTG TACAGGCTTA CTGGGTAGCT GAAGCAGAAA CAGCAGCTCA AAGCAAGCCA
AAGTTTAGGG AAATTGAAAT GTCACTTCAA AAGCTTTTAG CACTTTGCTA TGTAACCGAT
GACCTTTTAC AAGATACTAC AGCACTTGAA GCTATAGTAA GGCAAGCTTA TGCAGATGAA
ATGAGTTTTA AAATAGATGA TGCAATCATT AATGGTACTG GTGTTGGAAT GCCCCTTGGA
ATATTAAACT CTGATGCATT AGTTACAGTA CCCAAGGAAA AAGATCAAGG AGCAGGAACA
ATTAAGTATG AAAATATACT TAAAATGTGG AGTTCAATGC CTGCAAGACT TAGAGCAAAT
GCAGTATGGT ATATAAATCA AGAGATAGAA CCACAGCTTT ACACTATGGC TCTTAATATT
GGAGCTGGTG GAGCACCTGT GTTTATGCCT TCCGGTGGAG CTGCAGCATC ACAGTACAGT
ACCTTACTTA ATAGACCAAT AATTCCAATA GAGCAGTGTT CACCTCTTGG TAAAAAGGGA
GATATTATTT TAGCTGACCC AACCCAGTAT ATTGGAATAG ATAAAAAAGG TTTAACTTCT
GATGTATCTA TCCATGTAAG ATTTTTATAT GATGAGCAGG TATTCAGATT CATCTATAAG
TTCAATGGAA TGCCTTATAA GAATAAGCCA ATTATGCCTT ACAAGGGTGC AAATCCACTA
AGTCCTTTTG TAACTTTAGC AGATAGGTAG
 
Protein sequence
MYMSEKMKEL LAQLSNLETE SKNLINKEEA TADEINAKLS EIKALKAKIE AQKEIDALNA 
EREKQAKTPV NEPIYAQPKN HNEKKWKCMG EFLSAVAKAS SPGGRMDNRL TYQNSATGLN
ESIASEGGFL LENEFINDLF ESMMAQSQVA NRIRMIPIGA NTNRLRALGI DENSRANGSR
WGGVQAYWVA EAETAAQSKP KFREIEMSLQ KLLALCYVTD DLLQDTTALE AIVRQAYADE
MSFKIDDAII NGTGVGMPLG ILNSDALVTV PKEKDQGAGT IKYENILKMW SSMPARLRAN
AVWYINQEIE PQLYTMALNI GAGGAPVFMP SGGAAASQYS TLLNRPIIPI EQCSPLGKKG
DIILADPTQY IGIDKKGLTS DVSIHVRFLY DEQVFRFIYK FNGMPYKNKP IMPYKGANPL
SPFVTLADR
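
The relationship between the gene sequence and the protein sequence can be illustrated with a short translation sketch. It assumes the sequence is read in frame from its first base and uses the standard codon assignments, which translation table 11 shares with the standard code (the two differ in permitted start codons, which this sketch does not model). The names `CODON_TABLE` and `translate` are illustrative. Only the first and last rows of the gene sequence are used here; each full row is 60 bases, so the rows are in frame.

```python
from itertools import product

# Codon table in the conventional T, C, A, G ordering.
# The amino acid assignments are the same for translation table 11
# and the standard code.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string, stopping at the first stop codon."""
    seq = "".join(dna.split()).upper()  # drop the spaces between 10-base groups
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First and last rows of the gene sequence, copied from the record.
first_row = "ATGTATATGT CAGAAAAAAT GAAAGAATTA TTAGCTCAGT TATCAAATTT AGAAACAGAA"
last_row = "AGTCCTTTTG TAACTTTAGC AGATAGGTAG"

print(translate(first_row))  # MYMSEKMKELLAQLSNLETE
print(translate(last_row))   # SPFVTLADR
```

The two outputs correspond to the start and the end of the protein sequence listed above; the final TAG codon is the stop and contributes no residue.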