Gene Cthe_3218 details

Gene Information

Locus tag: Cthe_3218
Symbol:
ID: 4809520
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Clostridium thermocellum ATCC 27405
Kingdom: Bacteria
Replicon accession: NC_009012
Strand:
Start bp: 3816440
End bp: 3817435
Gene Length: 996 bp
Protein Length: 331 aa
Translation table: 11
GC content: 37%
IMG OID: 640108652
Product: CRISPR-associated Cas1 family protein
Protein accession: YP_001039606
Protein GI: 125975696
COG category: [L] Replication, recombination and repair
COG ID: [COG1518] Uncharacterized protein predicted to be involved in DNA repair
TIGRFAM ID: [TIGR00287] CRISPR-associated endonuclease Cas1
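
The gene and protein lengths listed above follow from the start and end coordinates. A minimal Python sketch of that arithmetic, assuming the coordinates are 1-based and inclusive and that the final codon of the CDS is a stop codon (the function names are illustrative, not part of any database tool):

```python
# Coordinates copied from the Gene Information table.
START_BP = 3816440
END_BP = 3817435


def gene_length(start: int, end: int) -> int:
    """Feature length in bp, assuming 1-based inclusive coordinates."""
    return end - start + 1


def protein_length(gene_len: int) -> int:
    """Residues encoded by an unspliced CDS whose last codon is a stop."""
    if gene_len % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_len // 3 - 1


length_bp = gene_length(START_BP, END_BP)
length_aa = protein_length(length_bp)
print(length_bp, length_aa)  # 996 331
```

Both values agree with the Gene Length and Protein Length fields of this record.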


Plasmid Coverage information

Num covering plasmid clones:
Plasmid unclonability p-value: 0.0000280385
Plasmid hitchhiking: Yes
Plasmid clonability: hitchhiker
 

Fosmid Coverage information

Num covering fosmid clones: n/a
Fosmid unclonability p-value: n/a
Fosmid Hitchhiker: n/a
Fosmid clonability: n/a
 

Sequence

Gene sequence
ATGAAGCTGA TTCTGAATAC ACCCGGACTT TACCTTTGTA AAAGGGGAGA GTGCTTTCAG 
ATTCAAAGCG AGAATGAAAA ACGGGAAATA GCTGCAACTA AAGTTGACCA AATAATGATA
ACCACTCATG CAGCTCTTAC AACTGATGCA ATTGAACTGG CTCTTGAATA CAATATTGAC
ATAATATTTT TAAAAAACAC GGGACAACCA ATGGGGCGTG TATGGCATTC AAAACTTGGA
AGTATCAGTA CAATTAGAAG AAAACAATTA TTCCTCCAGG ATAGCCCGCT TGGGCTTCAA
CTTGTAAAAG AATGGATCTT GGAGAAAATG GATAATCAAA TACGGCTGTT AAAGAAACTG
GAAGTTAACC GCAGAGATGA TGAAAAACGT GCTATAATCA GGGATACCAT AGAAAAGATT
GAAAAGCAAA AAGCTAACAT TATGTCGATT AACAATAAAG AAACAGTGAA CAATGTAAGA
AATATGCTTC TTGGTTATGA AGGAACTGCC GGAAGAGTAT ATTTTGAAAC CCTTGGCAAG
TTGATTCCTG AGAAATATGC TTTTGAAGCG AGAAGCCGGA ATCCTGCGAA AGATCCTTTC
AACTGTATGC TCAACTATTC CTACGGTATT TTATATTCCA GCGTTGAAAA AGCCTGCATA
ATTGCAGGAT TAGACCCATA CATTGGCATT ATGCATACCG ACAATTATAA TAAGAAAGCT
CTTGTATATG ATATGGTTGA AATGTACAGA GGATATATGG ATGAAATAGT TTTCAGGCTG
TTTAGCACAA AGAAAGTTCA AGACGATTTT TTTGACAAGA TTGAAGATGG TTACTATCTC
AATAAAGAAG GAAAGCAACT GCTAATATCC GAGTATAACA AAGAGCTGGA AGTCAAAATG
AATTACAGAG GAAGAAGAAT AGAATTTGCC AACATAATCC AGTACGACTG CCATCAGATT
GCAAATCGCA TACTGAAGGA GGATATACCA TGTTGA
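
The GC content field can be recomputed from the gene sequence. A minimal Python sketch, assuming the sequence is supplied as a plain string that may contain the spaces and line breaks used in this record; `gc_percent` is an illustrative helper name, and the demonstration uses only the first ten bases of the gene:

```python
def gc_percent(seq: str) -> float:
    """Percentage of G and C bases, ignoring whitespace and letter case."""
    cleaned = "".join(seq.split()).upper()
    if not cleaned:
        raise ValueError("empty sequence")
    gc_count = sum(1 for base in cleaned if base in "GC")
    return 100.0 * gc_count / len(cleaned)


# First ten bases of the gene sequence: 4 of 10 are G or C.
first_block = "ATGAAGCTGA"
print(gc_percent(first_block))  # 40.0
```

Passing the full 996 bp sequence to the same function gives a value that can be compared with the GC content field.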
 
Protein sequence
MKLILNTPGL YLCKRGECFQ IQSENEKREI AATKVDQIMI TTHAALTTDA IELALEYNID 
IIFLKNTGQP MGRVWHSKLG SISTIRRKQL FLQDSPLGLQ LVKEWILEKM DNQIRLLKKL
EVNRRDDEKR AIIRDTIEKI EKQKANIMSI NNKETVNNVR NMLLGYEGTA GRVYFETLGK
LIPEKYAFEA RSRNPAKDPF NCMLNYSYGI LYSSVEKACI IAGLDPYIGI MHTDNYNKKA
LVYDMVEMYR GYMDEIVFRL FSTKKVQDDF FDKIEDGYYL NKEGKQLLIS EYNKELEVKM
NYRGRRIEFA NIIQYDCHQI ANRILKEDIP C
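
The protein sequence can be checked against the gene sequence by translating it codon by codon. Translation table 11 assigns the same amino acid to each codon as the standard genetic code, so a standard codon table suffices for this purpose. A minimal Python sketch, assuming the reading frame starts at the first base; `CODON_TABLE` and `translate` are illustrative names, and only the first and last 18 bases of the gene are used:

```python
from itertools import product

# Codon table built in T, C, A, G order; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq: str) -> str:
    """Translate a DNA string in frame 1, ignoring whitespace and case."""
    cleaned = "".join(seq.split()).upper()
    if len(cleaned) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(
        CODON_TABLE[cleaned[i:i + 3]] for i in range(0, len(cleaned), 3)
    )


print(translate("ATGAAGCTGA TTCTGAAT"))  # MKLILN
print(translate("GAGGATATAC CATGTTGA"))  # EDIPC*
```

The two outputs match the start (MKLILN) and end (EDIPC, followed by the stop codon) of the protein sequence listed above.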