Gene Namu_3049 details


Gene Information

Locus tag: Namu_3049
Symbol:
ID: 8448662
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Nakamurella multipartita DSM 44233
Kingdom: Bacteria
Replicon accession: NC_013235
Strand:
Start bp: 3359248
End bp: 3360204
Gene Length: 957 bp
Protein Length: 318 aa
Translation table: 11
GC content: 66%
IMG OID: 645042132
Product: CRISPR-associated protein Cas1
Protein accession: YP_003202374
Protein GI: 258653218
COG category: [L] Replication, recombination and repair
COG ID: [COG1518] Uncharacterized protein predicted to be involved in DNA repair
TIGRFAM ID: [TIGR00287] CRISPR-associated endonuclease Cas1
            [TIGR03638] CRISPR-associated endonuclease Cas1, ECOLI subtype
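
The coordinates and lengths in this record can be cross-checked with simple arithmetic. The sketch below assumes the start and end positions are 1-based and inclusive (the record does not state this, but the numbers agree under that assumption) and that the CDS ends with a single untranslated stop codon.

```python
# Values copied from the record above.
start_bp = 3359248
end_bp = 3360204

# Assumption: coordinates are 1-based and inclusive,
# so the span covers end - start + 1 bases.
gene_length = end_bp - start_bp + 1

# Assumption: the final codon is a stop codon and is not
# counted in the protein length.
protein_length = gene_length // 3 - 1

print(gene_length)     # 957
print(protein_length)  # 318
```

Both values match the Gene Length and Protein Length fields listed in the record.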


Plasmid Coverage information

Num covering plasmid clones: 14
Plasmid unclonability p-value: 0.0000891408
Plasmid hitchhiking: Yes
Plasmid clonability: hitchhiker
 

Fosmid Coverage information

Num covering fosmid clones:
Fosmid unclonability p-value: 0.000905677
Fosmid Hitchhiker: Yes
Fosmid clonability: hitchhiker
 

Sequence

Gene sequence
ATGAGGAAGA TCCCCGGCAC TCGTCCGCCC GAACTTCCCG AGCTCGTCCG CGCGCAGGAC 
CGGATCTCCT TCGTCTACCT CGAACGATGC ATCGTCCACC GCCAGGACAA CGCGATCACG
GCGACGGACG AACGAGGGAC CGTCCATCTG CCGGCGGCCA CCCTCGGCGC CCTGCTCCTA
GGTCCAGGAA CCCGGGTCAG TCACCAAGCG ATGGTGCTGC TGGCCGAGTC CGGTTCCACG
GCAGTGTGGG TCGGCGAGCG GGGCGTCCGC TACTACGCGC ATGGTCGCAG CCTGGCTCGC
TCGTCGCGGC TGCTGGAGGC GCAGGCCGCG ATCGTGAGCA ATCAGCAGCG CCGATTGGCC
GTCGCCCGGG CCATGTATGC CATGCGATTT CCCGGTGAGG ATGTCGAAGG CCAAACGATG
CAACAGCTTC GGGGTCGCGA GGGTGCCCGT GTCCGGCGCG TGTACCGATC GATGTCCGCC
GAGACCGGTG TGGTTTGGGA CAAACGGGAC TACAACAGCG AGGATTTTGC GTCCGGGACG
CTCATCAATC AGGCGCTCTC GGCGGCCCAC ACCTGCTTGT ACGGGATTGT GCACGCGGTG
ATCGTCGCCC TCGGTTGCTC GCCGGGCCTC GGGGTGGTCC ACACCGGACA CGTTCGGTCA
TTCGTCTTTG ACATCGCCGA TCTCTACAAG GCCGAAATTT CCATCCCGGT GGCCTTCCGA
GTTGCAGCCA CTGAACCCGA GGACGTGGGC GCGGAGACGC GACGGGCGGT TCGCGACGCC
GTGCACGACG GCAAGATCCT CGCCCGTTGC GCCCGAGACA TCCGTCAACT CCTCTTGCCG
GACCAGGATC CGGTCGAGGA CGACGTGGAC GCCGACGTCA TCAATCTCTG GGACGGCGAT
GATCGGGTCG TGTCCGGAGG GACCGGCTAC GTGGAGTCGG ACACATGGTC GTTCTGA
 
Protein sequence
MRKIPGTRPP ELPELVRAQD RISFVYLERC IVHRQDNAIT ATDERGTVHL PAATLGALLL 
GPGTRVSHQA MVLLAESGST AVWVGERGVR YYAHGRSLAR SSRLLEAQAA IVSNQQRRLA
VARAMYAMRF PGEDVEGQTM QQLRGREGAR VRRVYRSMSA ETGVVWDKRD YNSEDFASGT
LINQALSAAH TCLYGIVHAV IVALGCSPGL GVVHTGHVRS FVFDIADLYK AEISIPVAFR
VAATEPEDVG AETRRAVRDA VHDGKILARC ARDIRQLLLP DQDPVEDDVD ADVINLWDGD
DRVVSGGTGY VESDTWSF
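
The relationship between the gene sequence, the GC content field, and the protein sequence can be reproduced with a short script. This is a minimal sketch, not the pipeline used to build the record: the helper names `clean`, `gc_content`, and `translate` are hypothetical, and the codon table is the standard amino-acid assignment that NCBI translation table 11 shares with table 1 (the tables differ in permitted start codons, which does not matter here because this gene starts with ATG).

```python
from itertools import product

# Amino-acid assignments for NCBI translation table 11, with codons
# enumerated in T, C, A, G order at each position (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def clean(seq):
    """Remove the spaces and line breaks between 10-letter blocks."""
    return "".join(seq.split()).upper()


def gc_content(seq):
    """Return the percentage of G and C bases in a nucleotide sequence."""
    s = clean(seq)
    return 100.0 * sum(base in "GC" for base in s) / len(s)


def translate(seq):
    """Translate codon by codon, stopping at the first stop codon."""
    s = clean(seq)
    protein = []
    for i in range(0, len(s) - 2, 3):
        aa = CODON_TABLE[s[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First three blocks of the gene sequence listed above.
first_blocks = "ATGAGGAAGA TCCCCGGCAC TCGTCCGCCC"
print(translate(first_blocks))  # MRKIPGTRPP
```

The output matches the first block of the protein sequence. Passing the full gene sequence to `translate` and `gc_content` would allow the same comparison against the complete protein sequence and the GC content field.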