Gene Ndas_3355 details

Gene Information

Locus tag: Ndas_3355
Symbol:
ID: 9247219
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Nocardiopsis dassonvillei subsp. dassonvillei DSM 43111
Kingdom: Bacteria
Replicon accession: NC_014210
Strand:
Start bp: 4009129
End bp: 4010439
Gene Length: 1311 bp
Protein Length: 436 aa
Translation table: 11
GC content: 70%
IMG OID:
Product: extracellular solute-binding protein family 1
Protein accession: YP_003681267
Protein GI: 297562293
COG category:
COG ID:
TIGRFAM ID:
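
The coordinate and length fields above are consistent with each other, which can be checked with a little arithmetic. The following is a minimal sketch, assuming the start and end positions are 1-based and inclusive and that the CDS ends in a single stop codon; the function names are illustrative and not part of any database API.

```python
def gene_length_bp(start_bp: int, end_bp: int) -> int:
    """Length of a feature, assuming 1-based inclusive coordinates."""
    return end_bp - start_bp + 1


def protein_length_aa(gene_length: int) -> int:
    """Codon count minus one terminal stop codon (assumes an unspliced CDS)."""
    if gene_length % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_length // 3 - 1


length = gene_length_bp(4009129, 4010439)
print(length)                     # 1311
print(protein_length_aa(length))  # 436
```

Both results match the Gene Length and Protein Length values listed in the record.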


Plasmid Coverage information

Num covering plasmid clones: 22
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 18
Fosmid unclonability p-value:
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGCCAGTGC ATTCACACCA GAAGGGGTAC ACAGTGCGCG CATGCGCTGC CGGGCTCGCC 
GCACTGAGCC TCACCCTGGC CACCGCGTGC GGCGAACCCC CAGGGGAGGA GACCGCGGGG
GAGGGCGGCG AGACCGCGGG CCTGGCCGAG GACCCGGCGT TCGTCTGGGC GGTCACCGGG
GCCGACCGCG CGATCCACGA GGAGGTCGCG CGCCTGTGGA ACGAGAACAA CCCCGACCAG
CAGGTCGACA TCTTCTTCCT GGCCCCCACC GCCGACGAGC AGCGCCAGGC CATGTTCCAG
GACCTCCAGA ACCAGGCGGG GGAGTTCGAC GTCCTGGGCC TGGACGTCAT CTGGACCGGT
GAGTTCGCCG AGTACGGCTA CGTCGAGAGC CTGGAGGACC TGCGCGGCGA GGTCGAGGGC
GTCAGCCTGG AGGGCGCCGT CGACAGCTCC CAGTGGCAGC AGGAGCTGTT CGCGCTGCCC
TACTCCTCCA ACGGCGCGTT CCTCTACTAC CGCACCGACC TCGTCGAGGA GCCCCCCACG
ACCTGGGAGG AGCTCTACGA CACCGGCATG GCGGCCGCCG AGGAGGAGGG CATCTCCGCC
TACGTCGGCC AGGGCGACCA GTACGAGGGC TTCGTCGTCA ACTTCCTGGA GCTGTACTGG
TCCGCGGGCG GCCAGCTGTT CGACGACGCC CAGGAGCAGA GCACCTTCCT GGAGGGCGAC
GCCGCCACCA CGGCCCTGGA CTTCATGACC GAGGCCTACG AGAGCGGGTT CTACGCCGAC
GGCTTCGACA CCATGGTCGA GGACGACGCG CGCGCCCTGT TCCAGGCGGG CGAGGCCGTC
TACATGCGCA ACTGGCCCTA CGCCATCCCG CTCCTGGCGG GCGAGGGCGA CGAGGAGAGC
GCGGTCGCGG ACGACTTCGC GGTCGCCCCG CTGCCCACCT TCACCGGTGA GGGCACCACC
AGCGCGCTGG GCGGTCTCAA CAACGCGGTC AGCACGCTGA GCGACAGCAA GGAGCTGGCC
CGCGAGTTCG TCCTGTGGGC GGCCACCGAC CCCGAGGCGC AGGACATCCT GCTCCAGAAC
AGCCTGCCGC CGACCATGGC GAGCGCCTAC GAGAACACCG ACGACCCCGA CTTCCAGATG
CTCGGCGACA TCCTCGCCCA GGCCCAGGCC CGCCCGCCGG TGCCCGGCTA CAACTCGCTG
TCCCTGGCCG TGCAGGACAA CCTGCACCCG GCCTTCCGGG GCCAGGAGGA GTCCGGCACG
GCCCTGGAGG CGGTGGACCA GGCGGCCAAC GACGCGCTCG AACAAGAGTA G
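
The GC content reported for this gene can be recomputed from the sequence text. This is a minimal sketch, assuming the sequence is supplied as a plain string in which spaces and line breaks only group the bases; `gc_content` is a hypothetical helper name, and the example input is just the first line of the gene sequence above, not the whole gene.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C bases, ignoring whitespace used for grouping."""
    bases = "".join(seq.split()).upper()
    if not bases:
        raise ValueError("empty sequence")
    gc = sum(1 for base in bases if base in "GC")
    return 100.0 * gc / len(bases)


# First line of the gene sequence above, used as a short example input.
first_line = "ATGCCAGTGC ATTCACACCA GAAGGGGTAC ACAGTGCGCG CATGCGCTGC CGGGCTCGCC"
print(f"{gc_content(first_line):.1f}%")
```

Passing the full 1311 bp sequence instead of `first_line` gives the gene-level figure to compare with the GC content field.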
 
Protein sequence
MPVHSHQKGY TVRACAAGLA ALSLTLATAC GEPPGEETAG EGGETAGLAE DPAFVWAVTG 
ADRAIHEEVA RLWNENNPDQ QVDIFFLAPT ADEQRQAMFQ DLQNQAGEFD VLGLDVIWTG
EFAEYGYVES LEDLRGEVEG VSLEGAVDSS QWQQELFALP YSSNGAFLYY RTDLVEEPPT
TWEELYDTGM AAAEEEGISA YVGQGDQYEG FVVNFLELYW SAGGQLFDDA QEQSTFLEGD
AATTALDFMT EAYESGFYAD GFDTMVEDDA RALFQAGEAV YMRNWPYAIP LLAGEGDEES
AVADDFAVAP LPTFTGEGTT SALGGLNNAV STLSDSKELA REFVLWAATD PEAQDILLQN
SLPPTMASAY ENTDDPDFQM LGDILAQAQA RPPVPGYNSL SLAVQDNLHP AFRGQEESGT
ALEAVDQAAN DALEQE