Gene Ndas_2036 details

Gene Information

Locus tag: Ndas_2036
Symbol:
ID: 9245886
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Nocardiopsis dassonvillei subsp. dassonvillei DSM 43111
Kingdom: Bacteria
Replicon accession: NC_014210
Strand:
Start bp: 2457296
End bp: 2458615
Gene Length: 1320 bp
Protein Length: 439 aa
Translation table: 11
GC content: 70%
IMG OID:
Product: extracellular solute-binding protein family 1
Protein accession: YP_003679968
Protein GI: 297560994
COG category:
COG ID:
TIGRFAM ID:
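
The coordinate and length fields in this record can be cross-checked with simple arithmetic. The sketch below is only an illustration: it assumes the start and end positions are 1-based and inclusive, and that the CDS ends with a stop codon that is not counted in the protein length. The function names are hypothetical and not part of any database API.

```python
# Values copied from the record above.
START_BP = 2457296
END_BP = 2458615
GENE_LENGTH_BP = 1320
PROTEIN_LENGTH_AA = 439


def span_length(start: int, end: int) -> int:
    """Length of a feature, assuming 1-based inclusive coordinates."""
    return end - start + 1


def expected_protein_length(gene_length: int) -> int:
    """Number of codons minus one, assuming a terminal stop codon."""
    if gene_length % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_length // 3 - 1


if __name__ == "__main__":
    # Both comparisons hold for this record under the stated assumptions.
    print(span_length(START_BP, END_BP) == GENE_LENGTH_BP)
    print(expected_protein_length(GENE_LENGTH_BP) == PROTEIN_LENGTH_AA)
```

Under these assumptions, 2458615 - 2457296 + 1 gives 1320 bp, and 1320 / 3 - 1 gives 439 aa, which agrees with the listed values.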


Plasmid Coverage information

Num covering plasmid clones: 10
Plasmid unclonability p-value: 0.182591
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 16
Fosmid unclonability p-value: 0.899466
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGCACGCAC ACCCTGGCAA CGGGCGCCCC CAGGCCCGTC GCCGCAGCCC TTTCCTCTCC 
GTGGCCGCGG GAGTGGGCGC CGTCGTCATG GCGGCCACCG CCTGTGGCGG CTCCGAGCCC
GACGACGGCA CGGTCGAGCT GCGCTTCTCC TGGTGGGGCG CGGACGACCG CCACTCCACC
ACCCAGCAGG TCATCGACCT GTTCGAGGCC GAGAACCCGG GCATCACCAT CGTCCCCGAG
TACACCGACT GGGCCGGGTA CTGGGACCGC CTGGCCACCA GCACCGCTGC CAACGACGCC
CCCGACATCA TCACGCAGGA GGAGCGCTAC CTGCGCGAGT ACGGCGACCG CGGCGCCCTG
CTCGACCTCA ACGAGGTCGA GTTGGACCTG TCGGGGATCG ACCCCCTGGT GGCCGAGAGC
GGCGACCTGG ACGGGCAGAC CTTCGGTGTG GCCACCGGCG TGAACGCCTA CGCGATCCTG
GCCGACCCCC AGGCCTTCGA GGACGCGGGC GTGGAGATGC CCGACGACGA GACGTGGACC
TGGGACGACT ACATCGAGAT CTCGGCCCAG ATCAGCGAGG CCACCGACGG CGAGGTCGTG
GGCACGCAGA GCATGTCCTA CAACGAGACG GGCTTCCAGA TCTTCGCCCG CCAGCGCGGG
GAGAACCTCT ACGCCGAGGA CGGATCCCTC GGCTTCTCCC AGGAGACGCT GGAGGAGTGG
TTCTCCGTCA CCGAGCAGCT GGTGGAGAAC GGCGGCCAGC CCGGCGCGGC CGAGAGCGTG
GAGATCGAGG CGGGCGGCCC CGACCAGTCG GTGCTCTCCA CCAACCAGGG CGCGATGGCC
CACTTCTGGA CCAACCAGCT GGGCGGCATC TCGGCCTCCT CCGGCCGCGA CATCGAACTG
CTGCGCTACC CGGGTGAGAG CACCGAGGAC CGCACCGGCA TGTTCTTCAA GCCGGCCATG
TTCTACTCCA TCTCCGCGGG CACCGAGCAC CCGGAGGAGG CGGCGCTGTT CGTGGACTTC
CTGCTCAACA GCGAGGAGGC GGCCGAGCTG ATCCTGGCCG ACCGCGGTCT GCCCGCCAAC
GTGGACGTGC GGTCCCACAT CATCGACTCT CTGCCCGAGG CCGACGCCCG CAGCGCGGTG
TTCCTGTCCG AGATCGAGGG CACGATCGTG GACGGCAACC CGCCGCCGCC GATCGGCGCG
GGGCAGGTCG TGGACATCAC CAAGCGCGTC ACCGAGGACC TGACCTTCGG CGAGCTGACC
CCGGCCGAGG CCGCGGAGCA GTTCATGTCG GAGGTCGAGG CGGCCACCGG CGAGGCCTGA
 
Protein sequence
MHAHPGNGRP QARRRSPFLS VAAGVGAVVM AATACGGSEP DDGTVELRFS WWGADDRHST 
TQQVIDLFEA ENPGITIVPE YTDWAGYWDR LATSTAANDA PDIITQEERY LREYGDRGAL
LDLNEVELDL SGIDPLVAES GDLDGQTFGV ATGVNAYAIL ADPQAFEDAG VEMPDDETWT
WDDYIEISAQ ISEATDGEVV GTQSMSYNET GFQIFARQRG ENLYAEDGSL GFSQETLEEW
FSVTEQLVEN GGQPGAAESV EIEAGGPDQS VLSTNQGAMA HFWTNQLGGI SASSGRDIEL
LRYPGESTED RTGMFFKPAM FYSISAGTEH PEEAALFVDF LLNSEEAAEL ILADRGLPAN
VDVRSHIIDS LPEADARSAV FLSEIEGTIV DGNPPPPIGA GQVVDITKRV TEDLTFGELT
PAEAAEQFMS EVEAATGEA
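
The protein sequence can be related back to the gene sequence by translating codons. The following is a minimal sketch, not the tool used to produce this record. It relies on the fact that translation table 11 assigns the same amino acids to codons as the standard code; the alternative start codons that table 11 allows are not handled here, which does not matter for this gene because it starts with ATG. The `translate` helper is a hypothetical name introduced for illustration.

```python
from itertools import product

# Codon table in the conventional TCAG ordering (third base varies fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string; stop codons are shown as '*'.

    Whitespace (such as the 10-base grouping used above) is ignored.
    """
    cleaned = "".join(dna.split()).upper()
    if len(cleaned) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(
        CODON_TABLE[cleaned[i:i + 3]] for i in range(0, len(cleaned), 3)
    )


if __name__ == "__main__":
    # First 30 bases and last 9 bases of the gene sequence listed above.
    print(translate("ATGCACGCAC ACCCTGGCAA CGGGCGCCCC"))
    print(translate("GAGGCCTGA"))
```

The first call yields the opening ten residues of the protein sequence (MHAHPGNGRP). The second yields the final two residues followed by the stop codon, which is why the 1320 bp gene corresponds to 439 amino acids rather than 440.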