Gene Ndas_1039 details

Gene Information

Locus tag: Ndas_1039
Symbol:
ID: 9244885
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Nocardiopsis dassonvillei subsp. dassonvillei DSM 43111
Kingdom: Bacteria
Replicon accession: NC_014210
Strand:
Start bp: 1279894
End bp: 1281405
Gene Length: 1512 bp
Protein Length: 503 aa
Translation table: 11
GC content: 74%
IMG OID:
Product: extracellular solute-binding protein family 5
Protein accession: YP_003678988
Protein GI: 297560014
COG category:
COG ID:
TIGRFAM ID:
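
The coordinates and lengths in this record can be cross-checked with simple arithmetic. The sketch below assumes the start and end positions are 1-based and inclusive, and that the protein length excludes the terminal stop codon; neither convention is stated in the record, and the function names are illustrative only.

```python
# Values copied from the Gene Information fields of this record.
START_BP = 1279894
END_BP = 1281405
GENE_LENGTH_BP = 1512
PROTEIN_LENGTH_AA = 503


def span_length(start: int, end: int) -> int:
    """Length of a coordinate range, assuming 1-based inclusive positions."""
    return end - start + 1


def expected_protein_length(gene_length_bp: int) -> int:
    """Number of codons in the CDS minus one for the stop codon."""
    if gene_length_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_length_bp // 3 - 1


print(span_length(START_BP, END_BP))            # 1512
print(expected_protein_length(GENE_LENGTH_BP))  # 503
```

Under those assumptions both values agree with the Gene Length and Protein Length fields.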


Plasmid Coverage information

Num covering plasmid clones: 13
Plasmid unclonability p-value: 0.78614
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 10
Fosmid unclonability p-value: 0.0480029
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGAGACGAC GATCCCTCAT GGCCGGTGCG TGCGCACTGG CCCTGGTCGC GACCGGCTGC 
GGCGGCGGAG GGGAGCAGAC CGCCGCCTCC GAGGAAGCCG CCCTGCGCTA CGCCGCGGTG
GGGGCGCCCG CCGCCACCAC CCACGACCCG CACGGACGGG TCGGCAACGA GGCCGACTAC
CTGCGCTTCG CCATGCTCTA CGACGTCCTC ACCGTCACCG ACGAGCAGGG CGCGGTCCAG
CCGCGCCTGG CCACGGCGTG GGAGCCGGTG GACGGCGACC TCACCCGCTG GAGCGTCACC
CTGCGCGACG ACGCCGCCTT CTCCGACGGC CAGCCCGTCA CGGCCGACGA CGTCCTGTTC
TCGCTGCGCC GCATCCAGGG CAAGGGCGCG GAGAACAACG GCCGCCTGTC CATGTTCGAC
CTGGAGGCCT CCAGCGCCAC CGGCGAGCAC GGGCTGGAAC TGGTCACCCG CCAGCCCTAC
GCCGAGGTCG GCCTGGCCCT GGCCTCCCTC ACCTTCGTCG TGCCCGAGGG CAGCGAGGAC
ATCACCGAGC CCGTCGCGGG CTCGGGCCCC TTCGTCCTGG ACGAGGGCGA CGACACCACC
TCCGTCCTGA GCCGCAACGA CGACTGGTGG GGTGAGGCGC CCTCCTACGA GAGGCTGGAG
ATCACCGCCA TGCCCGACCC GGCCGCGCGC GCCGCCGCCG TCGCCTCCGG GCAGGCCGAC
GTCGCCGGGA GCGTCGCCCC GGCCACCGCC GAGCAGTACG CCGACGGCGG TGAGGTCGAG
GTGGTCACCC GCCCCGGCGG CGTCAACTAC CCGCTGGTCA TGGACCTGGA GACCGAGCCC
TTCGACGACC CCGACGTCCG CGAGGCGGTC AAGCTCGCCC TGGACCGCGA GCAGCTCGTG
GAGACCGTCT TCCTGGGCTA CGGGGAGCCC GGCGCCGACC TCCTCAGCCC CCTGGAGCCC
TTCGCCCCGG ACGCGCCCCC GGTGGAGCGC GACCTGGACC GGGCGCGCGA ACTGCTGGAG
GAGGCCGGGC ACGGCGGCGG GGTGTCCCTC ACCCTGCACG CCACCAATTC CTACCCTGGC
ATGGAGGAGG CCGCGGTGCT GGTCTCCGAG CAGCTCGCCG AGGCCGGGAT CGAGGTCGAG
GTCGAGGTGG GAGCCCCCGA CACCTACTGG ACCGAGGTCT GGAACGTCGA ACCCTTCTAC
CTCAACAGCC TCGGCGGCAA CGGCTTCGTG GACTTCTCCC GGATGGCGCT GCTCGCGGAC
GGCCCCATCA ACGAGACCGG CTGGAACGAC CCCGAGTGGG ACGCCGCCTT CAACGACGCG
CTGGCCACCG CCGACGAGGC CGAGCGCCAC GCGGCCCTGG GCGGACTCCA GCAGCGCATC
GCCGACGAGG GCGGCTACGT GGTCTGGGGC GTGGGCGACG GGATCGACCT CACCGCGCCC
GGCGTGACCG ACCTGCCCAC CGGCCCCGGG TTCCACCGGC TGCGCGTCGA ACAGGTCGGG
GTGCCGGGCT GA
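
The gene sequence is printed in space-separated blocks of ten bases, so the spacing has to be removed before the sequence is measured or analyzed. The following is a minimal sketch of that clean-up plus a GC-fraction calculation; the helper names are hypothetical, and how the database rounds its GC content field is not stated in the record.

```python
def clean_sequence(text: str) -> str:
    """Remove spaces and line breaks and upper-case the bases."""
    return "".join(text.split()).upper()


def gc_fraction(seq: str) -> float:
    """Fraction of bases in seq that are G or C."""
    if not seq:
        raise ValueError("empty sequence")
    gc = sum(1 for base in seq if base in "GC")
    return gc / len(seq)


# First printed line of the gene sequence above.
first_line = "ATGAGACGAC GATCCCTCAT GGCCGGTGCG TGCGCACTGG CCCTGGTCGC GACCGGCTGC"
seq = clean_sequence(first_line)
print(len(seq))  # 60
print(round(gc_fraction(seq), 2))
```

Pasting the full gene sequence into `clean_sequence` and passing the result to `gc_fraction` gives a value that can be compared with the GC content field.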
 
Protein sequence
MRRRSLMAGA CALALVATGC GGGGEQTAAS EEAALRYAAV GAPAATTHDP HGRVGNEADY 
LRFAMLYDVL TVTDEQGAVQ PRLATAWEPV DGDLTRWSVT LRDDAAFSDG QPVTADDVLF
SLRRIQGKGA ENNGRLSMFD LEASSATGEH GLELVTRQPY AEVGLALASL TFVVPEGSED
ITEPVAGSGP FVLDEGDDTT SVLSRNDDWW GEAPSYERLE ITAMPDPAAR AAAVASGQAD
VAGSVAPATA EQYADGGEVE VVTRPGGVNY PLVMDLETEP FDDPDVREAV KLALDREQLV
ETVFLGYGEP GADLLSPLEP FAPDAPPVER DLDRARELLE EAGHGGGVSL TLHATNSYPG
MEEAAVLVSE QLAEAGIEVE VEVGAPDTYW TEVWNVEPFY LNSLGGNGFV DFSRMALLAD
GPINETGWND PEWDAAFNDA LATADEAERH AALGGLQQRI ADEGGYVVWG VGDGIDLTAP
GVTDLPTGPG FHRLRVEQVG VPG
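
The protein sequence can be related back to the gene sequence by codon translation. The sketch below builds the codon-to-amino-acid mapping that translation table 11 shares with the standard genetic code. Table 11 differs in its set of permitted start codons, which does not matter here because this gene begins with ATG. The `translate` helper is illustrative and raises `KeyError` on ambiguous bases such as N.

```python
from itertools import product

# Codon assignments in NCBI order (T, C, A, G at each position).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a CDS from its first base, stopping at the first stop codon."""
    seq = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 30 bases of the gene sequence above.
print(translate("ATGAGACGAC GATCCCTCAT GGCCGGTGCG"))  # MRRRSLMAGA
# Last 12 bases, ending in the TGA stop codon.
print(translate("GTGCCGGGCT GA"))  # VPG
```

The two outputs match the first ten and last three residues of the protein sequence listed above.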