Gene Ndas_3999 details


Gene Information

Locus tag: Ndas_3999
Symbol:
ID: 9247871
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Nocardiopsis dassonvillei subsp. dassonvillei DSM 43111
Kingdom: Bacteria
Replicon accession: NC_014210
Strand:
Start bp: 4782517
End bp: 4783875
Gene Length: 1359 bp
Protein Length: 452 aa
Translation table: 11
GC content: 71%
IMG OID:
Product: extracellular solute-binding protein family 1
Protein accession: YP_003681902
Protein GI: 297562928
COG category:
COG ID:
TIGRFAM ID:
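
The coordinate and length fields above are mutually consistent, which can be checked with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes that Start bp and End bp are 1-based inclusive coordinates and that Protein Length excludes the terminal stop codon.

```python
def gene_length_bp(start_bp: int, end_bp: int) -> int:
    """Length of a feature given 1-based inclusive coordinates (assumed)."""
    return end_bp - start_bp + 1


def protein_length_aa(gene_length: int) -> int:
    """Residue count for an unspliced CDS, assuming the final codon is a stop."""
    if gene_length % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    # Total codons minus the stop codon, which encodes no residue.
    return gene_length // 3 - 1


# Values copied from the record for Ndas_3999.
length = gene_length_bp(4782517, 4783875)
print(length, protein_length_aa(length))
```

With the values in this record, the computed lengths agree with the listed Gene Length (1359 bp) and Protein Length (452 aa).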


Plasmid Coverage information

Num covering plasmid clones: 11
Plasmid unclonability p-value: 0.220113
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 13
Fosmid unclonability p-value: 0.250177
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
GTGCACACCA CCTACGCCGT CCGGGCGACC GCCGTCGCCG CCTGCGGACT GCTGCTCACC 
GGGTGCGCCG GGACGGGCGC CCTCCCCTCC GAGGACGGCG CCGTCCGGCT GACCGTCGCG
ATCGTGTCCA ACCCGCAGAT GCAGGACGCG ATCTCCCTGG AGTCGCGGTT CCGCGCGGAG
CACCCCGGCA TCGCGCTCGA CTTCGTGTCC CTGCCCGAGA ACGAGGCCCG CGCCAAGATC
ACCACCTCCG TGGCCACCGG CGGGGGCGAG TTCGACGTCG TCATGATCAG CAATTACGAG
ACCCGCCAGT GGGCCGAGTA CGGCTGGCTG GAGAACCTGC AACCCTCCAT CGACGCCGCC
GGGGGCTACG ACCACGAGGA CTTCATCCCC TCCATCAGGG AGGACCTGTC CCACGAGGGC
GACATGTACT CGGTGCCCTT CTACGGCGAG TCCTCCTTCC TCGCCTACCG CAAGGACCTG
TTCGAGCAGG CCGGGGTGGA GATGCCGCCC GACCCCACCT GGGAGGAGGT GGCGGACCTG
GCCGCCGAGC TCGACGGCGT GGAGCCGGGG GTCTCGGGGA TCTGCCTGCG CGGCCTCGCG
GGCTGGGGCG AGGTGCTGAG CCCCTTCAAC AGCGTCCTGA ACACCTTCGG CGGGCGCTGG
TACGACGAGG ACTGGAACGC CGAGATCGAC TCCCCCGAGT TCCGGCGCGC GGCCGAGTTC
TACGTGGGCC TGGCGCGTGA GCACGGCCAG CCGGGCGCGG CCAACAGCGG GTTCGGGGAC
TGCCTGAACC GCTACTCCCA GGGCCGGGCG GCCATGTTCT ACGACTCCAC CTCCATGGTC
AGCACCATCG AGGACCCGGA CTCGGCGACC GTGGCCGGGC TCAACGGCTA CGCCGCGGCG
CCGGTGGCCG AGACCGACTA CGGCGGCTGG CTCTACACCT GGGCGCTGGG CGTCCCCTCC
ACCTCCGAGC ACAAGGAGGA GGCGTGGGCG TTCCTGGAGT GGATGACCGA CAAGGACTAC
GTGCGCACCG TCGCCGAGGA GTACGGCTGG CAGCGGGTGC CGCCCGGCAA CCGGCTCTCC
ACGTTCGAGG TCCCCGAGTA CCGGGAGGCC GCCCGGGCCT ACGCCGAGCC CATGCTCCAG
GGCATCCAGG AGGCGGACCC CGAGGACCCG GGCACGCGCC CGGTCCCCTA CGAGGGCATC
GGCTTCCTCG CCATCCCCGA GTTCCAGGAC CTGGGCACCC GGGTCAGCCA GCAGCTGAGC
GCGGCCATAG CCGGGCAGAT CACCGTCGAG CAGGCGCTCG AACAGAGCCA GGAGTACGCC
GAGGTCGTCG GTGGGACCTA CAGGGAGGAC GACCGATGA
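
The gene sequence is displayed in blocks of ten bases, so the spacing has to be removed before the sequence can be analysed. As a minimal sketch (the helper names are hypothetical and not part of the source database), the following joins the blocks and computes a GC percentage of the kind reported in the GC content field:

```python
def clean_sequence(text: str) -> str:
    """Remove spaces and line breaks from a block-formatted sequence."""
    return "".join(text.split()).upper()


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a nucleotide sequence."""
    if not seq:
        raise ValueError("empty sequence")
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


# First two blocks of the gene sequence, as formatted in the record.
example = clean_sequence("GTGCACACCA CCTACGCCGT ")
print(len(example), gc_percent(example))
```

Passing the full gene sequence through the same two functions gives a value that can be compared with the GC content listed under Gene Information.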
 
Protein sequence
MHTTYAVRAT AVAACGLLLT GCAGTGALPS EDGAVRLTVA IVSNPQMQDA ISLESRFRAE 
HPGIALDFVS LPENEARAKI TTSVATGGGE FDVVMISNYE TRQWAEYGWL ENLQPSIDAA
GGYDHEDFIP SIREDLSHEG DMYSVPFYGE SSFLAYRKDL FEQAGVEMPP DPTWEEVADL
AAELDGVEPG VSGICLRGLA GWGEVLSPFN SVLNTFGGRW YDEDWNAEID SPEFRRAAEF
YVGLAREHGQ PGAANSGFGD CLNRYSQGRA AMFYDSTSMV STIEDPDSAT VAGLNGYAAA
PVAETDYGGW LYTWALGVPS TSEHKEEAWA FLEWMTDKDY VRTVAEEYGW QRVPPGNRLS
TFEVPEYREA ARAYAEPMLQ GIQEADPEDP GTRPVPYEGI GFLAIPEFQD LGTRVSQQLS
AAIAGQITVE QALEQSQEYA EVVGGTYRED DR