Gene Ndas_4194 details


Gene Information

Locus tag: Ndas_4194
Symbol:
ID: 9248068
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Nocardiopsis dassonvillei subsp. dassonvillei DSM 43111
Kingdom: Bacteria
Replicon accession: NC_014210
Strand:
Start bp: 5008669
End bp: 5009982
Gene Length: 1314 bp
Protein Length: 437 aa
Translation table: 11
GC content: 73%
IMG OID:
Product: 1, 4-beta cellobiohydrolase
Protein accession: YP_003682093
Protein GI: 297563119
COG category:
COG ID:
TIGRFAM ID:
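
The coordinate and length fields above are related by simple arithmetic. The sketch below is a minimal consistency check, not part of the record itself. It assumes that the start and end positions are 1-based and inclusive, and that the CDS is unspliced and ends in a stop codon. The function names are hypothetical.

```python
def gene_length_bp(start_bp: int, end_bp: int) -> int:
    """Length of a feature, assuming 1-based inclusive coordinates."""
    return end_bp - start_bp + 1


def protein_length_aa(gene_length: int) -> int:
    """Residue count, assuming an unspliced CDS whose last codon is a stop."""
    if gene_length % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_length // 3 - 1  # subtract the stop codon


length = gene_length_bp(5008669, 5009982)
print(length, protein_length_aa(length))  # 1314 437
```

Under those assumptions the result agrees with the Gene Length and Protein Length fields listed above.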


Plasmid Coverage information

Num covering plasmid clones: 12
Plasmid unclonability p-value: 0.28585
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 22
Fosmid unclonability p-value:
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGTCCCGCA CACGCATAGC CGTCGGAGCA GCAGTGAGCT CCGTCTCGGC ACTGGCCCTG 
GGAACGGCGC TGCTCGCCAC GGCGCCCGCC TCGGCCGCCG ACTCCGAGTT CTACGTCAAC
CCCAACACGT CGGCGGCCGT CTGGGTCGAG GAGAACCCGA ACGACCCCCG GGCCGACGTC
ATCCGCGACC GCATCGCCTC GGTCGCCCAG GCCACCTGGT TCACCCAGTA CAACCCCGCC
GAGGTCCGCG ACGACGTGGA CGCGGTGGTC AGCGCCGCCG ACGCCCAGGG CCAGACCCCC
ATCCTGGTGG TCTACAACAT CCCCGGCCGC GACTGCGGCA ACCACAGCGG CGGCGGGGCG
CCCAGCCACG ACGCCTACCG CGCCTGGGTC GACGAGGTCG CCGCGGGGCT GGAGGGCCGG
TCCGCCACCA TCGTCCTGGA GCCCGACGCC CTGCCGCTGG TGAGCGGCTG CAGCGACCCG
TCCGAGCTCC TGGACTCCAT GGCCTACGCG GGCAAGGCGC TCATGGAGGG CTCCTCCGAG
GCCAGGGTCT ACTTCGACAT CGGCAACTCG GCCTGGCTGG ACCCGCAGGA GGCCGCCGGC
CTGCTCAACG GCGCGGACGT CGCGAACAGC GCGCACGGCG TCGCCACCAA CACCTCCAAC
TACAACTGGA CCCACGACGA GGTCGCCTTC GCGGAGGCCG TCATCGCCGC GACGGGCGTG
CCCGGCCTCG GCGCCGTGAT CGACACCAGC CGCAACGGCA ACGGCCCCGC CCCCCAGAAC
GAGTGGTGCG ACCCGCCGGG GCGGATGATC GGCCGCCCCA GCACCACCGA CACCGGGAAC
CCGCTGATCG ACGCCTTCAT CTGGACCAAG CTGCCCGGTG AGGCCGACGG CTGCATCGCG
CCCGCGGGGC AGTTCGTGCC CCAGGCCGCC TACGACATGG CGGTGAACGC CCCCGAGTAC
CCCACCGACC CCGGCGAGCC GACCGACCCC GAGGAGCCCA CCGACCCGCC CGAGGGCGAG
GGCTGCACGG CCGACTACAG GGTCGTCAGC GAGTGGGGCA ACGGCTTCCA GGCGGCGGTG
ACGGTCACCG CCGAGGACTC CCTCAGCGGC TGGACCGTGA CGTGGACCTA CGCCGACGGG
CAGCGGTTCA GCCAGGGCTG GAACGCCGAG TTCTCCAGCA GCGGGTCGCG GGTCACCGCC
TCCGACCTCG GCTGGAACGG CACGCTCAGC GCCGGCGGCA GCACCGAGTT CGGGTTCACC
GGGACCCACG GCGGTAGCAA CGGCGTGCCC GAGGTGACGT GCTCCGCGGC CTGA
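
A GC content figure like the one in the Gene Information section can be recomputed from a sequence printed in 10-base blocks. The sketch below is a minimal illustration. It assumes the sequence is pasted in as a plain string containing only A, C, G and T plus whitespace. The function name is hypothetical.

```python
def gc_content(sequence: str) -> float:
    """Fraction of G and C bases, ignoring spaces and line breaks."""
    bases = "".join(sequence.split()).upper()
    if not bases:
        raise ValueError("empty sequence")
    gc_count = sum(1 for base in bases if base in "GC")
    return gc_count / len(bases)


# Short demonstration on a toy fragment (6 of 8 bases are G or C).
print(gc_content("ATGC GGCC"))  # 0.75
```

Passing the full gene sequence above as a single string returns a fraction that can be compared with the reported GC content.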
 
Protein sequence
MSRTRIAVGA AVSSVSALAL GTALLATAPA SAADSEFYVN PNTSAAVWVE ENPNDPRADV 
IRDRIASVAQ ATWFTQYNPA EVRDDVDAVV SAADAQGQTP ILVVYNIPGR DCGNHSGGGA
PSHDAYRAWV DEVAAGLEGR SATIVLEPDA LPLVSGCSDP SELLDSMAYA GKALMEGSSE
ARVYFDIGNS AWLDPQEAAG LLNGADVANS AHGVATNTSN YNWTHDEVAF AEAVIAATGV
PGLGAVIDTS RNGNGPAPQN EWCDPPGRMI GRPSTTDTGN PLIDAFIWTK LPGEADGCIA
PAGQFVPQAA YDMAVNAPEY PTDPGEPTDP EEPTDPPEGE GCTADYRVVS EWGNGFQAAV
TVTAEDSLSG WTVTWTYADG QRFSQGWNAE FSSSGSRVTA SDLGWNGTLS AGGSTEFGFT
GTHGGSNGVP EVTCSAA