Gene Sare_4778 details

Gene Information

Locus tag: Sare_4778
Symbol:
ID: 5704445
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Salinispora arenicola CNS-205
Kingdom: Bacteria
Replicon accession: NC_009953
Strand:
Start bp: 5408324
End bp: 5410126
Gene Length: 1803 bp
Protein Length: 600 aa
Translation table: 11
GC content: 66%
IMG OID: 641274176
Product: extracellular solute-binding protein
Protein accession: YP_001539522
Protein GI: 159040269
COG category: [E] Amino acid transport and metabolism
COG ID: [COG0747] ABC-type dipeptide transport system, periplasmic component
TIGRFAM ID:
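
The start, end, and length fields in this record agree with each other, which can be checked with a little arithmetic. The following is a minimal sketch in Python, assuming the coordinates are 1-based and inclusive and that the CDS ends in a single stop codon that is not counted in the protein length. Neither assumption is stated in the record, and the function names are hypothetical.

```python
def gene_length(start_bp: int, end_bp: int) -> int:
    """Length in bp of a feature, assuming 1-based inclusive coordinates."""
    return end_bp - start_bp + 1


def protein_length(gene_len_bp: int) -> int:
    """Residues encoded by an unspliced CDS, assuming one uncounted stop codon."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_len_bp // 3 - 1


length_bp = gene_length(5408324, 5410126)
print(length_bp, protein_length(length_bp))  # 1803 600
```

Under those assumptions the values match the Gene Length (1803 bp) and Protein Length (600 aa) fields.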


Plasmid Coverage information

Num covering plasmid clones: 13
Plasmid unclonability p-value: 0.584214
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones:
Fosmid unclonability p-value: 0.0000356618
Fosmid Hitchhiker: No
Fosmid clonability: decreased coverage
 

Sequence

Gene sequence
ATGACGAGCA GATTTCCCCG CCGCGCCCTG CGTGGCGCGG TGGCGGCAGC CACCACGATC 
GCTCTCGGGG CGGGCCTGGT CGGCTGCGGT GACAGCAGTG GATCGTCCCA AGACGGCGGA
AGTCAGGGCA AGGACACCGT GACGGTCGCC TTGCGTACCC CCAACTGGAT CCTGCCGATC
TCGGCACCCG GATTCACGCA GGGTGAAAAC GCCATCTTCA ACCAGTCGCT CTACCGGCCC
CTGTACCAGT ACCGGCTCGA CGGCACGGCG CAGTACAACA TCGACCCACA ACGCTCGATG
GCCGAGCCAC CGCAGGTGAG CGAGGACGGT CGGACCCTGA CGATCACGCT GAAGGACAAC
ACCTGGTCCG ACGGCAAGCC CATCACCACC AAGGACATCC AGTTCTGGTA CGACCTGGTC
ACGGCGAACA AGGACAAGTG GGCGTCCTAC CGGGCCGGCG GCTTCCCCGA CAACGTCGCG
GAGTGGTCGG TCCAGGACGA GAAGACCTTC TCGATCACCA CCACGAAGGT CTACAACACC
GCGTGGTTCG TCGACAACCA ACTCAACCGC ATCACGCCCC TGCCCCAGCA CGCCTGGGAC
AAGGACTCCG CGACCGCCGA CGTGAGCGAC CTCGCCAGCA GCCCGGAGGG CGCCGAGAAG
GTCTTCGACT TCCTCACCGC CGCCGCGAAG GACCCCAAGA CGTACGACTC CAACGAGTTG
TGGAAGGTCA CCAGCGGCGC GTGGAAGCTG GAGAAGTACG TGCCCAACGG TGAGGTCACC
CTCGCCGCCC AACCGAACTA CTCCGGTACC GACAAACCGA AGCTGGCCAC GGTCGTGTTG
CGCCCGTTCA CCAGCGACGA CGCCGAGTTC AACGTGCTCC GCGCCGGTGA CATCGACTAC
GGGTACGTGC CAGCGGCCAA CCTGTCCCAG GAGAGCTACC TCGAGTCCAA GGGATACACG
GTCTCGCCGT GGTACGGCTG GTCGATCACC TACCTGCAGC TGAACTACAA CAACCCGAAA
ACCGGCGTGC TGTTCAAGCA GCCCTACCTT CGGCAGTCGC TGCAGATGCT CATCGACCAG
CCGACGATCA GCAAGGTCAT CTGGTCGGAC ACCGCCGCGC CGACCTGCGG CCCGGTACCG
GCCAAGCCCG GCACCAACAC CGACGCCGCC GGATGCGCCT ACTCCTTCGA CCCGGCGAAG
GCCAAGGAAC TGCTGGAGAG CCACGGCTGG AAGGTGACCC CGGACGGGCA GACCACCTGC
CAGTCACCGG GCACCGGCCC GAACCAGTGC GGTGACGGAA TCGCCGCCGG CACGGCGCTG
GAGTTCACCG TCACCAGCCA GACCGGGTTC GCCGCCACGA CCAAGATGTT CGCCGAGATC
AAGTCACAGA TGGCCAAGCT CGGCATCCAG CTGACGATCA AGGAGGTGCC GGACTCGGTC
GCGGTCACCC CGGCGTGCGA GCCGACCGAG GGGACCTGCG ACTGGGACAT GTCCTTCTTC
GGCTCGCAGG GCAGCTGGTA CTACCCGGCC TTCGCCAGCG GCGAGCGGCT CTTCGCCACC
GACGCCCCGG TCAACCTGGG CAGCTACAGC AATCCGGAGG CCGACAAGCT CATCGAGGCC
ACCCAGTTCG CTGGCGACGA GAGCGCGCTC ACGGCGTACA ACGACTTCCT GGCCAAGGAC
CTGCCTGTGC TGTGGATGCC GAACCCGGTG TACCAGGTCT CGGCGTACCG CTCCGGCCTG
CAGGGAGTCG AGCCGCAGGA TCCGATGAAT CTCATGTACT TCCAGGACTG GTCCTGGAAG
TAA
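
The GC content field can in principle be recomputed from the gene sequence. The following is a minimal sketch, assuming the sequence is pasted as plain text with the 10-base spacing used here. The helper name `gc_percent` is hypothetical, and the example runs only on the first 20 bases rather than the whole gene.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G and C bases, ignoring the whitespace used for layout."""
    bases = "".join(seq.split()).upper()
    if not bases:
        raise ValueError("empty sequence")
    gc_count = sum(1 for base in bases if base in "GC")
    return 100.0 * gc_count / len(bases)


# First 20 bases of the gene sequence: 11 of them are G or C.
print(gc_percent("ATGACGAGCA GATTTCCCCG"))  # 55.0
```

Passing the full 1803 bp sequence to the same function would give the figure to compare against the reported GC content.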
 
Protein sequence
MTSRFPRRAL RGAVAAATTI ALGAGLVGCG DSSGSSQDGG SQGKDTVTVA LRTPNWILPI 
SAPGFTQGEN AIFNQSLYRP LYQYRLDGTA QYNIDPQRSM AEPPQVSEDG RTLTITLKDN
TWSDGKPITT KDIQFWYDLV TANKDKWASY RAGGFPDNVA EWSVQDEKTF SITTTKVYNT
AWFVDNQLNR ITPLPQHAWD KDSATADVSD LASSPEGAEK VFDFLTAAAK DPKTYDSNEL
WKVTSGAWKL EKYVPNGEVT LAAQPNYSGT DKPKLATVVL RPFTSDDAEF NVLRAGDIDY
GYVPAANLSQ ESYLESKGYT VSPWYGWSIT YLQLNYNNPK TGVLFKQPYL RQSLQMLIDQ
PTISKVIWSD TAAPTCGPVP AKPGTNTDAA GCAYSFDPAK AKELLESHGW KVTPDGQTTC
QSPGTGPNQC GDGIAAGTAL EFTVTSQTGF AATTKMFAEI KSQMAKLGIQ LTIKEVPDSV
AVTPACEPTE GTCDWDMSFF GSQGSWYYPA FASGERLFAT DAPVNLGSYS NPEADKLIEA
TQFAGDESAL TAYNDFLAKD LPVLWMPNPV YQVSAYRSGL QGVEPQDPMN LMYFQDWSWK
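
The protein sequence corresponds to the gene sequence translated codon by codon. The following is a minimal sketch of that translation, assuming the codon assignments of NCBI translation table 11, which match the standard code for elongation codons. It does not handle alternative start codons, which is harmless here because the gene begins with ATG. The names `CODON_TABLE` and `translate` are hypothetical.

```python
from itertools import product

# Codons in NCBI order (T, C, A, G at each position) paired with their amino acids.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq: str) -> str:
    """Translate a coding sequence in frame 1, stopping at the first stop codon."""
    bases = "".join(seq.split()).upper()
    protein = []
    for i in range(0, len(bases) - 2, 3):
        aa = CODON_TABLE[bases[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 30 bases of the gene sequence give the first 10 residues of the protein.
print(translate("ATGACGAGCA GATTTCCCCG CCGCGCCCTG"))  # MTSRFPRRAL
```

The final codons of the gene, TGG AAG TAA, translate to WK followed by a stop, which matches the end of the protein sequence.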