Gene Saro_3035 details


Gene Information

Locus tag: Saro_3035
Symbol:
ID: 3916647
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Novosphingobium aromaticivorans DSM 12444
Kingdom: Bacteria
Replicon accession: NC_007794
Strand:
Start bp: 3247022
End bp: 3249055
Gene Length: 2034 bp
Protein Length: 677 aa
Translation table: 11
GC content: 68%
IMG OID: 640445815
Product: sulfotransferase
Protein accession: YP_498304
Protein GI: 87201047
COG category: [R] General function prediction only
COG ID: [COG4783] Putative Zn-dependent protease, contains TPR repeats
TIGRFAM ID:
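
The coordinate and length fields above can be cross-checked with a little arithmetic. The sketch below is only illustrative: it assumes the start and end positions are 1-based and inclusive, and that the gene length includes one terminal stop codon. The variable names are made up for this example and are not part of the record.

```python
# Values copied from the record above.
start_bp = 3247022
end_bp = 3249055
gene_length_bp = 2034
protein_length_aa = 677

# Assumption: coordinates are 1-based and inclusive.
span_bp = end_bp - start_bp + 1

# Assumption: the CDS length includes one terminal stop codon,
# so the number of translated codons is one less than the codon count.
coding_codons = span_bp // 3 - 1

print("span matches gene length:", span_bp == gene_length_bp)
print("span is a whole number of codons:", span_bp % 3 == 0)
print("codons match protein length:", coding_codons == protein_length_aa)
```

Under those assumptions all three checks agree with the listed values (2034 bp and 677 aa).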


Plasmid Coverage information

Num covering plasmid clones: 19
Plasmid unclonability p-value: 0.530133
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: n/a
Fosmid unclonability p-value: n/a
Fosmid Hitchhiker: n/a
Fosmid clonability: n/a
 

Sequence

Gene sequence
ATGGTCCAAC AACTTCGCCC GCCGCCCTCG CCAGCCGCCG CCCTGGCCGA GGCGGAGCAG 
TTGCTCGCCA CCCGCCCCCA CCTCGCGCTG ACCCGCGCCG AAGCGATGCT GCGGCAGGTC
CCCGCCCATC CCCCCGCCCT GTTCCTCGCC GCCCGCGCCC TGCGCCGGAT GGGCAGGCAG
CGCGAGGCGC TGGACCGGCT CGATGCGCTG GCAAAGGCAA ATCCCCGCGT ACCGGCGGTC
CTCTACGAAC TCGGGCAGGT GGCGGCGGAC CTTGGCGACC AGCGGCGCGC AACCGCCGCG
CTCCAGGCGC TTGTCCGCCT TCAGCCCGCC ATTGCCTCGG GCTGGTTCCT GCTCGCCAGG
CAACTGCGCA AGGCCGGGCA AGAGGAGGAC GCATGGCGTG CCGACCTTTC CGGCATCCAC
GCCTCCTCGC GCGATGGCGA ACTGCTGAAG GCGGCGGCGG CGATGAACGA CGGCGAAACC
GATGCCGCCG AAGCCCTGCT GAATGCGCGA CTGGAACGGC AGGCGGACGA TGCCCCCGCC
TTGCGCCTCC TCGGTGAAAT CGCATGGCGG CGCGGCGACA TGACCACGGC GCTCGACCGC
GCGGCGCGGG CGGTCGATCT GGCGCCGGGG TTCGACCTTG CCCGCGACTT CCTGATCCGC
CTCCTGCTCC AGACCAACCG CCTGTCCGAA GCGCTCGACC ATGCCGAGAC GCTGGTGCGC
TCGCCCGTTC CGTCACCGGG GCACAGGCTG ATCCTTGCCT CGGTCCTCGT CCGCCTCGGC
CACCAGGAGC GCGCGGCAGC GATCTATCGA GAGCTTCTGG CCGAACGGCC CGACGAGCCG
CAGGTCTGGC AGAACCTCGG CCACGTGCTG AAGACGCTGG GCCACCAGGA CGAGGCGGTC
GAGGCCTATC GCGCCGCCGT CTCGCGTCAA CCGACCATGG GTGAAGCGTG GTGGAGCCTT
GCCAACCTCA AGACGGTCCG CCTGGGCGCC GAAGACGTCG CGGCAATGGA AGCGGCGCTG
GCCTCGCTCG ACGATGCGGT GGATGAGCGC AAGGACGACG TATTCCACCT CCATTTTTCG
CTGGGCAAGG CATTCGAGGA CACCGGCGAT CATGCCGCCG CCTTTGCCCA TTACGACAAG
GGCAATGCCT TGCGTCGTAC CATGATCCGC CACGATGCCG ATGCCTTTTC GGCCCAGGTG
GACGCTACGG CAGCGACCTT CACTGCCGCG TTCCTTGCGG GCATGGGAGA AGGCGGTTGC
CCCGCGCCCG ACCCGATCTT CGTCGTCGGC CTGCCCCGGT CGGGCTCGAC CCTGGTCGAA
CAGATCCTGT CCAGCCACAG CCAGGTCGAA GGCACGATGG AACTGCCCGA GATGATGATG
ATCGCCGCGC GCCTGCAATC GCGCGTCGAC GAGGGCGAAT TTCCGGATTT CGCCGCGATG
GTCGCCTCGC TGTCGCCCGC CGACCGCGCC AGGCTCGGTG ACGAATACAT CGAGCGCACC
CGCATTCATC GCCAGACCGA CCGGCCCCTG TTCATCGACA AGATGCCCAA CAACTGGCAG
CACGTCGGCC TGATCCGGCT GATCCTGCCG AACGCCAACG TGATCGACGC GCGAAGGCAT
CCGCTGTCCT GCTGCTTTTC CGGGTGGAAA CAGCATTTCG CGCGCGGCCA GACCTTCACC
TACGACCTTG CCGACATCGG TCGCTACTAT CGCGATTACG TCGGCCTGAT GGCAGCGTGG
GACGCCCATT TTCCCGGTGC GGTCCACCGC GTGATCTACG AACGGATGGT GGCGGATACC
GAAAACGAGG TCCGCCGCCT GCTCGACCAT CTCGGCCTGC CGTTCGAACC GGCCTGCCTG
GAATTCTACC GCAACGAGCG CGCAGTGCGC ACGGCCAGTT CCGAACAGGT GCGGAAGCCG
ATCTTCCGCG ACGGACTCGA AGCGTGGAAA CCCTATGAAC CCTGGCTCGC CCCCCTGAAG
GCCGCACTTG GCCCCGTGCT GGACAGCTAT CCCGACGCGC CCTCCGCACG CTGA
 
Protein sequence
MVQQLRPPPS PAAALAEAEQ LLATRPHLAL TRAEAMLRQV PAHPPALFLA ARALRRMGRQ 
REALDRLDAL AKANPRVPAV LYELGQVAAD LGDQRRATAA LQALVRLQPA IASGWFLLAR
QLRKAGQEED AWRADLSGIH ASSRDGELLK AAAAMNDGET DAAEALLNAR LERQADDAPA
LRLLGEIAWR RGDMTTALDR AARAVDLAPG FDLARDFLIR LLLQTNRLSE ALDHAETLVR
SPVPSPGHRL ILASVLVRLG HQERAAAIYR ELLAERPDEP QVWQNLGHVL KTLGHQDEAV
EAYRAAVSRQ PTMGEAWWSL ANLKTVRLGA EDVAAMEAAL ASLDDAVDER KDDVFHLHFS
LGKAFEDTGD HAAAFAHYDK GNALRRTMIR HDADAFSAQV DATAATFTAA FLAGMGEGGC
PAPDPIFVVG LPRSGSTLVE QILSSHSQVE GTMELPEMMM IAARLQSRVD EGEFPDFAAM
VASLSPADRA RLGDEYIERT RIHRQTDRPL FIDKMPNNWQ HVGLIRLILP NANVIDARRH
PLSCCFSGWK QHFARGQTFT YDLADIGRYY RDYVGLMAAW DAHFPGAVHR VIYERMVADT
ENEVRRLLDH LGLPFEPACL EFYRNERAVR TASSEQVRKP IFRDGLEAWK PYEPWLAPLK
AALGPVLDSY PDAPSAR
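
As a rough cross-check of the two sequences above, the sketch below translates the first 60 bases of the gene sequence and compares the result with the first 20 residues of the protein sequence. It assumes the standard codon assignments, which translation table 11 shares with the standard code (the two differ only in the permitted start codons). The helper name `translate` is hypothetical, and only the opening segment of each sequence is included.

```python
from itertools import product

# Standard codon assignments in NCBI order (bases T, C, A, G at each position).
# Assumption: translation table 11 uses these same assignments for elongation.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA string codon by codon, stopping at the first stop codon."""
    dna = "".join(dna.split()).upper()  # drop the spaces between 10-base groups
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First line of the gene sequence and first 20 residues of the protein sequence.
gene_prefix = "ATGGTCCAAC AACTTCGCCC GCCGCCCTCG CCAGCCGCCG CCCTGGCCGA GGCGGAGCAG"
protein_prefix = "MVQQLRPPPS PAAALAEAEQ"

print(translate(gene_prefix) == protein_prefix.replace(" ", ""))
```

The same function can be applied to the full gene sequence; the final codon, TGA, is a stop codon and is not included in the translated output.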