Gene PICST_37138 details


Gene Information

Locus tag: PICST_37138
Symbol: ARS1
ID: 4840984
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Scheffersomyces stipitis CBS 6054
Kingdom: Eukaryota
Replicon accession: NC_009048
Strand:
Start bp: 201208
End bp: 202905
Gene Length: 1698 bp
Protein Length: 565 aa
Translation table: 12
GC content: 43%
IMG OID: 640392299
Product: Arylsulfatase (AS) (Aryl-sulfate sulphohydrolase)
Protein accession: XP_001386634
Protein GI: 126140224
COG category: [P] Inorganic ion transport and metabolism
COG ID: [COG3119] Arylsulfatase A and related enzymes
TIGRFAM ID:
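
The coordinate and length fields above can be cross-checked with simple arithmetic. The sketch below is illustrative only: it assumes the Start bp and End bp values are 1-based and inclusive, that the unspliced CDS includes its terminal stop codon, and the function names are hypothetical, not part of any database API.

```python
# Values copied from the Gene Information fields of this record.
START_BP = 201208
END_BP = 202905

def gene_length(start_bp: int, end_bp: int) -> int:
    """Length of a feature given 1-based inclusive coordinates (assumption)."""
    return end_bp - start_bp + 1

def expected_protein_length(cds_length_bp: int) -> int:
    """Residue count for an unspliced CDS that ends in one stop codon (assumption)."""
    if cds_length_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return cds_length_bp // 3 - 1  # drop the stop codon

length_bp = gene_length(START_BP, END_BP)
print(length_bp)                           # 1698, matching "Gene Length"
print(expected_protein_length(length_bp))  # 565, matching "Protein Length"
```

Under these assumptions the 1698 bp span gives 566 codons, one of which is the stop codon, which agrees with the listed protein length of 565 aa.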


Plasmid Coverage information

Num covering plasmid clones: 10
Plasmid unclonability p-value: 0.218235
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 16
Fosmid unclonability p-value:
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGACTGTCA CAGCCAAAAA GCCAAACTTC TTGGTTATTG TAGCAGACGA CTTGGGATTC 
ACCGATCTCA GTGCCTTTGG TGGAGAGATC CATACTCCCA ACTTGCAGAA ATTAGCAGAC
AGAGGAGTCA GGTTAACTGA TTTCCACACT GCTTCAGCAT GTTCACCCAC CAGGTCCATG
TTGTTTTCTG GTACCGACAA TCACATTGCC GGTTTGGGTC AAATGGCCGA ATTCGCATCC
AGACACCCCG AGAAGTTTGC TGGAAAGAAA GGGTATGAAG GCTACTTGAA CGATAGAGTA
GTAGCTTTGC CTGAGATATT GAAGGAGTAT GGTGACTACT TTACCTTTAT TTCTGGGAAG
TGGCATTTAG GTTTATTGCC AGAATATTGG CCAAGTAAGA GAGGGTTTGA AAAGTCATTC
ACATTGTTGC CAGGGGCCGG AAACCATTAC AAATATATCA CCAAGGACGA AAACGGCGAA
TTCGTAAAGT TCTTGCCTCC TTTGTTTGCT GAAGATGATA GAAGTGTTGA TGCTGAGAAG
GAGCTACCAG AAGATTTCTA CTCGACTGAT TATTTCACAG ATAAGGGAAT AGAGTTTATT
ACATCTGAAT CCAGAAATGG AAGACCCTTT TTCGGCTGCT TGACTTACAC TGCTCCCCAT
TGGCCATACC AGGCTCCTCA ATCTAGAATC GATAAGTACA AGGGTGTTTA CGACGGTGGC
CCGGAAGAGT TGCGTAGAAG ACGTTTGGCC AGTGCCGCTA AGATAGGAAT CATTCCTGAA
GGTGTAGTTC CCCATCCTAT AAAGACAATT CGTAAGAGAT GGTCAGAATT AACGGAAAAT
GAAAGAAAGA TCGAAGCCAG AATTATGGAA ACTTATGCTG CTATGGTTGA AATCTTAGAT
GAGAATATTG GGAGAGTAGT TGACCACTTA GAAAAGACAG GAGAGTTGGA TAATACTTTC
ATACTCTTCA TGTCTGATAA TGGTGCTGAA GGGATGTTGA TGGAAGCATT ACCATTGACT
GCTTTGAGAA TCAACACATT CATTGAAAAG TACTATAACA ATGCTCTTGA CAACATTGGT
AAGAAGGACT CGTTTGTTTT CTACGGAGAT CAGTGGGCTC AGGCAGCTAC TTCTCCTCAT
TCCATGTACA AGATGTGGTC AACTGAAGGA GCAATCGTTT GTCCACTTAT TATTCATTAC
CCTCCATTAT TAAAATCAAA ACAAAGCAAG ATCTTGGACG CTTTTACTAC GGTGATGGAT
ATATTGCCAA CAGTGTTGGA GTTGGCCGAT GTTAAACACC CAGGAAATTT CTACAAGGGA
AGAGAAGTTG CTGTACCAAG AGGATCATCA TGGGTAAGCT ACTTAGCCGA CAACAGCGAC
AGGGTACACC AAGAAGACAC TGTCACTGGT TGGGAATTAT TTGGACAGCA GGCTATCAGA
AAGGGTTCCT ACAAGGCTCT CTATATCCCA GCTCCATTTG GACCTGAAAA GTGGCAGTTG
TTCAATATTA AAGAGGATCC AGGTGAAACC AAGGATCTTG CGGAAACTGA AACTAGAGTC
CTTTCCGAGT TGATAAATCT CTGGGCTGTT TATGCTGCTG AAACTGGATT AATCGAGTTG
GGAAGCGACC TCTTCGAAAA GGAAAGAATC GAAGGAGAAG AGAATGAAGT GATTTACAGA
ACGATCTTAG ATAGTTAG
 
Protein sequence
MTVTAKKPNF LVIVADDLGF TDLSAFGGEI HTPNLQKLAD RGVRLTDFHT ASACSPTRSM 
LFSGTDNHIA GLGQMAEFAS RHPEKFAGKK GYEGYLNDRV VALPEILKEY GDYFTFISGK
WHLGLLPEYW PSKRGFEKSF TLLPGAGNHY KYITKDENGE FVKFLPPLFA EDDRSVDAEK
ELPEDFYSTD YFTDKGIEFI TSESRNGRPF FGCLTYTAPH WPYQAPQSRI DKYKGVYDGG
PEELRRRRLA SAAKIGIIPE GVVPHPIKTI RKRWSELTEN ERKIEARIME TYAAMVEILD
ENIGRVVDHL EKTGELDNTF ILFMSDNGAE GMLMEALPLT ALRINTFIEK YYNNALDNIG
KKDSFVFYGD QWAQAATSPH SMYKMWSTEG AIVCPLIIHY PPLLKSKQSK ILDAFTTVMD
ILPTVLELAD VKHPGNFYKG REVAVPRGSS WVSYLADNSD RVHQEDTVTG WELFGQQAIR
KGSYKALYIP APFGPEKWQL FNIKEDPGET KDLAETETRV LSELINLWAV YAAETGLIEL
GSDLFEKERI EGEENEVIYR TILDS
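
The record lists translation table 12 (the alternative yeast nuclear code, which differs from the standard code in reading CTG as serine rather than leucine) and a GC content of 43%. The following is a minimal sketch of how the gene sequence could be translated and its GC fraction computed in plain Python. The helper names (`clean`, `gc_percent`, `translate`) are hypothetical, and the codon table is built from the compact amino-acid string for table 12 with bases ordered T, C, A, G.

```python
BASES = "TCAG"
# Amino acids for all 64 codons in TCAG order, translation table 12.
# Same as the standard code except CTG (index 19) encodes S instead of L.
AA_TABLE_12 = "FFLLSSSSYY**CC*WLLLSPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE_12 = {
    b1 + b2 + b3: AA_TABLE_12[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

def clean(seq: str) -> str:
    """Remove the spaces and line breaks used to group bases; uppercase."""
    return "".join(seq.split()).upper()

def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in the sequence."""
    s = clean(seq)
    if not s:
        raise ValueError("empty sequence")
    return 100.0 * sum(base in "GC" for base in s) / len(s)

def translate(seq: str) -> str:
    """Translate in frame 1 with table 12, stopping at the first stop codon."""
    s = clean(seq)
    protein = []
    for pos in range(0, len(s) - len(s) % 3, 3):
        aa = CODON_TABLE_12.get(s[pos:pos + 3], "X")  # X for non-ACGT codons
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

# First line of the gene sequence shown above.
first_line = "ATGACTGTCA CAGCCAAAAA GCCAAACTTC TTGGTTATTG TAGCAGACGA CTTGGGATTC"
print(translate(first_line))  # MTVTAKKPNFLVIVADDLGF
```

The 20 residues printed for the first sequence line match the start of the protein sequence listed above. Passing the full gene sequence to `translate` and `gc_percent` would allow the same comparison against the complete protein sequence and the reported GC content.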