Gene Information |
Locus tag | PICST_29579 |
Symbol | |
ID | 4837158 |
Type | CDS |
Is gene spliced | No |
Is pseudo gene | No |
Organism name | Scheffersomyces stipitis CBS 6054 |
Kingdom | Eukaryota |
Replicon accession | NC_009042 |
Strand | - |
Start bp | 610014 |
End bp | 613070 |
Gene Length | 3057 bp |
Protein Length | 1018 aa |
Translation table | 12 |
GC content | 43% |
IMG OID | 640388473 |
Product | predicted protein |
Protein accession | XP_001382881 |
Protein GI | 150864164 |
COG category | [O] Posttranslational modification, protein turnover, chaperones |
COG ID | [COG5160] Protease, Ulp1 family |
TIGRFAM ID | |
Plasmid Coverage information |
Num covering plasmid clones | 10 |
Plasmid unclonability p-value | 1 |
Plasmid hitchhiking | No |
Plasmid clonability | normal |
Fosmid Coverage information |
Num covering fosmid clones | 12 |
Fosmid unclonability p-value | 0.198917 |
Fosmid Hitchhiker | No |
Fosmid clonability | normal |
Sequence |
Gene sequence | ATGAGTTCCG AGCGTTCGAG TCTTCACTCG AAGGACTTTC GGTCGCTCAC CAGTCGCCCC AAGACGGTCC AAATCAAGTC CGGAACTTCT GGATCCGGCA GCACTTCACT TATAACCTCG AGTCATCTTG GGGGAGGCAT CAAGCCCATG AACAAACTCA CAGACAAGTC CCGCATCTCT CCAAGAAGTC ATCTTGTGAC AAGACGAAGC CCAACATCGC CTAAATCCCC ACTAGCCAAA GAGTTGCTGA GCAAATTGCA CCGAGCATCG GCCATCAGAG GTAAGCTCGA TCAACTCTCA CATTCAAAAT CAAATTCTCA TTATACTTCA AATTCTCACA TAAGTAGCTC TCATTCAATT TTTGCACAAA CAAAAACAAA GCCAGTCCGT ATTCCCAGTC CAAGAACTGT AGGAATACCA AGAGATGACG CAATTCAGGT TTCTGCTGTG AATTTGCCGT TGGAAAAGTC TGAATCTGAA TCTACAAATG TAATTAGTAG TGATAAGAAT ATTAGTGCTA AGAGTTCGCC TGATTCTAAC AAACAGTTAC CGCAACTCAG AGATATATTC GGCAATGTGG TTAATCGTAG CTCCAGTCTG ATAGACGAGA AGCCGCGAGT TACTATGAGT TTGTCTTCAC TAGAAATAGA AGGTCAGGAT GAAGCTCTGC ACTTAGTAAA GTTGCATATC TCTGCCAACA ACGAGTATAT CTTTCTCACA CGTTATGGCA AGATCGTAGA ATTCAGTGTA ATTCATGGCG AAGACCAGCT TCAATGGATC AAGTTTGCTG AATATGCAGT TGTACTCCGT ATCGCTGGCA CTGGCAGCTA TTGGATAGCC AACCACGACT TGGAAGATGA CGAGTTCAGA GACAATTTTC GCCAGCACCA GCGGTGGAAC GTACAACCCG ACGTAACAAT GACGAGGGAT GATCTAGACA TAGTGCGGAC GAATATGGCC AACAATGCCA ATAAAAAGCA GCCTGGTAGA TTGAGCTCGT TGTTCAACAT TAACAAGATT TCACCCAAGA AGTTCTTTGG TGATGGACAA CGAGCTACGA GATCAAAAAC CAGTGATCAT TTTGATTTCA AGCGATTGGA TCAGTCTGAG AATAGCGAAA GTTTGCAAAA AGTTCAAGTT CCCGTTATTC GAGAGACTCC GAAGTCGTTT TCACCAGACT TGCAATACAC CTTTGCCAAC AATAACGTAT TCAAAATAGC TTATTCTGAC TTCAAGACAC TCTACAACAA TGAATGGATC AATGATACCA TCATTGACTT CTTTATTCAA TATGAAATCG ACAGAGCTAT CAAAGAACGC CGTGTGCGAG AGAACGAAGT CTATGCATTC AACTCATTTT TCTTCACTAA GTTGATGTCC AAATCTGCCA CTCAAGATTC GCCTGATTAC TACGGTAATA TCAAGCGATG GCTCAGTAAA GTCGACCTCA TGTCATATCC ATATGTCATT ATCCCTATCA ACGAACATGC CCATTGGTAC TGTTCCATCA TAAGAGGATT ACCTGAGTTG TTAAAAGGAG CGCAAAATCA GAAGGCCACT ATTCAAGTTC CCGATAGTCA GGAAGACGAA CAAGACTCTG TAGACTCTTC TGGCAGTTCA CAGGAGCCAT CTGAAGTAGG TGGTGTAAGC CACTCGGCAA ACTCGCACAT AGACTTTGAT GAAGAACCTG TAGACGCCAA AGCCGTCCCA ACTTCCCGGG CAGAGATCTT TGTTTTCGAC TCGTTGGGTC AACAGCACAA CCAAATCAAA GTTCCTTTGA AACGGTTCAT CATAGACTAT 
TGCAAGGAAA AATATAATGT AGACATAGTC AAGGCTCAGA TCAGGGTCGT AACAGCTAAA GTGCCCAAAC AGAACAACTT CAATGATTGT GGAATACATG TGATCTACAA CGTCCGAAAA TGGCTAGGAG ATATTTCTCT TTGTGAGAAA TTATGGAGAG GTTCGTATCT GACCCGTACG GCACGGTCGT TGTTTTTGGC CGAAGAAAGA AACGGTATGA GAAAGCAGTT GATTACTAAA TTGCTTGAGT TACACAAGGA TAGAATTGTA GGCGAGAGTG ACTTGCTGGA CGAAGTAAAT CAAGATGCGC TCTCAGACGA TGATTTGGAG GTGATAGAGT TCCATGCTAA CGACAGAGAT GCCAGAGCAG CCAGAGCAGC AAACAGAGCA GACAATGCCG ATAAAGGAGG TAATGTCAAC AAAGGAGGAA ATGTCGACAA AGGAGGAAAT GTAGACAAAG GAGGTAATGT CGACAAAGGA AGTAATGTAG AAGTTAGAAG TGTAAGAGAT AAAACTGAGA ATAGTACTGA TGGCAAAGTT AGTGTTAATT CTTCAATCGG CGCACTTTCT TCAATCGATG TCAATTCTTC TAATGAGTAT CCCAACACTT TAGATCCTCG GTCGTTCCTG AAACAGTCTC CTTCTAATGG GGAGCTGATG ATCAATGATA GTCTCCGGAA ACATTTCTCA ACCGATGTGT TGCCTACATT TGTCATTCGT TTTCTCAACG AGAACTTCAA CAAGAAGAAC AGAGAACTCG ATAGCACCAT ACTTGCTATA ATTTCTCGCG AAATTCGTAG CTTAAAGCAC TTGGAAGAAC AAGACAAAAG AGTAGCATCT TCATTCCAGA AGGTATTGAC AGCCATCGAT GAATACCGTG TCCCTGAAAA CGAGGCTATT CGGCCAAAGA ATAAGGAGTT TAAAATCCAG GATAGCTACG TAGAAGACAG CATTCAAAGA GGAGCTATTC TGTCACCTTC GCTCTACGAC TCAGACGATA TCAACGAGAG TGTTTCTCAA TTGGCGATTT CAACCAAATC TCCCCAAACG CCTAGAAGAG TGCAAACTAC TCCATTGAAG GTAACAGACT CAATATCAGA AGTGATTGAG GCAGTCCCGC AAGTCATTGA GGACGTTAGT TCGGCCGCTT CTTCGGATCT AGAGATTGTT TCAGATGGGG AGTTATCGCC AGCGATTCGT TCGAAGCTGA AGTTGCCCAC AAAGGAAATA GTAAGCAAAC GTCGCAAATT GAAGTAG
Protein sequence | MSSERSSLHS KDFRSLTSRP KTVQIKSGTS GSGSTSLITS SHLGGGIKPM NKLTDKSRIS PRSHLVTRRS PTSPKSPLAK ELSSKLHRAS AIRGKLDQLS HSKSNSHYTS NSHISSSHSI FAQTKTKPVR IPSPRTVGIP RDDAIQVSAV NLPLEKSESE STNVISSDKN ISAKSSPDSN KQLPQLRDIF GNVVNRSSSS IDEKPRVTMS LSSLEIEGQD EASHLVKLHI SANNEYIFLT RYGKIVEFSV IHGEDQLQWI KFAEYAVVLR IAGTGSYWIA NHDLEDDEFR DNFRQHQRWN VQPDVTMTRD DLDIVRTNMA NNANKKQPGR LSSLFNINKI SPKKFFGDGQ RATRSKTSDH FDFKRLDQSE NSESLQKVQV PVIRETPKSF SPDLQYTFAN NNVFKIAYSD FKTLYNNEWI NDTIIDFFIQ YEIDRAIKER RVRENEVYAF NSFFFTKLMS KSATQDSPDY YGNIKRWLSK VDLMSYPYVI IPINEHAHWY CSIIRGLPEL LKGAQNQKAT IQVPDSQEDE QDSVDSSGSS QEPSEVGGVS HSANSHIDFD EEPVDAKAVP TSRAEIFVFD SLGQQHNQIK VPLKRFIIDY CKEKYNVDIV KAQIRVVTAK VPKQNNFNDC GIHVIYNVRK WLGDISLCEK LWRGSYSTRT ARSLFLAEER NGMRKQLITK LLELHKDRIV GESDLSDEVN QDALSDDDLE VIEFHANDRD ARAARAANRA DNADKGGNVN KGGNVDKGGN VDKGGNVDKG SNVEVRSVRD KTENSTDGKV SVNSSIGALS SIDVNSSNEY PNTLDPRSFS KQSPSNGESM INDSLRKHFS TDVLPTFVIR FLNENFNKKN RELDSTILAI ISREIRSLKH LEEQDKRVAS SFQKVLTAID EYRVPENEAI RPKNKEFKIQ DSYVEDSIQR GAISSPSLYD SDDINESVSQ LAISTKSPQT PRRVQTTPLK VTDSISEVIE AVPQVIEDVS SAASSDLEIV SDGELSPAIR SKSKLPTKEI VSKRRKLK
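The coordinates, lengths, and translation table listed in this record can be cross-checked with a short script. The sketch below is illustrative only: the names `CODON_TABLE` and `translate` are hypothetical helpers, not part of any database tool, and the amino-acid string is the NCBI table 12 (alternative yeast nuclear code) assignment, in which CTG encodes serine rather than leucine. The DNA fragments are copied from the gene sequence above.

```python
from itertools import product

# Codon order used by the NCBI genetic code tables: first base varies
# slowest, third base fastest, each running over T, C, A, G.
BASES = "TCAG"

# Amino acids for NCBI translation table 12. It differs from the standard
# code at a single codon: CTG is read as S (Ser) instead of L (Leu).
TABLE_12_AAS = "FFLLSSSSYY**CC*WLLLSPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), TABLE_12_AAS)
}


def translate(dna: str) -> str:
    """Translate a DNA string (spaces allowed) codon by codon with table 12."""
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Coordinates are inclusive, so the gene length is end - start + 1.
start_bp, end_bp = 610014, 613070
gene_length = end_bp - start_bp + 1

# One codon per residue, plus one stop codon that is not counted.
protein_length = gene_length // 3 - 1

# First 30 bases of the gene sequence, and a fragment containing CTG.
first_codons = translate("ATGAGTTCCG AGCGTTCGAG TCTTCACTCG")
ctg_fragment = translate("GAGTTGCTGAGCAAA")

print(gene_length, protein_length, first_codons, ctg_fragment)
```

Under these assumptions the computed lengths agree with the Gene Length and Protein Length fields, and the translated fragments match the start of the listed protein sequence and its `ELSSK` stretch, where a standard-code translation would give `ELLSK`.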