Gene PICST_33061 details


Gene Information

Locus tag: PICST_33061
Symbol: GBO1
ID: 4840419
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Scheffersomyces stipitis CBS 6054
Kingdom: Eukaryota
Replicon accession: NC_009046
Strand:
Start bp: 1335771
End bp: 1337132
Gene Length: 1362 bp
Protein Length: 453 aa
Translation table: 12
GC content: 43%
IMG OID: 640391734
Product: Gamma-butyrobetaine dioxygenase (Gamma-butyrobetaine,2-oxoglutarate dioxygenase) (Gamma-butyrobetaine hydroxylase) (Gamma-BBH)
Protein accession: XP_001385951
Protein GI: 150866374
COG category: [Q] Secondary metabolites biosynthesis, transport and catabolism
COG ID: [COG2175] Probable taurine catabolism dioxygenase
TIGRFAM ID:
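
The start, end, gene length and protein length listed above agree with one another, which a few lines of Python can confirm. This is a minimal sketch, assuming the coordinates are 1-based and inclusive and that the coding sequence ends in a single untranslated stop codon; the record states neither, and the variable names are illustrative only.

```python
# Values copied from the record above.
start_bp = 1335771
end_bp = 1337132

# Assumption: coordinates are 1-based and inclusive, so both endpoints count.
gene_length = end_bp - start_bp + 1

# Assumption: the CDS ends with one stop codon that is not translated.
protein_length = gene_length // 3 - 1

print(gene_length, protein_length)  # 1362 453
```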


Plasmid Coverage information

Num covering plasmid clones: 19
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal

Fosmid Coverage information

Num covering fosmid clones: 14
Fosmid unclonability p-value: 0.563607
Fosmid Hitchhiker: No
Fosmid clonability: normal

Sequence

Gene sequence
ATGATACCAA ATTTGTATAG AGTTAATAAA ACGAAGCTAC CCCGCGTTTT AGCGAGATCC 
GTGAGATTTC AGCTGTCGCT TGCTATTCAT AGATACGATG ATAACTATAC AACTTTGGTA
TTTGATGGAG ATAGGTCAAT TCTGTTCAGT AACATCTTTC TCCGTGATTC ATGCAGGGAT
CCTAAGTCTG TAGACACTTA CTCGAGCCAG AAGCTTTTCA CCACAGCAGA AATCGCAAAG
AATTTGTCCA TTAACTCACC TCCCCGGATT AGAAAATCAC CAGATTCAAG CGAGTCTGTT
TTAGAAATAG AATGGCTCCA AAACGGAAAA CTCCATCTTT CCCAGTACCA GGAGACGTTC
TTGAGAGAGT ACATTGATGC TGAGTCAAGG CAAGCTGGTA AATTCTTTGA AGGTGAAAGA
ACAATATGGG ACAACAAGGA ACTCGTTGGA AACCTTCCCA GCATACAAGC TGATTACAAG
AAGTACTTGG AACTGGATTC TACTTTCTTT GAAACAGTCA GAAGCCTCAA CAAGTTTGGT
TTGGCATTTG TAAACGACAT TCCGGAACCT TCAGCCGAGC TTCAGAAACT GGGAATGAAT
GAAAAGAATG CCGCAGAATG GCCGGTGTCA AAGCTTGCTA ACAAATTTGG TTATATCAAG
AAGACATTCT ATGGTACTTT ATTTGACGTT AAAAACGAAA AGGAGGAGGC AAAGAACATT
GCAAACACAA ACACATTCTT GCCGTTGCAC ATGGACCTCT TGTACTACGA ATCGCCGCCA
GGATTGCAAT TGCTTCATTT CATCAAGAAC TCTACAACAG GCGGAGAGAA TGTCTTCTGC
GATTCCTTCC TTGCGGCTGA ACATGTCAAA AATGTAGATC CAACAGCATA TGTTGCTTTG
ACGCTTGTCC CCATTACCTA TCATTATGAT AACAACAACG AGCACTATTT CTTCAAGAGG
CCTTTGGTAG TGGAAGAAGT GAAAGGCGAT ACGGCTCGTA TCAAAGAAGT CAACTACGCT
CCACCATTTC AAGGGCCATT TGAGTTTGGA ATAACCAGAA ATGACTCCGA GAGGGAAGGA
TTGTTTTTGG CTAAAGATAC CACAGACGGT CTTTTGTTCC AGGACTTTAT CAGAGGATTC
CAGCTCTTCG AAGACTTCAT CAACGACCCC GTGAACCACT ACGAAATCAA GATGCCAGAA
GGCTCTTGTG TTATATTCGA CAACAGAAGA GTTCTCCACT CCCGTCTTGG ATTCAGTGAC
TCCAACGGAG GAGATAGATG GCTCATGGGA ACCTATGTAG ACGGCGATAG TTTCAGATCC
AAGTTGAGAA TGGGCTTCAG ACACTTGAAA GAAGCTATGT AA
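
The gene sequence is laid out in space-separated blocks of ten bases, so the layout whitespace has to be removed before the sequence is measured or analysed. The sketch below shows one way to do that and to compute a GC fraction. The function names `join_blocks` and `gc_fraction` are hypothetical helpers, and the example uses only the first line of the sequence, so its result is not expected to match the 43% reported for the whole gene.

```python
def join_blocks(text: str) -> str:
    """Remove the spaces and line breaks used to lay out the sequence."""
    return "".join(text.split()).upper()


def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C (0.0 for an empty sequence)."""
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


# First line of the gene sequence above (bases 1-60 of the gene).
first_line = "ATGATACCAA ATTTGTATAG AGTTAATAAA ACGAAGCTAC CCCGCGTTTT AGCGAGATCC"
seq = join_blocks(first_line)
print(len(seq), round(gc_fraction(seq), 3))  # 60 0.383
```

Passing the full sequence text to `join_blocks` and then to `gc_fraction` gives the value for the whole gene.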
 
Protein sequence
MIPNLYRVNK TKLPRVLARS VRFQSSLAIH RYDDNYTTLV FDGDRSISFS NIFLRDSCRD 
PKSVDTYSSQ KLFTTAEIAK NLSINSPPRI RKSPDSSESV LEIEWLQNGK LHLSQYQETF
LREYIDAESR QAGKFFEGER TIWDNKELVG NLPSIQADYK KYLESDSTFF ETVRSLNKFG
LAFVNDIPEP SAELQKSGMN EKNAAEWPVS KLANKFGYIK KTFYGTLFDV KNEKEEAKNI
ANTNTFLPLH MDLLYYESPP GLQLLHFIKN STTGGENVFC DSFLAAEHVK NVDPTAYVAL
TLVPITYHYD NNNEHYFFKR PLVVEEVKGD TARIKEVNYA PPFQGPFEFG ITRNDSEREG
LFLAKDTTDG LLFQDFIRGF QLFEDFINDP VNHYEIKMPE GSCVIFDNRR VLHSRLGFSD
SNGGDRWLMG TYVDGDSFRS KLRMGFRHLK EAM
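
The protein sequence follows from the gene sequence only under translation table 12 (the alternative yeast nuclear code), which reads the codon CTG as serine rather than leucine. The sketch below is a hand-rolled illustration, not the database's own pipeline: it builds the standard table from the usual 64-letter amino-acid string, overrides CTG, and translates bases 61-90 of the gene under both tables. The names `codon_table` and `translate` are hypothetical, and codons containing ambiguous bases are not handled.

```python
BASES = "TCAG"
# Standard genetic code (table 1), codons ordered TTT, TTC, TTA, TTG, TCT, ...
STANDARD = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def codon_table(amino_acids: str) -> dict:
    """Map each of the 64 codons to its one-letter amino acid ('*' = stop)."""
    codons = [a + b + c for a in BASES for b in BASES for c in BASES]
    return dict(zip(codons, amino_acids))


TABLE_1 = codon_table(STANDARD)
# Table 12 differs from table 1 in its amino-acid assignments only at CTG.
TABLE_12 = {**TABLE_1, "CTG": "S"}


def translate(dna: str, table: dict) -> str:
    """Translate codon by codon, stopping at the first stop codon."""
    dna = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = table[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Bases 61-90 of the gene sequence above (codons 21-30).
fragment = "GTGAGATTTC AGCTGTCGCT TGCTATTCAT"
print(translate(fragment, TABLE_12))  # VRFQSSLAIH
print(translate(fragment, TABLE_1))   # VRFQLSLAIH
```

Only the table 12 result matches residues 21-30 of the protein sequence (VRFQSSLAIH); the standard code would place a leucine at position 25.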