Gene PICST_80434 details

Gene Information

Locus tag: PICST_80434
Symbol: ARO4
ID: 4851199
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Scheffersomyces stipitis CBS 6054
Kingdom: Eukaryota
Replicon accession: NC_009068
Strand:
Start bp: 1180620
End bp: 1181803
Gene Length: 1184 bp
Protein Length: 365 aa
Translation table:
GC content: 45%
IMG OID: 640392907
Product: 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase isoenzyme
Protein accession: XP_001387456
Protein GI: 126274185
COG category: [E] Amino acid transport and metabolism
COG ID: [COG0722] 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase
TIGRFAM ID: [TIGR00034] phospho-2-dehydro-3-deoxyheptonate aldolase
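
The length fields above can be cross-checked against the coordinates. The following is a minimal Python sketch, not part of the record: the variable names are illustrative, and it assumes the coordinates are 1-based and inclusive and that the unspliced gene encodes the protein plus a single stop codon.

```python
# Values copied from the Gene Information fields above.
start_bp = 1180620
end_bp = 1181803
protein_length_aa = 365

# Assumption: coordinates are 1-based and inclusive, so both endpoints count.
gene_length_bp = end_bp - start_bp + 1

# Bases needed to encode the protein plus one stop codon
# (assumes no introns, consistent with "Is gene spliced: No").
coding_bp = (protein_length_aa + 1) * 3

# Bases in the listed gene span that lie outside that coding region.
flanking_bp = gene_length_bp - coding_bp

print(gene_length_bp, coding_bp, flanking_bp)  # prints: 1184 1098 86
```

Under these assumptions the computed length matches the Gene Length field, and 86 bases of the listed span fall outside the coding region.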


Plasmid Coverage information

Num covering plasmid clones: 20
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones:
Fosmid unclonability p-value: 0.0171399
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATTTCCCTAC AACCATGTCC CAAACACCAG TACCTGATGA ATACGACGAC ACCAGAATCT 
TGGGCTACGA CCCTTTAATT CCACCTGCGT TGCTCCAGCA CGAAATCAAA GCCTCTTCTC
AATCTTTGGC TACTGTGATT AAAGGTCGTT ACGAATCTTC GCAAATCATC TCTGGAAAGG
ACGACAGAGC CTTGGTCATC GTTGGTCCAT GTTCGATTCA CGACACTGAC GCAGCCTTGG
AATACGCCGC TCGTTTGCGT AAGTTAGCCG ACGACTTGGA AGAAGACTTG GTAATTGTCA
TGAGAGCCTA CTTGGAAAAG CCTAGAACCA CTGTCGGTTG GAAGGGACTT ATTAACGATC
CAGATGTAGA TAATTCATTC GATATCAACC GTGGTTTGCG TATTTCTCGT CAGTTGTACT
CTGACTTAAC CGGCAAGATG GGTTTGCCTA TTGGTTCTGA AATGTTAGAT ACCATCTCTC
CTCAATATTT CTCGGACTTT TTGTCCTTCG GTGCCATTGG TGCCAGAACT ACTGAGTCAC
AGTTGCACAG AGAATTAGCT TCTGGACTTT CCTTCCCAAT TGGTTTCAAG AACGGTACTG
ATGGTGGTTT GACTGTTGCC TTGGATGCTG TACAAGCCTC TTCCAAGGGC CACCACTTCA
TGGGTGTTAC CAAGAATGGT ATGGCTGCCA TCACTACCAC CAAGGGTAAC GACAACTGTT
TCATCATCTT GAGAGGAGGT AAGAAGATCA CCAACTACGA TGCTGAGTCT GTTGCTTCTG
CCAAGGAAGC CATCTCCAAG TCGACCAACC CTAACATCAA GTTGATGATC GATTGTTCTC
ACGACAACTC TCAAAAGGAC TACAGAAACC AACCTAAGGT GTTGGACTCT GTTGTTGAAC
AAATCACTGC TGGTGAAGAT GCCATCATTG GTGTCATGAT CGAATCACAC ATCAACGAAG
GTAAGCAAAG TATGCCTGCT GAAGGTTGTA CCAAGGATTC ATTGAAGTAC GGTGTCTCCA
TCACTGATGG CTGTGTCTCT TGGGAATCCA CTGTTGAAAT GTTGACCAAG TTGTCTAACG
CTGTCAAAGC TAGAAGAGCC CTCAAGGCAT AAAAGTTATC ATAAGTTCCA AAAATTATAT
ATATATATAT AGGGTCTTTC CCTAATAAAG CTATTCATGT ATCT
 
Protein sequence
MSQTPVPDEY DDTRILGYDP LIPPALLQHE IKASSQSLAT VIKGRYESSQ IISGKDDRAL 
VIVGPCSIHD TDAALEYAAR LRKLADDLEE DLVIVMRAYL EKPRTTVGWK GLINDPDVDN
SFDINRGLRI SRQLYSDLTG KMGLPIGSEM LDTISPQYFS DFLSFGAIGA RTTESQLHRE
LASGLSFPIG FKNGTDGGLT VALDAVQASS KGHHFMGVTK NGMAAITTTK GNDNCFIILR
GGKKITNYDA ESVASAKEAI SKSTNPNIKL MIDCSHDNSQ KDYRNQPKVL DSVVEQITAG
EDAIIGVMIE SHINEGKQSM PAEGCTKDSL KYGVSITDGC VSWESTVEML TKLSNAVKAR
RALKA