Gene Pars_0452 details


Gene Information

Locus tag: Pars_0452
Symbol:
ID: 5056101
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Pyrobaculum arsenaticum DSM 13514
Kingdom: Archaea
Replicon accession: NC_009376
Strand:
Start bp: 394949
End bp: 396037
Gene Length: 1089 bp
Protein Length: 362 aa
Translation table: 11
GC content: 58%
IMG OID: 640468017
Product: cellulase
Protein accession: YP_001152702
Protein GI: 145590700
COG category: [G] Carbohydrate transport and metabolism
COG ID: [COG1363] Cellulase M and related proteins
TIGRFAM ID:
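
The length fields in this record can be cross-checked against the coordinates. The short sketch below assumes that the start and end positions are 1-based and inclusive, and that the final codon is a stop codon that does not contribute a residue. The variable names are illustrative and not part of the record.

```python
# Values copied from the Gene Information fields.
start_bp = 394949
end_bp = 396037

# Assumption: coordinates are 1-based and inclusive, so both ends count.
gene_length = end_bp - start_bp + 1

# A CDS is read in triplets; the last codon is assumed to be the stop codon.
codon_count = gene_length // 3
protein_length = codon_count - 1

print(gene_length)     # 1089, matches "Gene Length"
print(protein_length)  # 362, matches "Protein Length"
```

Under these assumptions the coordinates, the gene length, and the protein length agree with one another.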


Plasmid Coverage information

Num covering plasmid clones: 14
Plasmid unclonability p-value: 0.547156
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 37
Fosmid unclonability p-value:
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGGAGGACT TCGTACAGCT TTTGAAAAAG CTCTCGGAGG CGAGGGGGCC GTCGGGCTTT 
GAGGACGAGG TTAGAGAGGT TGTAATAAGG GAAATGGAGC CTTATGTGGA CGAGGTAGTT
GTGGATAAGT GGGGCAACGT CATCGGGGTG AAGAGGGGTT CCTCAGACTA CAGGGCCATG
GTGGCGGCGC ACATCGACGA GATCGGACTC GTCGTGGACC ACATAGAGAA GGAGGGCTAC
CTGAGATTCA GACCAATTGG AGGGTGGAAT GAGGTTACTC TCGTCAGCCA GCGGGTCTGG
GTGAGGACTT CAGATGGCCG GTGGATAAGA GGGGTTGTGG GGTCTCTGCC GCCGCATGTA
ACGCCGAGCG GGAGGGAGCG CGAGGCGCCT GAGATTAAGG ACTTGTATAT AGACATCGGC
GTGTATAGCA GAGAGGAGGC GGAGAAGCTA GGCGTCACCG TCGGATCTGT GGTTGTGCTG
GATAGGGAAT TCGCCGTGTT GAACGGAAAG GTGGTTACCG GGAAGGCGTT TGACGACAGG
GTGGGAGTAG CCGTGATGCT CTACGCCTTG AGGATGCTGG AAAAACTACC TGTCACCCTC
TACGCCGTGG CGACCGTCCA GGAGGAGGTT GGACTCCGCG GGGCAAGCGT CGCTGCAGAG
CGAATTAATC CCCACTACGC TCTTGCCTTG GACACCACAA TTGCCGCCGA CGTGCCGGGC
GTGGGGGAGA GGCTCCACGT AACTAAGCTG GGCAAGGGCC CGGCCATAAA GGTGCTGGAC
GGCGGTAGGG GCGGCCTATT CATAGCCCAC CCAGGTCTGA GGGATCACAT CGTGAAGTTG
GCGAGGGAGC TCGGTATTCC CTACCAGATG GAGGTTCTTT ACGGCGGTAC CACCGACGCC
ATGGCCATAG CCTTTAGGAG GGAGGGCGTC CCCGCCGCTG TTATCTCGGT GCCTACGCGG
TATATCCACT CCCCGGTAGA GGTGCTCGAC GTGGAGGACG CTGTAAACGC GGCTAAGTTG
CTTAAGGCAA CGCTGGAAAG GACTACGCCG GAGATCGTGG AGAAGTTTCT TGACAAGAGA
GTTAAGTAG
 
Protein sequence
MEDFVQLLKK LSEARGPSGF EDEVREVVIR EMEPYVDEVV VDKWGNVIGV KRGSSDYRAM 
VAAHIDEIGL VVDHIEKEGY LRFRPIGGWN EVTLVSQRVW VRTSDGRWIR GVVGSLPPHV
TPSGREREAP EIKDLYIDIG VYSREEAEKL GVTVGSVVVL DREFAVLNGK VVTGKAFDDR
VGVAVMLYAL RMLEKLPVTL YAVATVQEEV GLRGASVAAE RINPHYALAL DTTIAADVPG
VGERLHVTKL GKGPAIKVLD GGRGGLFIAH PGLRDHIVKL ARELGIPYQM EVLYGGTTDA
MAIAFRREGV PAAVISVPTR YIHSPVEVLD VEDAVNAAKL LKATLERTTP EIVEKFLDKR
VK
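
The relationship between the gene sequence and the protein sequence can be illustrated with a small translation sketch. It assumes the sequence is given in the space-separated blocks of ten used above, and it builds the codon-to-amino-acid mapping by hand. Translation table 11 uses the same codon assignments as the standard code for internal codons; alternative start codons are not handled here, which is acceptable for this gene because it begins with ATG. The helper names (`clean`, `gc_fraction`, `translate`) are hypothetical and introduced only for this example.

```python
# Codon assignments in TCAG order (shared by the standard code and table 11).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def clean(seq: str) -> str:
    """Remove the spaces and line breaks used for display and upper-case the bases."""
    return "".join(seq.split()).upper()


def gc_fraction(seq: str) -> float:
    """Return the fraction of bases that are G or C."""
    dna = clean(seq)
    return (dna.count("G") + dna.count("C")) / len(dna)


def translate(seq: str) -> str:
    """Translate complete codons from the first base, stopping at a stop codon."""
    dna = clean(seq)
    residues = []
    for pos in range(0, len(dna) - len(dna) % 3, 3):
        amino_acid = CODON_TABLE[dna[pos:pos + 3]]
        if amino_acid == "*":
            break
        residues.append(amino_acid)
    return "".join(residues)


# First row of the gene sequence shown above.
first_row = "ATGGAGGACT TCGTACAGCT TTTGAAAAAG CTCTCGGAGG CGAGGGGGCC GTCGGGCTTT"
print(translate(first_row))  # MEDFVQLLKKLSEARGPSGF
```

The translated first row reproduces the first twenty residues of the protein sequence. Passing the full gene sequence to `gc_fraction` would give a value to compare against the reported GC content.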