Gene Pars_2044 details


Gene Information

Locus tag: Pars_2044
Symbol:
ID: 5056302
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Pyrobaculum arsenaticum DSM 13514
Kingdom: Archaea
Replicon accession: NC_009376
Strand:
Start bp: 1826728
End bp: 1828785
Gene Length: 2058 bp
Protein Length: 685 aa
Translation table: 11
GC content: 62%
IMG OID: 640469593
Product: Alpha-glucosidase
Protein accession: YP_001154242
Protein GI: 145592240
COG category: [G] Carbohydrate transport and metabolism
COG ID: [COG1501] Alpha-glucosidases, family 31 of glycosyl hydrolases
TIGRFAM ID:
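
The Start bp, End bp, Gene Length and Protein Length fields are mutually consistent, which can be checked with a little arithmetic. The sketch below is illustrative only: it assumes the coordinates are 1-based and inclusive and that the CDS is unspliced and ends in a single stop codon. The function name cds_lengths is hypothetical and not part of any database tool.

```python
def cds_lengths(start_bp: int, end_bp: int) -> tuple:
    """Return (gene length in bp, protein length in aa) for an unspliced CDS.

    Assumes 1-based inclusive coordinates and one terminal stop codon.
    """
    gene_len = end_bp - start_bp + 1  # inclusive coordinates
    if gene_len % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    protein_len = gene_len // 3 - 1  # the stop codon encodes no residue
    return gene_len, protein_len


# Coordinates taken from the record above.
gene_len, protein_len = cds_lengths(1826728, 1828785)
print(gene_len, "bp;", protein_len, "aa")
```

With the values listed above this gives 2058 bp and 685 aa, matching the Gene Length and Protein Length fields.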


Plasmid Coverage information

Num covering plasmid clones: 16
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 32
Fosmid unclonability p-value:
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
GTGTCCCTAG AGCTTGGGAG GAAGAGCGTT AAGCTCGCGT TGGGGAGGCA GATCCTCTCC 
TTCCCCCTCC CAGGTGGCGA GCCCAGGCCG GCCGCCGAGT TTGACGGCAA GTCGCTGAGG
GCCACTTGTT GCGGCGTAGG TGTGGAGCTG TTAGTGGAGA GGCGGGGCGG CGCGACTGTT
GTTAGGAAGA GACTGGGGGC GAGGGAGCAC GTCTTCGGTC TCGGGACTAG GGCCTACCCG
CCGGATAGGA GGAGGGGACG CTTCATCCTC TTCAATAACG ACTTGTACGG ATACCAGCTG
GGGATGGACC CCCTATACGC CTCGATCCCC TTTATGGTGT TTGTAGAGGA CGGGAGGGCC
TTCGGCCTCG TGGTTAACAG CCCGGCTTAC GGCGTCGTCG ACGTGGGGTT TTCGAAATAC
AGCGAGGCGG TCGTGGAGGT GGAGGATCTC CCGGAGCTGT ATATACTCTT CGGCGAGGGG
CCGCTGGATG TATATACCAC CTACAGCGAG GTTACGGGGA GGCCCTTCCT CCCGCCGTCG
TGGGCCCTGG GCCTGCACCT ATCGCGCTAC AGCTACGAGC CCCAGGACGC GGCGGCCGAG
GTGGTGCGGG AGGTCGCACG GGAGGTGCCC CTTGACGCGG TGTATCTCGA CATAGACTAC
ATGGAGGGGT ACAAGCAGTT CACGTGGGAT TTGAAGAAGT TCCCAGACCC CGCCGGGTTT
GTCCACGAGA TACGCGAGCT GGGCGTCAGG GTGGTGGCGA TTGTAGACCC CTACATAAAG
GCGGAGCCGG GGTACAGGCC GTTCCGCGAG TTGCTTGACT GCCTCCTCGT GACGGAGAAC
GACGAGCTCT TCCTCGCCAG GGGGTGGCCG GGGCTCTCCG CCCTGCCGGA CTTCCTAAAC
CCCAAGTGCA GGGAGAGGTG GGGCGACTTA ATAGCCGACT TTGTAAAGAC GTACGGCATA
GACGGCATCT GGCTTGACAT GAACGAGCCG ACTGTTTTCA ACTGCGACGC CCTAGCTACT
AGGTCTAGGA TATACGCCCT GGTCGGTGCC ACGCCGCACG GCCTAACAAA AGAGGAGCTT
TTATGCAAGG CGCCGAGAGG CGCCTACCAT GTAGTGGAGG GGGAGAAAAT CACCCACGAG
AGAGTCCGGG GGCTTTACCC CTACTTCGAG GCGGAGGCCA CCTACCGCGG CCTCCTCAAG
GCGGGAAGGG AGCCCTTTAT CCTCTCCCGG TCCGGCTACC TCGGCATACA GAAATACGCG
GCGTTGTGGA CTGGGGACGT CCCCTCCACC TGGGAGGGCT TGAGGCTGAC CCTCATGACG
GTGCTTGGCC TCTCGGCCTC CGGCGTGCCC TTCGTGGGCG CGGATGTGGG CGGCTTCGCC
GGGCTGGGTG ACTACGAGCT AATCGCCAGG TGGTACCAAG CGGCGGCCTT CTTCCCCATC
TACCGCGTCC ACAGAGACAA GGGCACCACA GACGCGGAGC CCACGAGGCT ACCGGCGCGG
TACCGACACA TGGCGCTGGA GGCTATTAAG ACGAGGCTTA GGTTTCTGCC CTATCTAAGA
CACCTGGCAT GGGCCGCCCA CCTACACGGC TACCCCATCG TGAGGCCGCT GGGCCTTGAG
TTCCCCGACG ACGAAGACGC CTTTAAGATA CATGACGAGT ACATGGTGGG GCCTTTCCTC
CTCTTCGCCC CTATTGTGGA CAAGGGAGCC AAGCAGAGGG ACGTCTACCT CCCGCGGGGG
GCGTGGCTGG AGCTAAGCAC GGGGAAGACC CACCTAGGCC CCAGCTGGAC CCTCTCCGAG
TCGGACATGC CGCTCTACAT CCGGTCGGGG TCGGCCGTGC CCGCCCAGGA TGGGCTTTAC
ATATACGGGG AGGGGGCGTG GACGATCTAC ACAGACGGGG GCGTCGTGGA GGTTGCGAGA
GAGGGCGGGA AGGTGAGGCT AAAGGGATGG ATGGGGAGTG TATATATCCT CGGCGAGAGG
CTAGACGAGG TTTATATAAA CGGCGCGGCG AAAACAGGCG TTGCCACAAA ACTCGGCACC
TATCTAGAGA CGACCTAG
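
The gene sequence is printed in blocks of 10 bases, 60 per line, so it has to be rejoined before any analysis. Here is a minimal sketch of that step plus a GC-fraction helper. The names join_blocks and gc_fraction are hypothetical, and this is not necessarily how the GC content field in the record was computed. Under translation table 11 the leading GTG acts as a start codon, which is why the protein begins with M.

```python
def join_blocks(lines):
    """Concatenate sequence lines, dropping the spaces between 10-base blocks."""
    return "".join("".join(line.split()) for line in lines)


def gc_fraction(seq):
    """Return the fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


# Final line of the gene sequence above: 18 bases ending in the TAG stop codon.
last_line = join_blocks(["TATCTAGAGA CGACCTAG"])
print(len(last_line), last_line[-3:])
```

Passing all gene sequence lines to join_blocks should yield a 2058-base string; gc_fraction can then be compared with the GC content field.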
 
Protein sequence
MSLELGRKSV KLALGRQILS FPLPGGEPRP AAEFDGKSLR ATCCGVGVEL LVERRGGATV 
VRKRLGAREH VFGLGTRAYP PDRRRGRFIL FNNDLYGYQL GMDPLYASIP FMVFVEDGRA
FGLVVNSPAY GVVDVGFSKY SEAVVEVEDL PELYILFGEG PLDVYTTYSE VTGRPFLPPS
WALGLHLSRY SYEPQDAAAE VVREVAREVP LDAVYLDIDY MEGYKQFTWD LKKFPDPAGF
VHEIRELGVR VVAIVDPYIK AEPGYRPFRE LLDCLLVTEN DELFLARGWP GLSALPDFLN
PKCRERWGDL IADFVKTYGI DGIWLDMNEP TVFNCDALAT RSRIYALVGA TPHGLTKEEL
LCKAPRGAYH VVEGEKITHE RVRGLYPYFE AEATYRGLLK AGREPFILSR SGYLGIQKYA
ALWTGDVPST WEGLRLTLMT VLGLSASGVP FVGADVGGFA GLGDYELIAR WYQAAAFFPI
YRVHRDKGTT DAEPTRLPAR YRHMALEAIK TRLRFLPYLR HLAWAAHLHG YPIVRPLGLE
FPDDEDAFKI HDEYMVGPFL LFAPIVDKGA KQRDVYLPRG AWLELSTGKT HLGPSWTLSE
SDMPLYIRSG SAVPAQDGLY IYGEGAWTIY TDGGVVEVAR EGGKVRLKGW MGSVYILGER
LDEVYINGAA KTGVATKLGT YLETT