Gene BURPS668_A0841 details


Gene Information

Locus tag: BURPS668_A0841
Symbol:
ID: 4886895
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Burkholderia pseudomallei 668
Kingdom: Bacteria
Replicon accession: NC_009075
Strand:
Start bp: 819802
End bp: 821370
Gene Length: 1569 bp
Protein Length: 522 aa
Translation table: 11
GC content: 69%
IMG OID: 640130781
Product: sulfatase family protein
Protein accession: YP_001061840
Protein GI: 126444813
COG category: [P] Inorganic ion transport and metabolism
COG ID: [COG3119] Arylsulfatase A and related enzymes
TIGRFAM ID: [TIGR03417] choline-sulfatase
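
The start and end coordinates, the gene length, and the protein length listed above are mutually consistent. The minimal sketch below checks that arithmetic. It assumes the coordinates are 1-based and inclusive and that the final codon is a stop codon that is not counted in the protein; neither assumption is stated in the record itself, and the variable names are illustrative only.

```python
# Values copied from the Gene Information table.
start_bp = 819802
end_bp = 821370

# Assumption: coordinates are 1-based and inclusive, so both ends count.
gene_length_bp = end_bp - start_bp + 1

# Assumption: the last codon is a stop codon and is excluded from the protein.
expected_protein_aa = gene_length_bp // 3 - 1

print(gene_length_bp)       # 1569, matches "Gene Length"
print(expected_protein_aa)  # 522, matches "Protein Length"
```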


Plasmid Coverage information

Num covering plasmid clones: 25
Plasmid unclonability p-value: 0.767417
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: n/a
Fosmid unclonability p-value: n/a
Fosmid Hitchhiker: n/a
Fosmid clonability: n/a
 

Sequence

Gene sequence
ATGAGCGCCC AAGCGATGCC CGATACCGCC GAACCCACCG ATATCCAGCC GAACATTCTC 
GTCCTGATGG CCGACCAGCT CACGCCCTTC GCGTTGCGCG CGTACGGCCA TCGCGCGACG
CGTACGCCGA CGATCGACCG GCTCGCCGCC GAGGGCGTCG TCTTCGACGC CGCTTATTGC
GCGAGCCCGC TCTGCGCGCC GTCGCGCTTC GCGCTGATGG CGGGCAAGCT GCCGTCGGCG
CTCGGCGCTT ACGATAACGC CGCCGAATTG CCGGCGCAAA CGCTGACGTT CGCGCACTAC
CTGCGCGCGG CCGGTTACCG GACGATGCTG TCGGGCAAGA TGCACTTCTG CGGGCCCGAT
CAGTTGCACG GCTTCGAGGA ACGGCTCACG ACCGACATCT ATCCGGCCGA TTTCGGCTGG
GTGCCGGACT GGACGCGTCC CGCCGAGCGG CCGAGCTGGT ATCACAACAT GAGCTCGGTG
CTCGACGCCG GGCCTTGCGT GCGGACCAAC CAGCTCGATT TCGACGACGA TGCGACGTTC
GCCGCGCGCC AGAAGATCTT CGACGTCGCG CGCGAGCGCG CGGCCGGCCG GGACGCGCGG
CCGTTCTGCA TGGTCGTGTC GCTCACGCAT CCGCATGATC CGTATGCGAT CACGCGCGAA
TACTGGGATC TGTACCGGGA CGAGGACATC GATCTGCCCG CCGTGCGGAT GGATTTCGAC
GCGAGCGACC CGCATTCGCG GCGGCTGCGC GCCGTATGCG AGGTCGATCG CACGCCGCCG
GAGGACCTGC AGATCCGGCG CGCGCGGCGC GCGTACTACG GCGCGACGTC CTATGTCGAC
GCGCAGTTCG GCGCGCTGCT CGCGACGCTC GAGCAATGCG GGCTCGCCGA CGACACGATC
GTGATCGTCA CCGCCGATCA CGGCGACATG CTCGGCGAGC GCGGCCTCTG GTACAAGATG
ACGTTCTTCG AAGGCGCATG CCGCGTGCCG CTCATCGTGC ACGCGCCGCG CCGGTTTCCG
GCCGCGCGCG TGCCGGCGGC CGTGTCGCAC GTCGATCTGC TGCCGACGCT CGTCGAGCTC
GCGACGGGCG AGCGCCGCGC CGACTGGCCC GACGCCGTCG ACGGCCGCAG CCTCGTTCCC
CATCTGCGCG GCGAAGGCGG CCATGACGAG GCGTTCGGCG AATATCTGGC CGAAGGCGCG
ATCGCGCCGA TCGTGATGAT GCGCCGCGGC AGCCACAAGT ACATCCATTC GCCCGCGGAT
CCGGATCAGC TCTTCGATCT GAGGAATGAT CCGCGCGAGC TCGACAATCT CGCGAACACG
CCCGCCGCGG CAAAGCACGT CGCCGCGTTT CGCATGGAGC GCGTCGCGCG CTGGGATCTC
GATGCGCTGC ATCAGCAGGT GCTCGCGAGC CAGCGCAGGC GGCGCTTCCA TTTCGAGGCG
ACGACCCAGG GGCGAATCCG GTCGTGGGAC TGGCAGCCGT TCCAGGATGC GAGCCAGCGT
TACATGCGCA ATCACCTCGA ACTCGACGCG CTCGAGGCAG CCGCGCGTTT TCCTCGTCCG
CACGCATGA
 
Protein sequence
MSAQAMPDTA EPTDIQPNIL VLMADQLTPF ALRAYGHRAT RTPTIDRLAA EGVVFDAAYC 
ASPLCAPSRF ALMAGKLPSA LGAYDNAAEL PAQTLTFAHY LRAAGYRTML SGKMHFCGPD
QLHGFEERLT TDIYPADFGW VPDWTRPAER PSWYHNMSSV LDAGPCVRTN QLDFDDDATF
AARQKIFDVA RERAAGRDAR PFCMVVSLTH PHDPYAITRE YWDLYRDEDI DLPAVRMDFD
ASDPHSRRLR AVCEVDRTPP EDLQIRRARR AYYGATSYVD AQFGALLATL EQCGLADDTI
VIVTADHGDM LGERGLWYKM TFFEGACRVP LIVHAPRRFP AARVPAAVSH VDLLPTLVEL
ATGERRADWP DAVDGRSLVP HLRGEGGHDE AFGEYLAEGA IAPIVMMRRG SHKYIHSPAD
PDQLFDLRND PRELDNLANT PAAAKHVAAF RMERVARWDL DALHQQVLAS QRRRRFHFEA
TTQGRIRSWD WQPFQDASQR YMRNHLELDA LEAAARFPRP HA
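
The sequences above are printed in space-separated blocks of ten characters, so they need to be joined before any downstream use. The sketch below shows one way to do that and to compute a GC fraction of the kind reported in the "GC content" field. The helper names `clean_sequence` and `gc_fraction` are hypothetical and not part of any database tool, and the example input is only the first two blocks of the gene sequence, not the full gene.

```python
def clean_sequence(raw: str) -> str:
    """Remove spaces and line breaks and upper-case the sequence."""
    return "".join(raw.split()).upper()


def gc_fraction(dna: str) -> float:
    """Return the fraction of bases that are G or C (0.0 for empty input)."""
    if not dna:
        return 0.0
    gc_count = sum(1 for base in dna if base in "GC")
    return gc_count / len(dna)


# First two ten-base blocks of the gene sequence, used only as a short demo.
example = clean_sequence("ATGAGCGCCC AAGCGATGCC")
print(len(example))          # 20
print(gc_fraction(example))  # 0.65
```

Pasting the complete "Gene sequence" block into `clean_sequence` should give a string whose length equals the listed gene length, which is a quick check that no blocks were lost when copying.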