Gene Information |
Locus tag | BURPS668_A0841 |
Symbol | |
ID | 4886895 |
Type | CDS |
Is gene spliced | No |
Is pseudo gene | No |
Organism name | Burkholderia pseudomallei 668 |
Kingdom | Bacteria |
Replicon accession | NC_009075 |
Strand | + |
Start bp | 819802 |
End bp | 821370 |
Gene Length | 1569 bp |
Protein Length | 522 aa |
Translation table | 11 |
GC content | 69% |
IMG OID | 640130781 |
Product | sulfatase family protein |
Protein accession | YP_001061840 |
Protein GI | 126444813 |
COG category | [P] Inorganic ion transport and metabolism |
COG ID | [COG3119] Arylsulfatase A and related enzymes |
TIGRFAM ID | [TIGR03417] choline-sulfatase |
|
|
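The coordinate and length fields above are mutually consistent, which the following minimal sketch checks. It assumes 1-based inclusive coordinates and a CDS that ends in a single stop codon; the function names `gene_length` and `protein_length` are illustrative and not part of any database API.

```python
def gene_length(start_bp: int, end_bp: int) -> int:
    """Length in bp of a feature given 1-based inclusive coordinates (assumed)."""
    return end_bp - start_bp + 1


def protein_length(gene_len_bp: int) -> int:
    """Residue count for an unspliced CDS, assuming one terminal stop codon."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    # One codon per residue, minus the stop codon, which encodes no residue.
    return gene_len_bp // 3 - 1


# Values taken from the record: Start bp, End bp.
length_bp = gene_length(819802, 821370)
print(length_bp, protein_length(length_bp))
```

With the record's values this reproduces the listed Gene Length (1569 bp) and Protein Length (522 aa).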
Plasmid Coverage information |
Num covering plasmid clones | 25 |
Plasmid unclonability p-value | 0.767417 |
Plasmid hitchhiking | No |
Plasmid clonability | normal |
| |
Fosmid Coverage information |
Num covering fosmid clones | n/a |
Fosmid unclonability p-value | n/a |
Fosmid Hitchhiker | n/a |
Fosmid clonability | n/a |
| |
Sequence |
Gene sequence | ATGAGCGCCC AAGCGATGCC CGATACCGCC GAACCCACCG ATATCCAGCC GAACATTCTC GTCCTGATGG CCGACCAGCT CACGCCCTTC GCGTTGCGCG CGTACGGCCA TCGCGCGACG CGTACGCCGA CGATCGACCG GCTCGCCGCC GAGGGCGTCG TCTTCGACGC CGCTTATTGC GCGAGCCCGC TCTGCGCGCC GTCGCGCTTC GCGCTGATGG CGGGCAAGCT GCCGTCGGCG CTCGGCGCTT ACGATAACGC CGCCGAATTG CCGGCGCAAA CGCTGACGTT CGCGCACTAC CTGCGCGCGG CCGGTTACCG GACGATGCTG TCGGGCAAGA TGCACTTCTG CGGGCCCGAT CAGTTGCACG GCTTCGAGGA ACGGCTCACG ACCGACATCT ATCCGGCCGA TTTCGGCTGG GTGCCGGACT GGACGCGTCC CGCCGAGCGG CCGAGCTGGT ATCACAACAT GAGCTCGGTG CTCGACGCCG GGCCTTGCGT GCGGACCAAC CAGCTCGATT TCGACGACGA TGCGACGTTC GCCGCGCGCC AGAAGATCTT CGACGTCGCG CGCGAGCGCG CGGCCGGCCG GGACGCGCGG CCGTTCTGCA TGGTCGTGTC GCTCACGCAT CCGCATGATC CGTATGCGAT CACGCGCGAA TACTGGGATC TGTACCGGGA CGAGGACATC GATCTGCCCG CCGTGCGGAT GGATTTCGAC GCGAGCGACC CGCATTCGCG GCGGCTGCGC GCCGTATGCG AGGTCGATCG CACGCCGCCG GAGGACCTGC AGATCCGGCG CGCGCGGCGC GCGTACTACG GCGCGACGTC CTATGTCGAC GCGCAGTTCG GCGCGCTGCT CGCGACGCTC GAGCAATGCG GGCTCGCCGA CGACACGATC GTGATCGTCA CCGCCGATCA CGGCGACATG CTCGGCGAGC GCGGCCTCTG GTACAAGATG ACGTTCTTCG AAGGCGCATG CCGCGTGCCG CTCATCGTGC ACGCGCCGCG CCGGTTTCCG GCCGCGCGCG TGCCGGCGGC CGTGTCGCAC GTCGATCTGC TGCCGACGCT CGTCGAGCTC GCGACGGGCG AGCGCCGCGC CGACTGGCCC GACGCCGTCG ACGGCCGCAG CCTCGTTCCC CATCTGCGCG GCGAAGGCGG CCATGACGAG GCGTTCGGCG AATATCTGGC CGAAGGCGCG ATCGCGCCGA TCGTGATGAT GCGCCGCGGC AGCCACAAGT ACATCCATTC GCCCGCGGAT CCGGATCAGC TCTTCGATCT GAGGAATGAT CCGCGCGAGC TCGACAATCT CGCGAACACG CCCGCCGCGG CAAAGCACGT CGCCGCGTTT CGCATGGAGC GCGTCGCGCG CTGGGATCTC GATGCGCTGC ATCAGCAGGT GCTCGCGAGC CAGCGCAGGC GGCGCTTCCA TTTCGAGGCG ACGACCCAGG GGCGAATCCG GTCGTGGGAC TGGCAGCCGT TCCAGGATGC GAGCCAGCGT TACATGCGCA ATCACCTCGA ACTCGACGCG CTCGAGGCAG CCGCGCGTTT TCCTCGTCCG CACGCATGA
|
Protein sequence | MSAQAMPDTA EPTDIQPNIL VLMADQLTPF ALRAYGHRAT RTPTIDRLAA EGVVFDAAYC ASPLCAPSRF ALMAGKLPSA LGAYDNAAEL PAQTLTFAHY LRAAGYRTML SGKMHFCGPD QLHGFEERLT TDIYPADFGW VPDWTRPAER PSWYHNMSSV LDAGPCVRTN QLDFDDDATF AARQKIFDVA RERAAGRDAR PFCMVVSLTH PHDPYAITRE YWDLYRDEDI DLPAVRMDFD ASDPHSRRLR AVCEVDRTPP EDLQIRRARR AYYGATSYVD AQFGALLATL EQCGLADDTI VIVTADHGDM LGERGLWYKM TFFEGACRVP LIVHAPRRFP AARVPAAVSH VDLLPTLVEL ATGERRADWP DAVDGRSLVP HLRGEGGHDE AFGEYLAEGA IAPIVMMRRG SHKYIHSPAD PDQLFDLRND PRELDNLANT PAAAKHVAAF RMERVARWDL DALHQQVLAS QRRRRFHFEA TTQGRIRSWD WQPFQDASQR YMRNHLELDA LEAAARFPRP HA
|
| |
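As a rough illustration of how the gene sequence relates to the protein sequence and the GC content field, the sketch below translates a space-grouped DNA string and computes its GC percentage. It is a minimal sketch, not the database's own pipeline: the helper names (`clean`, `translate`, `gc_percent`) are hypothetical, and it relies on the fact that NCBI translation table 11 assigns the same amino acids to codons as the standard table (the two differ only in permitted start codons, which does not matter here because the gene starts with ATG).

```python
from itertools import product

# Codon-to-amino-acid assignments in NCBI order (bases T, C, A, G).
# Table 11 shares these assignments with the standard table.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def clean(seq: str) -> str:
    """Remove the 10-character grouping spaces and normalise to upper case."""
    return "".join(seq.split()).upper()


def translate(dna: str) -> str:
    """Translate a coding sequence, stopping at the first stop codon."""
    dna = clean(dna)
    residues = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":  # stop codon: end of the protein
            break
        residues.append(aa)
    return "".join(residues)


def gc_percent(dna: str) -> float:
    """Percentage of G and C bases in the sequence."""
    dna = clean(dna)
    return 100.0 * sum(base in "GC" for base in dna) / len(dna)


# First 30 bases of the gene sequence in this record.
print(translate("ATGAGCGCCC AAGCGATGCC CGATACCGCC"))
```

Passing the full Gene sequence value to `translate` and comparing the result with the Protein sequence value (after removing its grouping spaces with `clean`) is one way to check the record; `gc_percent` on the same input can be compared against the GC content field.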