Gene Information |
Locus tag | SbBS512_E0452 |
Symbol | purK |
ID | 6271010 |
Type | CDS |
Is gene spliced | No |
Is pseudo gene | No |
Organism name | Shigella boydii CDC 3083-94 |
Kingdom | Bacteria |
Replicon accession | NC_010658 |
Strand | + |
Start bp | 440070 |
End bp | 441137 |
Gene Length | 1068 bp |
Protein Length | 355 aa |
Translation table | 11 |
GC content | 57% |
IMG OID | 641724677 |
Product | phosphoribosylaminoimidazole carboxylase ATPase subunit |
Protein accession | YP_001879225 |
Protein GI | 187730443 |
COG category | [F] Nucleotide transport and metabolism |
COG ID | [COG0026] Phosphoribosylaminoimidazole carboxylase (NCAIR synthetase) |
TIGRFAM ID | [TIGR01161] phosphoribosylaminoimidazole carboxylase, PurK protein |
|
|
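The coordinate and length fields above can be cross-checked with simple arithmetic. The sketch below is a sanity check, not part of the record: it assumes the start and end positions are 1-based and inclusive, and that the CDS ends in a single stop codon that is not counted in the protein length. The function names are hypothetical.

```python
# Values copied from the Gene Information table.
START_BP = 440070
END_BP = 441137
GENE_LENGTH_BP = 1068
PROTEIN_LENGTH_AA = 355


def span_length(start: int, end: int) -> int:
    """Length of a feature, assuming 1-based inclusive coordinates."""
    return end - start + 1


def expected_protein_length(cds_length_bp: int) -> int:
    """Residue count for an unspliced CDS, assuming one terminal stop codon."""
    if cds_length_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return cds_length_bp // 3 - 1


if __name__ == "__main__":
    # 441137 - 440070 + 1 = 1068 bp; 1068 / 3 - 1 = 355 aa
    print(span_length(START_BP, END_BP) == GENE_LENGTH_BP)
    print(expected_protein_length(GENE_LENGTH_BP) == PROTEIN_LENGTH_AA)
```

Under these assumptions the reported gene length and protein length are consistent with the listed coordinates.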
Plasmid Coverage information |
Num covering plasmid clones | 14 |
Plasmid unclonability p-value | 0.0002427 |
Plasmid hitchhiking | Yes |
Plasmid clonability | hitchhiker |
| |
Fosmid Coverage information |
Num covering fosmid clones | n/a |
Fosmid unclonability p-value | n/a |
Fosmid Hitchhiker | n/a |
Fosmid clonability | n/a |
| |
Sequence |
Gene sequence | ATGAAACAGG TTTGCGTCCT CGGTAACGGG CAGTTAGGCC GTATGCTGCG TCAGGCAGGC GAACCGTTAG GCATTGCTGT CTGGCCGGTC GGGCTGGACG CTGAACCGGC GGCGGTGCCT TTTCAACAAA GCGTGATTAC CGCTGAGATC GAACGCTGGC CGGAAACCGT ATTAACCCGC GAGCTGGCGC GTCATCCGGC CTTTGTGAAC CGCGATGTGT TCCCGATTAT TGCTGACCGT CTGACTCAGA AGCAGCTTTT CGATAAGCTC CACCTGCCGA CCGCGCCGTG GCAGTTACTT GCCGATCGCA GCGAGTGGCC TGCGGTGTTT GATCGTTTAG GTGAACTGGC GATTGTTAAG CGTCGCACTG GTGGCTATGA CGGTCGCGGT CAATGGCGTT TACGCGCCAA TGAAACCGAA CAGTTACCGG CAGAGTGTTA CGGCGAATGT ATTGTCGAGC AGGGCATTAA CTTCTCTGGT GAAGTGTCGC TGGTTGGCGC GCGCGGCTTT GATGGCAGCA CCGTGTTTTA TCCGCTGACG CATAACCTGC ATCAGGACGG TATTTTGCGC ACCAGCGTCG CTTTTCCGCA GGCCAACGCA CAGCAGCAGG CGCAAGCCGA AGAGATGCTG TCGGCGATTA TGCAGGAGCT GGGCTATGTG GGCGTGATGA CGATGGAGTG TTTTGTCACC CCGCAAGGTC TGCTGATCAA CGAACTGGCT CCGCGTGTGC ATAACAGCGG TCACTGGACA CAAAACGGTG CCAGCATCAG CCAGTTTGAG CTGCATCTGC GGGCGATTAC CGCTCTGCCG TTACCGCAAC CAGTAGTGAA TAATCCGTCG GTGATGATCA ATCTGATTGG TAGCGATGCG AATTATGACT GGCTGAAATT GCCGCTGGTG CATCTGCACT GGTACGACAA AGAAGTCCAT CCGGGGCGTA AAGTGGGGCA TCTGAATTTG ACCGACAGCG ACACATCGCG TCTGACTGCG ACGCTGGAAG CCTTAATCCC GCTGCTGCCG CCAGAGTATG CCAGCGGCGT GATTTGGGCG CAGAGTAAGT TCAGTTAA
|
Protein sequence | MKQVCVLGNG QLGRMLRQAG EPLGIAVWPV GLDAEPAAVP FQQSVITAEI ERWPETVLTR ELARHPAFVN RDVFPIIADR LTQKQLFDKL HLPTAPWQLL ADRSEWPAVF DRLGELAIVK RRTGGYDGRG QWRLRANETE QLPAECYGEC IVEQGINFSG EVSLVGARGF DGSTVFYPLT HNLHQDGILR TSVAFPQANA QQQAQAEEML SAIMQELGYV GVMTMECFVT PQGLLINELA PRVHNSGHWT QNGASISQFE LHLRAITALP LPQPVVNNPS VMINLIGSDA NYDWLKLPLV HLHWYDKEVH PGRKVGHLNL TDSDTSRLTA TLEALIPLLP PEYASGVIWA QSKFS
|
| |
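The gene and protein sequences above can be compared by translating the nucleotide sequence. The following is a minimal sketch and not the database's own pipeline: `CODON_TABLE` and `translate` are hypothetical names, and the table is built from the standard codon assignments, which translation table 11 shares (table 11 differs from the standard code only in its permitted start codons). Only the first and last codons of the record are used here to keep the example short.

```python
from itertools import product

# Standard amino acid assignments, codons ordered with bases T, C, A, G.
# Translation table 11 uses the same codon-to-amino-acid mapping.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1, stopping at the first stop codon.

    Whitespace (such as the 10-base grouping used in the record) is ignored.
    """
    seq = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":  # stop codon ends the protein
            break
        protein.append(aa)
    return "".join(protein)


if __name__ == "__main__":
    # First 30 bases of the gene sequence, as grouped in the record.
    print(translate("ATGAAACAGG TTTGCGTCCT CGGTAACGGG"))
    # Last 12 bases of the gene sequence, ending in the TAA stop codon.
    print(translate("AAGTTCAGTTAA"))
```

Passing the full gene sequence to `translate` and comparing the result with the protein sequence field (after removing the spaces) would extend the same check to the whole record.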