Gene Nmul_A0133 details


Gene Information

Locus tag: Nmul_A0133
Symbol: purH
ID: 3785781
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Nitrosospira multiformis ATCC 25196
Kingdom: Bacteria
Replicon accession: NC_007614
Strand:
Start bp: 137927
End bp: 139489
Gene Length: 1563 bp
Protein Length: 520 aa
Translation table: 11
GC content: 59%
IMG OID: 637810203
Product: bifunctional phosphoribosylaminoimidazolecarboxamide formyltransferase/IMP cyclohydrolase
Protein accession: YP_410834
Protein GI: 82701268
COG category: [F] Nucleotide transport and metabolism
COG ID: [COG0138] AICAR transformylase/IMP cyclohydrolase PurH (only IMP cyclohydrolase domain in Aful)
TIGRFAM ID: [TIGR00355] phosphoribosylaminoimidazolecarboxamide formyltransferase/IMP cyclohydrolase
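
The coordinates and lengths listed above are consistent with one another, which a little arithmetic can confirm. The sketch below assumes the start and end positions are 1-based and inclusive, and that the final codon of the CDS is a stop codon; the variable names are illustrative only and are not part of the record.

```python
# Values copied from the Gene Information fields above.
start_bp = 137927
end_bp = 139489

# Assumption: coordinates are 1-based and inclusive, so add 1.
gene_length = end_bp - start_bp + 1

# Each codon is 3 bases; the last codon is assumed to be a stop codon
# and therefore does not contribute an amino acid.
codons = gene_length // 3
protein_length = codons - 1

print(gene_length)     # 1563
print(protein_length)  # 520
```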


Plasmid Coverage information

Num covering plasmid clones: 15
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: n/a
Fosmid unclonability p-value: n/a
Fosmid Hitchhiker: n/a
Fosmid clonability: n/a
 

Sequence

Gene sequence
ATGTCGATTA AACAGGCCCT GATCAGTGTT TCAGATAAAA GCGGTATTGT CGAGTTTGCT 
CAAGCGCTGC ACAAACTTGG AGTAACCATT CTGTCAACGG GCGGCACCGC CAGACTTTTG
AGGGATGCCG GCGTTGCCGT GACCGAGGTC GGGAGCTACA CCGGCTTTCC CGAGATGCTG
GATGGGCGGG TCAAGACCCT GCATCCAAAA ATACACGCGG GAATTCTGGC CAGACGGGAT
TTGTCCGAGC ATGTGTCCGC GCTGGAAAAG GCGGAAATTC CTGCCATCGA TCTGGTAGTA
GTGAATCTCT ATCCTTTCAG CCAGACAGTG GCGCAGCCGG ATTGCAGTCT GGAAGAGGCC
ATTGAAAATA TCGATATCGG CGGCCCGACC ATGGTGCGCG CTGCGGCCAA GAACTACCAG
AGCGTGGCCG TAGTCACCGA TCCGGCGGAT TACCCCGCGT TACTTGACGA GATGAAGACT
GCCGGCGGCG AGGTTACGCC TGAGTTCCGG TTCCGGCTGG CGTGCAAGGC GTTTTCGCAT
ACAGCCGCCT ATGATGGGGC GATCAGCAAC TACCTCACCT CCATCGAGGG GGAAAATGCA
CAACGTCGTA CCTTTCCGGA ACGCCTGAAT CTCAATTTCA GCCTGGTGCA GCCTCTGCGC
TACGGGGAAA ACCCGCATCA GCAGGCCGCC TTTTACCGCG ATCCGCAACT CGTTCCCGGC
AGCCTTGCAA GCTACAGGCA GTTGCAGGGC AAGGAGCTCT CTTACAACAA CATCGCGGAC
GCGGATGCCG CCTGGGAATG CGTAAAGACG TTTGATTCCC CGGCCTGTGT CATCATCAAG
CATGCCAATC CTTGCGGCGT GGCAATCAGC GATTCGCCGC TCGCGGCTTA CAAGCTTGCA
TTTGCCACCG ATCCCACTTC CGCATTCGGC GGCATTATCG CTTTCAATCG CACGCTGGAC
GCCTCGGCCG CCGAGGCGGT GATGAGCCAG TTTGTCGAAG TGATCATCGC GCCGCAAATG
ACCGACGAGG CAAGGCAGAT GCTCGCGCGC AAAGCCAATG TGCGCGTGCT GACCGTGCCG
CTCCAGGCGG GGAACAACGC CCACGACTTC AAGCGAGTGG GTGGAGGATT GCTGGTGCAG
ACTCCGGACA ATCTCAACGT TACCCCTAAC CAATTGAAGG TGGTGACCGA AGTCCAGCCA
ACAGCGCAGC AATTGCAGGA CCTGCTGTTT GCCTGGCGGG TGGCGAAATT CGTCAAATCG
AATGCCATCG TCTTTTGTGC CAACGGCCGC ACGCTTGGCG TGGGCGCCGG ACAGATGAGC
AGGGTGGATA GCGCGCGCAT TGCCTCCATC AAAGCCGGGA ACGCGAACCT CACCCTGGCA
GGCTCGGTAG TGGCATCCGA TGCCTTCTTC CCTTTCCGCG ACGGACTGGA TGTCGTCGTC
CAGGCGGGGG CGGTGGCGGT CATCCAGCCG GGGGGCAGTG TGCGGGATGA AGAGGTTATC
GCTGCGGCGG ATGAACAAGG GGTGGCAATG GTATTTACCG GCGTGCGCCA TTTCAGGCAT
TGA
 
Protein sequence
MSIKQALISV SDKSGIVEFA QALHKLGVTI LSTGGTARLL RDAGVAVTEV GSYTGFPEML 
DGRVKTLHPK IHAGILARRD LSEHVSALEK AEIPAIDLVV VNLYPFSQTV AQPDCSLEEA
IENIDIGGPT MVRAAAKNYQ SVAVVTDPAD YPALLDEMKT AGGEVTPEFR FRLACKAFSH
TAAYDGAISN YLTSIEGENA QRRTFPERLN LNFSLVQPLR YGENPHQQAA FYRDPQLVPG
SLASYRQLQG KELSYNNIAD ADAAWECVKT FDSPACVIIK HANPCGVAIS DSPLAAYKLA
FATDPTSAFG GIIAFNRTLD ASAAEAVMSQ FVEVIIAPQM TDEARQMLAR KANVRVLTVP
LQAGNNAHDF KRVGGGLLVQ TPDNLNVTPN QLKVVTEVQP TAQQLQDLLF AWRVAKFVKS
NAIVFCANGR TLGVGAGQMS RVDSARIASI KAGNANLTLA GSVVASDAFF PFRDGLDVVV
QAGAVAVIQP GGSVRDEEVI AAADEQGVAM VFTGVRHFRH
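
The gene and protein sequences above are related by translation table 11, and the GC content field is a simple base count. The following is a minimal sketch of both calculations, not the method used to produce this record. It assumes the sequence is pasted in the 10-base blocks shown above, and the helper names (`clean`, `translate`, `gc_fraction`) are hypothetical. Table 11 assigns the same amino acids to codons as the standard code and differs only in its permitted start codons, so the sketch does not handle alternative start codons such as GTG or TTG.

```python
# Codon table in NCBI order (bases T, C, A, G); the amino acid
# assignments are the same for translation tables 1 and 11.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def clean(seq):
    """Remove spaces and line breaks and convert to upper case."""
    return "".join(seq.split()).upper()


def translate(seq):
    """Translate a DNA sequence codon by codon, stopping at the first stop codon."""
    dna = clean(seq)
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[pos:pos + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def gc_fraction(seq):
    """Return the fraction of bases that are G or C."""
    dna = clean(seq)
    return (dna.count("G") + dna.count("C")) / len(dna)


# First line of the gene sequence above.
first_line = "ATGTCGATTA AACAGGCCCT GATCAGTGTT TCAGATAAAA GCGGTATTGT CGAGTTTGCT"
print(translate(first_line))  # MSIKQALISVSDKSGIVEFA
```

The printed peptide matches the first 20 residues of the protein sequence listed above. Passing the full gene sequence to `gc_fraction` gives a value that can be compared with the GC content field.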