Gene EcSMS35_4455 details


Gene Information

Locus tag: EcSMS35_4455
Symbol: purH
ID: 6143023
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Escherichia coli SMS-3-5
Kingdom: Bacteria
Replicon accession: NC_010498
Strand:
Start bp: 4550447
End bp: 4552036
Gene Length: 1590 bp
Protein Length: 529 aa
Translation table: 11
GC content: 56%
IMG OID: 641619275
Product: bifunctional phosphoribosylaminoimidazolecarboxamide formyltransferase/IMP cyclohydrolase
Protein accession: YP_001746391
Protein GI: 170680627
COG category: [F] Nucleotide transport and metabolism
COG ID: [COG0138] AICAR transformylase/IMP cyclohydrolase PurH (only IMP cyclohydrolase domain in Aful)
TIGRFAM ID: [TIGR00355] phosphoribosylaminoimidazolecarboxamide formyltransferase/IMP cyclohydrolase
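
The coordinates and lengths listed above are mutually consistent, which can be checked with a little arithmetic. The following is a minimal sketch, assuming the start and end positions are 1-based and inclusive and that the CDS ends in a stop codon; the function names are hypothetical and not part of any database tool.

```python
# Values copied from the Gene Information fields above.
start_bp = 4550447
end_bp = 4552036


def gene_length(start: int, end: int) -> int:
    """Feature length for 1-based, inclusive coordinates (an assumption)."""
    return end - start + 1


def protein_length(cds_length: int) -> int:
    """Residue count for a CDS whose final codon is a stop codon (an assumption)."""
    if cds_length % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return cds_length // 3 - 1


length = gene_length(start_bp, end_bp)
print(length)                  # 1590
print(protein_length(length))  # 529
```

Both results agree with the Gene Length (1590 bp) and Protein Length (529 aa) fields.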


Plasmid Coverage information

Num covering plasmid clones:
Plasmid unclonability p-value: 0.00025024
Plasmid hitchhiking: No
Plasmid clonability: decreased coverage
 

Fosmid Coverage information

Num covering fosmid clones: 28
Fosmid unclonability p-value: 0.000830583
Fosmid Hitchhiker: Yes
Fosmid clonability: hitchhiker
 

Sequence

Gene sequence
ATGCAACAAC GTCGTCCAGT CCGCCGCGCT CTGCTCAGTG TTTCTGACAA AGCCGGTATC 
GTCGAATTCG CCCAGGCACT TTCCGCACGC GGTGTGGAGC TGCTGTCTAC AGGGGGCACT
GCCCGTCTGT TAGCAGAAAA AGGTCTGCCG GTAACCGAAG TTTCCGATTA CACCGGTTTC
CCGGAGATGA TGGATGGACG CGTGAAGACC CTGCATCCGA AAGTACATGG TGGCATTCTG
GGCCGTCGCG GCCAGGACGA TACCATTATG GAAGAACATC AGATCCAGCC TATCGATATG
GTGGTTGTTA ACCTGTATCC GTTCGCCCAG ACCGTTGCTC GTGAAGGCTG CTCGCTGGAA
GATGCGGTTG AGAACATCGA TATCGGCGGC CCGACGATGG TGCGCTCTGC CGCCAAGAAC
CATAAAGATG TCGCCATCGT GGTGAAGAGC AGCGACTACG ACGCCATTAT TAAAGAGATA
GATGCCAATG AAGGCTCCCT GACTCTGGAA ACTCGCTTTG ACCTTGCCAT CAAAGCCTTC
GAACACACCG CCGCCTACGA CAGCATGATT GCCAACTACT TCGGCAGCAT GGTTCCGGCT
TACCACGGTG AAAGCAAAGA AGCCGCCGGT CGCTTCCCAC GCACGCTGAA CCTGAACTTC
ATTAAGAAGC AGGATATGCG TTACGGCGAG AACAGCCACC AGCAGGCTGC CTTCTATATA
GAAGAGAATG TCAAAGAAGC CTCCGTTGCT ACCGCAACCC AGGTTCAGGG TAAAGCCCTC
TCTTATAACA ACATCGCCGA TACCGATGCG GCGCTGGAGT GCGTGAAAGA GTTCGCCGAG
CCGGCATGTG TGATTGTGAA GCACGCCAAC CCTTGCGGCG TGGCTATCGG CAATTCCATT
CTTGATGCTT ACGATCGCGC GTACAAAACC GACCCGACCT CCGCATTCGG CGGCATTATC
GCCTTTAACC GCGAGCTGGA TGCGGAAACC GCGCAGGCCA TCATTTCTCG TCAGTTTGTT
GAAGTGATTA TTGCGCCGTC CGCCAGCGAA GAAGCCCTGA AAATCACCGC CGCCAAACAG
AACGTACGCG TCCTGACCTG CGGTCAGTGG GGCGAGCGTG TTCCGGGTCT TGATTTCAAA
CGCGTGAACG GCGGTCTGCT GGTTCAGGAT CGAGACCTGG GGATGGTCGG TGCAGAAGAA
CTGCGCGTCG TCACCAAACG TCAGCCGAGC GAACAGGAAC TGCGTGATGC GCTGTTCTGC
TGGAAAGTGG CGAAGTTCGT GAAATCCAAT GCTATCGTCT ATGCCAAAAA CAATATGACC
ATCGGTATTG GCGCAGGCCA GATGAGCCGT GTGTACTCCG CGAAAATCGC CGGTATTAAA
GCGGCCGATG AAGGCCTGGA AGTGAAAGGT TCCTCGATGG CTTCTGACGC ATTCTTCCCG
TTCCGCGACG GTATTGATGC CGCCGCCGCT GCGGGTGTGA CCTGTGTAAT CCAGCCTGGC
GGTTCTATCC GTGATGACGA AGTAATTGCC GCCGCCGACG AGCACGGTAT TGCAATGCTC
TTCACCGACA TGCGCCACTT CCGCCATTAA
 
Protein sequence
MQQRRPVRRA LLSVSDKAGI VEFAQALSAR GVELLSTGGT ARLLAEKGLP VTEVSDYTGF 
PEMMDGRVKT LHPKVHGGIL GRRGQDDTIM EEHQIQPIDM VVVNLYPFAQ TVAREGCSLE
DAVENIDIGG PTMVRSAAKN HKDVAIVVKS SDYDAIIKEI DANEGSLTLE TRFDLAIKAF
EHTAAYDSMI ANYFGSMVPA YHGESKEAAG RFPRTLNLNF IKKQDMRYGE NSHQQAAFYI
EENVKEASVA TATQVQGKAL SYNNIADTDA ALECVKEFAE PACVIVKHAN PCGVAIGNSI
LDAYDRAYKT DPTSAFGGII AFNRELDAET AQAIISRQFV EVIIAPSASE EALKITAAKQ
NVRVLTCGQW GERVPGLDFK RVNGGLLVQD RDLGMVGAEE LRVVTKRQPS EQELRDALFC
WKVAKFVKSN AIVYAKNNMT IGIGAGQMSR VYSAKIAGIK AADEGLEVKG SSMASDAFFP
FRDGIDAAAA AGVTCVIQPG GSIRDDEVIA AADEHGIAML FTDMRHFRH
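
The gene and protein sequences above can be cross-checked by translating the nucleotide sequence. The following is a minimal sketch using only the Python standard library. It assumes the codon-to-amino-acid assignments of the standard genetic code, which translation table 11 shares (table 11 differs in its permitted start codons). The helper names `clean`, `gc_content`, and `translate` are hypothetical, and the sequences are expected in the 10-base, space-separated layout shown above.

```python
# Build the codon table from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def clean(seq: str) -> str:
    """Remove the spaces and line breaks used for display and upper-case the result."""
    return "".join(seq.split()).upper()


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    seq = clean(seq)
    return (seq.count("G") + seq.count("C")) / len(seq)


def translate(seq: str) -> str:
    """Translate codon by codon from position 0, stopping at the first stop codon."""
    seq = clean(seq)
    protein = []
    for pos in range(0, len(seq) - 2, 3):
        amino_acid = CODON_TABLE[seq[pos:pos + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# First and last lines of the gene sequence, copied from the record above.
first_line = "ATGCAACAAC GTCGTCCAGT CCGCCGCGCT CTGCTCAGTG TTTCTGACAA AGCCGGTATC"
last_line = "TTCACCGACA TGCGCCACTT CCGCCATTAA"

print(translate(first_line))  # MQQRRPVRRALLSVSDKAGI
print(translate(last_line))   # FTDMRHFRH
```

The two printed fragments match the start and end of the protein sequence listed above. Passing the full gene sequence to `gc_content` gives a value that can be compared with the GC content field.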