Gene Sare_4197 details


Gene Information

Locus tag: Sare_4197
Symbol: purH
ID: 5704197
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Salinispora arenicola CNS-205
Kingdom: Bacteria
Replicon accession: NC_009953
Strand:
Start bp: 4764574
End bp: 4766142
Gene Length: 1569 bp
Protein Length: 522 aa
Translation table: 11
GC content: 71%
IMG OID: 641273616
Product: bifunctional phosphoribosylaminoimidazolecarboxamide formyltransferase/IMP cyclohydrolase
Protein accession: YP_001538969
Protein GI: 159039716
COG category: [F] Nucleotide transport and metabolism
COG ID: [COG0138] AICAR transformylase/IMP cyclohydrolase PurH (only IMP cyclohydrolase domain in Aful)
TIGRFAM ID: [TIGR00355] phosphoribosylaminoimidazolecarboxamide formyltransferase/IMP cyclohydrolase
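
The coordinate and length fields above can be cross-checked against each other. The following is a minimal sketch in Python, assuming the start and end positions are 1-based and inclusive and that the CDS ends in a single untranslated stop codon. The variable names are illustrative and not part of the record.

```python
# Values copied from the Gene Information fields of this record.
start_bp = 4764574
end_bp = 4766142
gene_length_bp = 1569
protein_length_aa = 522

# Assumption: coordinates are 1-based and inclusive, so the span is end - start + 1.
span = end_bp - start_bp + 1

# Assumption: the CDS ends with one stop codon that is not part of the protein.
codons = span // 3
expected_protein_length = codons - 1

print(span == gene_length_bp)                        # coordinates agree with the gene length
print(span % 3 == 0)                                 # the CDS is a whole number of codons
print(expected_protein_length == protein_length_aa)  # protein length agrees
```

Under these assumptions all three checks hold: 4766142 - 4764574 + 1 = 1569 bp, which is 523 codons, or 522 residues once the stop codon is excluded.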


Plasmid Coverage information

Num covering plasmid clones: 11
Plasmid unclonability p-value: 0.114932
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones:
Fosmid unclonability p-value: 0.0095682
Fosmid Hitchhiker: Yes
Fosmid clonability: hitchhiker
 

Sequence

Gene sequence
GTGAGTTCCA CTCAGGACGA GCGCCGCCCG ATCCGGCGGG CGCTGGTCAG CGTCTACGAC 
AAGGCCGGTC TGGCCGAGCT GGCCCGGGCG TTGCACGACG CCGGCGTGGA GATCGTCTCG
ACCGGAAGCA CCGCGTCGGC CATCGCCGGT GCCGGGGTTC CGGTGACCGC GGTGGACTCG
GTGACCGGTT TTCCCGAGAT CCTCGACGGC CGGGTCAAGA CGCTGCACCC GAAGATCCAC
GGTGGTCTCC TCGCCGACCT GCGCAAGGAG TCACACGTCG GGCAGCTCAC CGAGCACGGC
ATCGGCACGA TTGACCTGCT GGTGTCCAAT CTGTACCCGT TCCAGGAGAC CGTTGCGTCC
GGTGCGGGGC AGGACGAGTG CGTCGAGCAG ATCGACATCG GCGGGCCGGC GATGGTGCGG
GCCGCCGCCA AGAACCACGC CTCGGTCGCC GTGGTGACCG ACCCGTCCGG GTACCCGCAG
CTGCTGGCGG CGGTACGGGC GGGCGGGTTC ACCCTCGCGC AGCGTCGGGC GCTCGCGGCC
CGCGCGTTTG CGGTGATCGC TGACTACGAC GTGGCCGTCG CCGAGTGGTG CGCGCGGGAG
CTGGTCGAGG ACGCGCCGTG GCCGAGTTTC GCCGGGCTGG CGCTACGCCG CGAGGCCGTG
TTGCGGTACG GCGAGAACCC GCACCAGGCG GCCGCTCTCT ACACCGATGC GTCCAGCCCG
GCCGGGCTGG CCCAGGCCGA GCAGTTGCAC GGCAAGGAGA TGTCGTACAA CAACTACCTG
GACGCCGATG CCGCCTGGCG GGCCGCCAAC GACTTTTCCG ACCAGCCGGC AGTGGCGATC
ATCAAGCACG CCAACCCGTG TGGCATAGCG GTGGGCGTAG ACGTTGCCGA CGCGCACCGC
AAGGCGCACG CCTGCGACCC GGTGTCGGCG TTCGGTGGCG TGATCGCCGT GAACCGACCG
GTTGGCGTCG AGCTCGCCCG GCAGGTGTCG GAGGTGTTCA CCGAGGTGGT TGTCGCGCCG
GAGTTCGAAC CCGGTGCGCT CGAGGTGCTG CAGGCCAAGA AGAACGTGCG GTTGCTGCGT
GCCCCGGCGT ATGCGCCGCC GACGGCGGAG TGGCGACCGG TCACCGGTGG CGTGCTGATG
CAGGTGCGGG ACCGGGTGGA CGCCGCGGGC GACGACCCAG CCGCCTGGCG GCTGGCGACC
GGCGAGGTAG CCGACGAGGC CACCCTGCGG GACCTGGCTT TCGCCTGGCG GGCGGTGCGG
GCGGTGAAGA GCAACGCGAT CCTGCTCGCC CGCGACGGCG CGACGGTCGG TGTGGGCATG
GGGCAGGTCA ACCGGGTCGA TTCGGCCCGG CTGGCGGTGG AGCGGGCCGG TGCCGAACGG
GCGCGTGGGG CGGTGTGCGC CTCCGACGCG TTCTTCCCGT TCGCCGACGG GCCGAAGATT
CTCTTCGACG CCGGAGTGCG GGCGATCGTC CAACCCGGTG GGTCGATCCG GGACGAGGAG
GTCATCGCCG CCGCCAAGGC GGCCGGCGTG ACCATGTACC TGACCGGTAC CCGTCACTTT
TTCCACTGA
 
Protein sequence
MSSTQDERRP IRRALVSVYD KAGLAELARA LHDAGVEIVS TGSTASAIAG AGVPVTAVDS 
VTGFPEILDG RVKTLHPKIH GGLLADLRKE SHVGQLTEHG IGTIDLLVSN LYPFQETVAS
GAGQDECVEQ IDIGGPAMVR AAAKNHASVA VVTDPSGYPQ LLAAVRAGGF TLAQRRALAA
RAFAVIADYD VAVAEWCARE LVEDAPWPSF AGLALRREAV LRYGENPHQA AALYTDASSP
AGLAQAEQLH GKEMSYNNYL DADAAWRAAN DFSDQPAVAI IKHANPCGIA VGVDVADAHR
KAHACDPVSA FGGVIAVNRP VGVELARQVS EVFTEVVVAP EFEPGALEVL QAKKNVRLLR
APAYAPPTAE WRPVTGGVLM QVRDRVDAAG DDPAAWRLAT GEVADEATLR DLAFAWRAVR
AVKSNAILLA RDGATVGVGM GQVNRVDSAR LAVERAGAER ARGAVCASDA FFPFADGPKI
LFDAGVRAIV QPGGSIRDEE VIAAAKAAGV TMYLTGTRHF FH
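
The gene sequence starts with GTG while the protein starts with M, which follows from translation table 11 allowing alternative start codons that are read as methionine at the first position. The sketch below illustrates this with a hand-built codon table. It assumes the codon assignments of table 11 match the standard code and that the start-codon set is as listed in the NCBI genetic code tables. The helper names (`clean`, `translate_cds`, `gc_fraction`) are hypothetical and not taken from any library.

```python
# Codon table built in NCBI order (bases T, C, A, G); assumed identical to the
# standard code for translation table 11.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# Assumption: start codons permitted by table 11; each is read as M at position 1.
START_CODONS = {"TTG", "CTG", "ATT", "ATC", "ATA", "ATG", "GTG"}


def clean(seq):
    """Remove the spaces and line breaks used in the record's 10-base blocks."""
    return "".join(seq.split()).upper()


def translate_cds(seq):
    """Translate a CDS codon by codon, stopping at the first stop codon."""
    seq = clean(seq)
    usable = len(seq) - len(seq) % 3
    protein = []
    for index in range(0, usable, 3):
        codon = seq[index:index + 3]
        residue = CODON_TABLE[codon]
        if residue == "*":
            break
        if index == 0 and codon in START_CODONS:
            residue = "M"  # alternative start codon read as methionine
        protein.append(residue)
    return "".join(protein)


def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    seq = clean(seq)
    return (seq.count("G") + seq.count("C")) / len(seq)


# First line of the gene sequence from this record.
first_line = "GTGAGTTCCA CTCAGGACGA GCGCCGCCCG ATCCGGCGGG CGCTGGTCAG CGTCTACGAC"
print(translate_cds(first_line))  # MSSTQDERRPIRRALVSVYD
```

The output matches the first 20 residues of the protein sequence above. Passing the full gene sequence to `gc_fraction` would give a value to compare with the reported GC content of 71%.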