Gene Sare_3900 details

Gene Information

Locus tag: Sare_3900
Symbol:
ID: 5705838
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Salinispora arenicola CNS-205
Kingdom: Bacteria
Replicon accession: NC_009953
Strand:
Start bp: 4440446
End bp: 4441501
Gene Length: 1056 bp
Protein Length: 351 aa
Translation table: 11
GC content: 70%
IMG OID: 641273325
Product: 4-hydroxy-2-ketovalerate aldolase
Protein accession: YP_001538682
Protein GI: 159039429
COG category: [E] Amino acid transport and metabolism
COG ID: [COG0119] Isopropylmalate/homocitrate/citramalate synthases
TIGRFAM ID: [TIGR03217] 4-hydroxy-2-oxovalerate aldolase
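
The coordinate and length fields above can be cross-checked against each other. The following is a minimal sketch in Python, assuming the start and end positions are 1-based and inclusive and that the protein length excludes the stop codon; the variable names are illustrative and not part of the record.

```python
# Values copied from the record above.
start_bp = 4440446
end_bp = 4441501

# Assumption: coordinates are 1-based and inclusive, so both ends count.
gene_length = end_bp - start_bp + 1

# Assumption: the final codon is a stop codon and is not counted
# in the reported protein length.
protein_length = gene_length // 3 - 1

print(gene_length)     # 1056
print(protein_length)  # 351
```

Both results agree with the Gene Length (1056 bp) and Protein Length (351 aa) fields.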


Plasmid Coverage information

Num covering plasmid clones: 21
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 12
Fosmid unclonability p-value: 0.577323
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGACCCGGC TCTACCTACA GGATGTGACG CTGCGGGACG GGATGCACGC CATCGCCCAC 
CGCTACACCG CTGACCAGGT ACGCACGATC GCCGCCGCGC TCGACGCCGC CGGGATAGCC
GCGATCGAGG TGGCGCACGG TGACGGGCTT GCCGGATCGA GTGTCAACTA CGGGCACGGC
GCGGCCAGTG ACGCGGACTG GATCGCGGCG GCGGCCGAGG TGCTGACCAC CGCGCGGCTG
ACCACGCTGC TGGTGCCCGG CATCGGCACC ATCGCCGATC TGAGGGCAGC GCGGGAACTT
GGCGTGACCA GCGTGCGGAT CGCCACCCAC TGCACCGAGG CCGACATCTC CGCCCAGCAC
ATCAGTTGGG CGCGGGAGAA CGGGATGGAC GTCTCCGGGT TTCTGATGAT GTCGCACATG
AACGACCCGG CGGGACTGGC GGCGCAGGCC AAGCTGATGG AGTCGTACGG GGCGCACTGC
GTCTACGTCA CCGACTCCGG CGGTCGGCTG CTGATGTCCG ACGTGGCCGA GCGGGTCGAC
GCGTACCGTC AGGTGCTCGA ACCAGAGACG CAGATCGGCA TTCACGCCCA CCACAACCTG
TCCCTCGGCG TGGCGAACAG CGTGATCGCC GTCGAACACG GCCGGATTCT CGGGGACGGG
CCGTTGGGCG CTCCGGCCGG CCGAACCGTC CGGGTGGACG CCTCGCTCGC CGGGCAGGGC
GCGGGCGCGG GTAATGCACC GCTCGAGGTC TTCGTCGCGG TCGCCGAGCT GCACGGCTGG
GAGCACGGCT GCGACGTGTT CGCGCTGATG GATGCGGCCG AGGATCTGGT CCGGCCGTTG
CAGGACCGAC CGGTGCGGGT TGATCGGGAG ACGCTCTCCC TGGGATACGC GGGCGTCTAC
TCCAGCTTCC TGCGGCACGC CGAGCGGGCC GCCGAACGCT ACGGCGTGGA CGTCCGCTCG
ATCCTGATCG AGCTGGGCCG GCGCCGGATG GTCGGTGGCC AGGAGGACAT GATCGTGGAC
GTGGCGCTCG ACCTGGCCGG CAGGGAGAAG ACGTGA
 
Protein sequence
MTRLYLQDVT LRDGMHAIAH RYTADQVRTI AAALDAAGIA AIEVAHGDGL AGSSVNYGHG 
AASDADWIAA AAEVLTTARL TTLLVPGIGT IADLRAAREL GVTSVRIATH CTEADISAQH
ISWARENGMD VSGFLMMSHM NDPAGLAAQA KLMESYGAHC VYVTDSGGRL LMSDVAERVD
AYRQVLEPET QIGIHAHHNL SLGVANSVIA VEHGRILGDG PLGAPAGRTV RVDASLAGQG
AGAGNAPLEV FVAVAELHGW EHGCDVFALM DAAEDLVRPL QDRPVRVDRE TLSLGYAGVY
SSFLRHAERA AERYGVDVRS ILIELGRRRM VGGQEDMIVD VALDLAGREK T
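
The relationship between the two sequences above can be illustrated with a short Python sketch that translates the gene sequence codon by codon and computes a GC fraction. The helper names `clean`, `translate`, and `gc_fraction` are hypothetical, not part of any database tool. The codon assignments used are those of the standard genetic code, which translation table 11 shares; table 11 differs mainly in its alternative start codons, which does not matter here because this gene begins with ATG.

```python
from itertools import product

# Standard codon assignments, listed with bases in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def clean(seq):
    """Remove the spaces and line breaks used to group the sequence in blocks."""
    return "".join(seq.split()).upper()


def translate(seq):
    """Translate DNA in frame from its first base, stopping at the first stop codon."""
    dna = clean(seq)
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def gc_fraction(seq):
    """Fraction of G and C bases; assumes a non-empty sequence."""
    dna = clean(seq)
    return (dna.count("G") + dna.count("C")) / len(dna)


# First line of the gene sequence, copied from the record above.
first_line = "ATGACCCGGC TCTACCTACA GGATGTGACG CTGCGGGACG GGATGCACGC CATCGCCCAC"
print(translate(first_line))  # MTRLYLQDVTLRDGMHAIAH
```

The printed peptide matches the first 20 residues of the protein sequence. Passing the full gene sequence to `translate` and `gc_fraction` would allow the same comparison against the complete protein and the reported GC content.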