Gene Sare_2031 details

Gene Information

Locus tag: Sare_2031
Symbol:
ID: 5705685
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Salinispora arenicola CNS-205
Kingdom: Bacteria
Replicon accession: NC_009953
Strand:
Start bp: 2324552
End bp: 2325793
Gene Length: 1242 bp
Protein Length: 413 aa
Translation table: 11
GC content: 69%
IMG OID: 641271521
Product: tryptophan halogenase
Protein accession: YP_001536892
Protein GI: 159037639
COG category: [C] Energy production and conversion
COG ID: [COG0644] Dehydrogenases (flavoproteins)
TIGRFAM ID:
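
The start, end, gene length, and protein length fields above are mutually consistent, which can be checked with a little arithmetic. The sketch below assumes the coordinates are 1-based and inclusive and that the last codon of the CDS is a stop codon; neither is stated in the record, but both agree with the listed numbers. The function names are illustrative only.

```python
# Values copied from the record above.
START_BP = 2324552
END_BP = 2325793


def gene_length(start_bp: int, end_bp: int) -> int:
    """Length in bp, assuming 1-based inclusive coordinates (an assumption)."""
    return end_bp - start_bp + 1


def protein_length(gene_len_bp: int) -> int:
    """Length in aa, assuming the final codon is a stop codon (an assumption)."""
    codons, remainder = divmod(gene_len_bp, 3)
    if remainder:
        raise ValueError("gene length is not a multiple of 3")
    return codons - 1


length_bp = gene_length(START_BP, END_BP)
print(length_bp)                  # 1242, matching "Gene Length"
print(protein_length(length_bp))  # 413, matching "Protein Length"
```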


Plasmid Coverage information

Num covering plasmid clones: 12
Plasmid unclonability p-value: 0.0818162
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones:
Fosmid unclonability p-value: 0.0655371
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGAGACAAC CCGTTGCCGA GTACGACGTC ATCGTCGTCG GTGGCGGCCC CGCAGGGGCG 
TGCACCGCCG GTCTTCTCGC GCTGGAGGGG CACCGGATCC TGTTGCTGGA GCGCGAGAAG
TTTCCCCGCT ACCACGTCGG CGAATCGCTG ATCACCGGCG TCGTCCCGAC GCTTACCGCG
TTGGGACTGC TGGACCGCAT GGCCGAGCTG CGGTTCCAGG TCAAGTACGG CGGCAGTCTC
CTGTGGGGCG AGAACCAGAC CGAACCGTGG TCGTTCCGCT TCCGGGAGAT CCGTGGCGGC
CGCTACGAGT ACGCCTGGCA GGTCCGGCGC GCCGAGTTCG ACGCGATGTT GCTGGACCGG
GCGCGGGAGC TGGGCGTGCA CGTCGTCGAG GGGGCGACCG TCCGGGACGC GCTGACCGAC
GGCGACCGGC TCGCGGGCGT GCGCTATCAG CTCAAGGGCG AGTCGGGCTC CGTGCCGGCG
CGGGCGACGA TGGTGGTCGA CGCCTCGGGC CAGCATCGCT GGTTGGGTCG CCGGTTCGGG
CTGGTCGACT GGTACGACGA CCTGCGCAAC GTGGCCGTGT GGAGCTACTG GCAGGGTGCC
CTGCGCTACC CGGGTGAGCA CGAGGGTGAC CTGCTGACCG AGAGCTGCCG GCAGGGCTGG
CTCTGGTACG CGCCGCTGAG CCCGGAGCTG ACGGGCATCG GCTACGTCAC GACCAGCGAT
CGGCTGGTGG CCTCTGGGTT AACGCCGGAG CAGTTGCTGG AAAGACACAT TGCGGAATCG
TCCGAGGTCT CCTGGCTCAC CGCGGGCGCG AAGCGGGTGG ACATCTATCG CGCCGCGCGC
GACTGGTCGT ACACCTGCCA GCAGTTCTCC GGCCCGGGCT GGGTCCTGGT CGGCGACGCG
GCCGCATTCA TCGACCCGCT GCTCTCCGCC GGAGTGACCC TGGCCATGCG CGCGGCGAGC
AGCGTGGCGA AGGCGGTCCA CGAGACGCTG ACCGCGCCGG ACAAGGAACG GCACGTCATG
AAGGACTACG AGGACCGGTA CCGGGACTTC CTCGGCTCAC TGCTGGAGTT GGTCCGGTTC
TTTTACGACG GCGCGCACGG CAAGGAGGAG CTGCACCTGC GAGCCCAGGC CATCGTGGAT
CCCGACCGCA GCCTGCCGCC CAAGCTCTCA TTCGTGTCGC TGCTCTCCGG GCTGGTCCGC
GGGGACGAAA GCCTCGGCGC GGACGCGGTC GACGAATATT GA
 
Protein sequence
MRQPVAEYDV IVVGGGPAGA CTAGLLALEG HRILLLEREK FPRYHVGESL ITGVVPTLTA 
LGLLDRMAEL RFQVKYGGSL LWGENQTEPW SFRFREIRGG RYEYAWQVRR AEFDAMLLDR
ARELGVHVVE GATVRDALTD GDRLAGVRYQ LKGESGSVPA RATMVVDASG QHRWLGRRFG
LVDWYDDLRN VAVWSYWQGA LRYPGEHEGD LLTESCRQGW LWYAPLSPEL TGIGYVTTSD
RLVASGLTPE QLLERHIAES SEVSWLTAGA KRVDIYRAAR DWSYTCQQFS GPGWVLVGDA
AAFIDPLLSA GVTLAMRAAS SVAKAVHETL TAPDKERHVM KDYEDRYRDF LGSLLELVRF
FYDGAHGKEE LHLRAQAIVD PDRSLPPKLS FVSLLSGLVR GDESLGADAV DEY
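
The sequences above are printed in space-separated blocks of ten, so they need cleaning before any computation. The following is a minimal sketch, not the database's own tooling: it strips the whitespace, computes a GC fraction, and translates a fragment using the amino-acid assignments of translation table 11 (which are the same as those of the standard code). The helper names and the choice to test only the first and last lines of the gene sequence are assumptions made for illustration.

```python
from itertools import product

# Amino-acid assignments for translation table 11, codons ordered over T, C, A, G
# with the third base varying fastest.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# First and last lines of the gene sequence, copied from the record.
FIRST_LINE = "ATGAGACAAC CCGTTGCCGA GTACGACGTC ATCGTCGTCG GTGGCGGCCC CGCAGGGGCG"
LAST_LINE = "GGGGACGAAA GCCTCGGCGC GGACGCGGTC GACGAATATT GA"


def clean(seq: str) -> str:
    """Remove the block spacing and line breaks; upper-case the result."""
    return "".join(seq.split()).upper()


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the cleaned sequence."""
    seq = clean(seq)
    return (seq.count("G") + seq.count("C")) / len(seq)


def translate(seq: str) -> str:
    """Translate complete codons in frame 0, stopping at the first stop codon."""
    seq = clean(seq)
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


print(translate(FIRST_LINE))  # MRQPVAEYDVIVVGGGPAGA
print(translate(LAST_LINE))   # GDESLGADAVDEY
print(round(gc_fraction(FIRST_LINE), 3))  # GC fraction of the first 60 bases only
```

Both translated fragments agree with the start and end of the protein sequence listed in the record. Passing the full gene sequence to `gc_fraction` should give a value comparable to the GC content field.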