Gene Franean1_3787 details


Gene Information

Locus tag: Franean1_3787
Symbol:
ID: 5672151
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Frankia sp. EAN1pec
Kingdom: Bacteria
Replicon accession: NC_009921
Strand:
Start bp: 4490401
End bp: 4491624
Gene Length: 1224 bp
Protein Length: 407 aa
Translation table: 11
GC content: 72%
IMG OID: 641242666
Product: homogentisate 1,2-dioxygenase
Protein accession: YP_001508086
Protein GI: 158315578
COG category: [Q] Secondary metabolites biosynthesis, transport and catabolism
COG ID: [COG3508] Homogentisate 1,2-dioxygenase
TIGRFAM ID: [TIGR01015] homogentisate 1,2-dioxygenase
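
The coordinate and length fields in this record agree with one another, which a short arithmetic check can confirm. The sketch below is illustrative only: the variable names are hypothetical, and it assumes that the coordinates are 1-based and inclusive and that the final codon is a stop codon not counted in the protein length.

```python
# Values copied from the Gene Information fields of this record.
start_bp = 4490401
end_bp = 4491624

# Assumption: coordinates are 1-based and inclusive, so both ends count.
gene_length = end_bp - start_bp + 1

# Assumption: the last codon is a stop codon and contributes no amino acid.
protein_length = gene_length // 3 - 1

print(gene_length, "bp")     # 1224 bp
print(protein_length, "aa")  # 407 aa
```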


Plasmid Coverage information

Num covering plasmid clones: 15
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 14
Fosmid unclonability p-value:
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGCCGTACT ACGCCCGGAC CGGGGACGTC CCGCCCAAAC GGCACACGCA GCACCGCGAC 
GGCGACGGTG GCCTGTACAG CGAGGAGCTG ATGGGATCGG AGGGCTTCTC CGCGGACTCG
TCACTGCTGT ACCACCGGGA GATCCCGGCG GCGATCAGCG CGGCCCGGCC GTGGGAGCTG
CCCGCCCTCG GCACCGTCGC CAACCACCCG CTGCGGCCGC GGCACCTGCG CACCCACACC
CTCACCGCGG CCGAGGGATG GCGGAGCCTC GACGTCGTCA CCGGGCGGCG GCTGCTCCTG
GGCAACACCG ATGTCCGGCT CTGCTACGTC GCCGCCGGCG CGCCTTCACC GCTCTACCGC
AACGGCACCG GTGACGAGTG CGTCTACGTC GAGGCCGGCA CCGCCCGCGT CGACACCGTG
TTCGGCACAC TGCCGGCCGG CCCCGGCGAC TACGTCGTCG TCCCGTGCGG CACGACGCAT
CGGTGGACGC CGAGCGGCGA CGAGCCGCTG CGCGCCTACA TCATCGAGGC GAACAGCCAC
ATCCGCCCGC CGAAGCGCTA CCTGTCAGCC TCGGGCCAGT TCCTCGAGCA CGCCCCCTAC
TGCGAACGCG ACCTGCGTCG CCCGGCCGGG CCGCTGCTCG AGGAGGGAAC GGACGTCGAG
GTGTACGTGA AGCACCGCGG GCCCGGGGCC GGGACCGGCG GCGTCGCCGG CACGGTGCTG
ACCTACCGGA CCCACCCGTT CGATGTCGTC GGCTGGGACG GCTGCCTCTA CCCGTACCTG
TTCAACATCG GCGACTTCGA GCCGATCACC GGCCGGCTGC ACCAGCCACC GCCGGTGCAC
CAGGTCTTCG AGGGCCGTAA CTTCGTCGTC TGCAACTTCG TGCCCCGCAA GGTCGACTAC
CACCCCGACG CGATCCCGAC GCCGTACTAC CACGCCAACA TCGACTCCGA CGAAGTGATC
TTCTACACCG GCGGTGCCTA CGGAGCTCGC CGCGGCTCCG GGATCGGGCC CGGCTCCGTC
TCGCTGCATC CCGCCGGGCA CACCCACGGC CCGCAGCCCG GGGCGGTGCA GGCGAGCATC
GGGGTGGAGT TCGTCGACGA GCTCGCGGTC ATGGTCGACA CCTTCGCCCC GCTGGAACTC
GGCGAGGGCG GGCTTGCCTG CGAGGACCCC GACTACGCCT GGACGTGGGC GGCGCACGCG
GCGGCGCGGG GCGGCGCAGG GTGA
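
The GC content field can be recomputed from the gene sequence. The following is a minimal sketch, not the database's own method: `gc_percent` is a hypothetical helper that strips the spaces and line breaks used in the listing and counts G and C bases. Applied to the full gene sequence, its result can be compared with the reported 72%.

```python
def gc_percent(seq: str) -> float:
    """Return the percentage of G and C bases, ignoring whitespace."""
    bases = "".join(seq.split()).upper()
    if not bases:
        raise ValueError("empty sequence")
    gc = sum(1 for base in bases if base in "GC")
    return 100.0 * gc / len(bases)


# First row of the gene sequence above, used only as sample input.
first_row = "ATGCCGTACT ACGCCCGGAC CGGGGACGTC CCGCCCAAAC GGCACACGCA GCACCGCGAC"
print(f"{gc_percent(first_row):.1f}% GC in the first 60 bases")
```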
 
Protein sequence
MPYYARTGDV PPKRHTQHRD GDGGLYSEEL MGSEGFSADS SLLYHREIPA AISAARPWEL 
PALGTVANHP LRPRHLRTHT LTAAEGWRSL DVVTGRRLLL GNTDVRLCYV AAGAPSPLYR
NGTGDECVYV EAGTARVDTV FGTLPAGPGD YVVVPCGTTH RWTPSGDEPL RAYIIEANSH
IRPPKRYLSA SGQFLEHAPY CERDLRRPAG PLLEEGTDVE VYVKHRGPGA GTGGVAGTVL
TYRTHPFDVV GWDGCLYPYL FNIGDFEPIT GRLHQPPPVH QVFEGRNFVV CNFVPRKVDY
HPDAIPTPYY HANIDSDEVI FYTGGAYGAR RGSGIGPGSV SLHPAGHTHG PQPGAVQASI
GVEFVDELAV MVDTFAPLEL GEGGLACEDP DYAWTWAAHA AARGGAG
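
The protein sequence corresponds to a translation of the gene sequence under translation table 11. The sketch below shows one way to check this for short stretches. The `translate` helper is hypothetical, and the codon table is written out in the usual TCAG ordering; for elongation, table 11 assigns the same amino acids as the standard code. The first and last rows of the gene sequence are used as sample input.

```python
# Codon table in TCAG order; '*' marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(seq: str) -> str:
    """Translate a DNA string codon by codon, stopping at the first stop codon."""
    dna = "".join(seq.split()).upper()
    protein = []
    # Any trailing partial codon is ignored.
    for pos in range(0, len(dna) - len(dna) % 3, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# Start of the gene: matches the first residues of the protein sequence.
print(translate("ATGCCGTACT ACGCCCGG"))          # MPYYAR
# End of the gene, including the TGA stop codon.
print(translate("GCGGCGCGGG GCGGCGCAGG GTGA"))   # AARGGAG
```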