Gene Caul_3652 details

Gene Information

Locus tag: Caul_3652
Symbol:
ID: 5901107
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Caulobacter sp. K31
Kingdom: Bacteria
Replicon accession: NC_010338
Strand:
Start bp: 3940558
End bp: 3941841
Gene Length: 1284 bp
Protein Length: 427 aa
Translation table: 11
GC content: 67%
IMG OID: 641564163
Product: homogentisate 1,2-dioxygenase
Protein accession: YP_001685277
Protein GI: 167647614
COG category: [Q] Secondary metabolites biosynthesis, transport and catabolism
COG ID: [COG3508] Homogentisate 1,2-dioxygenase
TIGRFAM ID: [TIGR01015] homogentisate 1,2-dioxygenase
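
The length fields above agree with the coordinates if Start bp and End bp are read as 1-based, inclusive positions on the replicon. That reading is an assumption here, not something the record states. A minimal Python sketch of the arithmetic follows; the function names are illustrative and not part of any database API.

```python
def gene_length(start_bp: int, end_bp: int) -> int:
    """Length in bp of a feature, assuming 1-based inclusive coordinates."""
    return end_bp - start_bp + 1


def protein_length(gene_len_bp: int) -> int:
    """Residue count for an unspliced CDS whose last codon is a stop codon."""
    if gene_len_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return gene_len_bp // 3 - 1  # drop the terminal stop codon


length = gene_length(3940558, 3941841)
print(length)                  # 1284
print(protein_length(length))  # 427
```

Both values match the Gene Length and Protein Length fields, which is consistent with the gene being unspliced and ending in a stop codon.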


Plasmid Coverage information

Num covering plasmid clones: 32
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 11
Fosmid unclonability p-value: 0.207248
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGGACCTCC AGTACCAATC CGGCTTCGCC AACCACTTCA GCACCGAGGC CGTCCCCGGC 
GCCCTGCCGG TGGGCCAAAA CTCGCCGCAG GCGCCGCCCT ACGGCCTCTA TGCCGAGCAA
CTGTCGGGCA CGGCCTTCAC CGCGCCGCGC CACGAGAACC GCAGAAGTTG GCTGTACCGC
CTGCGACCCA GCGCCGGCCA TGGACCCTAT GCGCCCTATG TCCAGGAGCG CCTGAAGAGC
GGCCCGTTCG GCGCCGCCGT CCCGACGCCC AATCGCCTGC GTTGGGATCC ACTGGAGATC
CCCGAAGCGC CGCTCGACTT CGTCGATGGT CTCGTCACCC TGGCCGGCAA CGGCGACGTG
GCGACCCAGG CCGGCATGGC CGCGCATCTG TATCTCGCCA ACCGCTCGAT GATCGACCGG
GTGTTCCAGA ACGCCGACGG CGAGCTGTTG ATCGTGCCCC AGCTGGGCGC CCTGCGTTTC
GTCACCGAGT TGGGCGTGAT CGACGCCGCT CCAGGCGAGG TCGTGGTCAT TCCGCGCGGC
GTGCGGTTCC GCGTCGAGCT TGAGGGGCCG GTTCGCGGCT ATGTCTGCGA GAACTATGGC
CCCATGTTCC GCCTGCCCGA ACTGGGACCG ATCGGCTCGA ACGGCCTGGC CAACAGCCGC
GATTTCCTGA CCCCCGTCGC CGCCTTCGAG GATGTTGAGC GCCCGACCGA GGTGATCCAG
AAGTTCCAGG GCGGCCTGTG GACGGGAACC TGGGACCACA GCCCGCTGGA CGTCGTCGCC
TGGCACGGCA ATCTGGCGCC CTACAAGTAC GACCTGGCGC GGTTCAATAC GATGGGCACG
GTCAGCTTCG ACCATCCCGA TCCGTCGATC TTCACGGTCC TCACCGCGCC CAGCGAGATC
CCGGGCACGG CCAATGTCGA TTTCGTGATC TTCCCGCCGC GCTGGATGGT GGCCGAGCAC
ACCTTCCGGC CGCCCTGGTT CCACCGCAAC GTGATGAGCG AGTTCATGGG GCTGGTCACC
GGCGCCTACG ACGCCAAGGC TGGCGGCTTC AGTCCGGGCG GGGCTTCCCT GCACAACATG
ATGAGCGACC ACGGTCCGGA CGTGGCCAGC CACAAGGCCG CCAGCGAGGC CGATTTGAGT
CCGCACAAGA TCGAGGCGAC CATGGCTTTC ATGTTCGAGA GCCGCTGGGT GATCCGTCCC
ACGAAATACG CTCTGGAGAC TTCTGAACTT CAGGCCGACT ATGACGCGTG CTGGACGGGC
TTTCCCAAGG CCAAGCTGCC TTAG
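
The GC content reported in the Gene Information section can in principle be recomputed from the gene sequence above. The sketch below assumes the figure is the plain percentage of G and C bases over the whole CDS. The record does not say how the value was computed or rounded, and `gc_content` is an illustrative name.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C bases, ignoring whitespace and letter case."""
    bases = "".join(seq.split()).upper()
    if not bases:
        raise ValueError("empty sequence")
    gc = sum(1 for base in bases if base in "GC")
    return 100.0 * gc / len(bases)


# Toy input in the same 10-base block layout as the record.
print(gc_content("ATGC GGCC"))  # 75.0
```

The function accepts the sequence exactly as printed, since spaces and line breaks are removed before counting.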
 
Protein sequence
MDLQYQSGFA NHFSTEAVPG ALPVGQNSPQ APPYGLYAEQ LSGTAFTAPR HENRRSWLYR 
LRPSAGHGPY APYVQERLKS GPFGAAVPTP NRLRWDPLEI PEAPLDFVDG LVTLAGNGDV
ATQAGMAAHL YLANRSMIDR VFQNADGELL IVPQLGALRF VTELGVIDAA PGEVVVIPRG
VRFRVELEGP VRGYVCENYG PMFRLPELGP IGSNGLANSR DFLTPVAAFE DVERPTEVIQ
KFQGGLWTGT WDHSPLDVVA WHGNLAPYKY DLARFNTMGT VSFDHPDPSI FTVLTAPSEI
PGTANVDFVI FPPRWMVAEH TFRPPWFHRN VMSEFMGLVT GAYDAKAGGF SPGGASLHNM
MSDHGPDVAS HKAASEADLS PHKIEATMAF MFESRWVIRP TKYALETSEL QADYDACWTG
FPKAKLP
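
The protein sequence corresponds to a codon-by-codon translation of the gene sequence. The sketch below uses the standard codon assignments in NCBI order. To my knowledge translation table 11 uses the same assignments and differs only in which codons may act as start codons, so treat that as an assumption to verify. The names `CODON_TABLE` and `translate` are illustrative.

```python
from itertools import product

BASES = "TCAG"
# Amino acids listed in NCBI codon order (T, C, A, G at each position).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq: str) -> str:
    """Translate a DNA sequence in frame 1, stopping at the first stop codon."""
    bases = "".join(seq.split()).upper()
    protein = []
    for i in range(0, len(bases) - 2, 3):
        aa = CODON_TABLE[bases[i:i + 3]]
        if aa == "*":  # stop codon ends the protein
            break
        protein.append(aa)
    return "".join(protein)


# First 30 bases and last 24 bases of the gene sequence above.
print(translate("ATGGACCTCC AGTACCAATC CGGCTTCGCC"))  # MDLQYQSGFA
print(translate("TTTCCCAAGG CCAAGCTGCC TTAG"))        # FPKAKLP
```

Both fragments reproduce the first and last residues of the protein sequence, and the final TAG codon is read as the stop.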