Gene Acid345_2005 details


Gene Information

Locus tag: Acid345_2005
Symbol:
ID: 4070911
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Candidatus Koribacter versatilis Ellin345
Kingdom: Bacteria
Replicon accession: NC_008009
Strand:
Start bp: 2404119
End bp: 2405408
Gene Length: 1290 bp
Protein Length: 429 aa
Translation table: 11
GC content: 58%
IMG OID: 637984019
Product: homogentisate 1,2-dioxygenase
Protein accession: YP_591080
Protein GI: 94969032
COG category: [Q] Secondary metabolites biosynthesis, transport and catabolism
COG ID: [COG3508] Homogentisate 1,2-dioxygenase
TIGRFAM ID: [TIGR01015] homogentisate 1,2-dioxygenase
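
The coordinate and length fields above can be cross-checked against each other. The sketch below is illustrative only and not part of the database record. It assumes the start and end coordinates are 1-based and inclusive, and that the CDS ends in a single stop codon that is not counted in the protein length. The variable and function names are hypothetical.

```python
# Values copied from the Gene Information fields of this record.
start_bp = 2404119
end_bp = 2405408
gene_length_bp = 1290
protein_length_aa = 429

# Assumption: coordinates are 1-based and inclusive, so the span is end - start + 1.
span = end_bp - start_bp + 1

# Assumption: an unspliced CDS is a whole number of codons, the last one being a stop.
codons = span // 3


def record_is_consistent():
    """Return True if the coordinates, gene length, and protein length agree."""
    return (
        span == gene_length_bp
        and span % 3 == 0
        and codons - 1 == protein_length_aa
    )


print(span, codons, record_is_consistent())  # prints: 1290 430 True
```

Under these assumptions the 1290 bp span corresponds to 430 codons, that is, 429 residues plus one stop codon, which matches the reported protein length.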


Plasmid Coverage information

Num covering plasmid clones: 14
Plasmid unclonability p-value: 0.436695
Plasmid hitchhiking: No
Plasmid clonability: normal

Fosmid Coverage information

Num covering fosmid clones: 16
Fosmid unclonability p-value:
Fosmid Hitchhiker: No
Fosmid clonability: normal

Sequence

Gene sequence
ATGAACGATC GATACATGTC GGGTTTCGGC AACGAGTTCG CGACCGAAGC TGAACCGGGT 
GCGCTGCCAA AGGGGCAGAA CTCGCCGCAA AAGGCGCCGC TCGGGTTGTA CACCGAGCAA
TTGAGTGGAA CGCCGTTCAC AGCGCCACGA TTGGCGAACC GCCGCACGTG GGCGTATCGC
ATCCGCCCTT CGGTGATGCA CAAGCCCTAC GAGGCTGCCG AGCAGAAGCT GCTACGCAGC
ACACCGTTCG ATGAAGCTCC CACGCCGCCG AACCAAATGC GCTGGGATCC TCCGCCTATT
CCGACGGAAC CAACGGACTT TGTAGACGGC CTCATCACGA TCGCGGGCAA TGGCGATTCC
ACCATGCACA GCGGCGTCGC GATTCATCTG TACGTCGCGA ACAAGAGCAT GAACGACCGC
TTCTTCTACG ACGCCGACGG CGAATTGCTC ATCGTGCCGC AGATGGGGAG GCTCATCTTC
CACACGGAAC TCGGGCGAAT TGAAGCTGCG CCTGGAGAGA TCGTTGTTAT CCAGCGCGGT
ATCAAGTTCC GCGTTGAATT ATTGGAGAAG CAGGCGCGGG GCTACATTTG CGAGAACTAT
GGGGCAATGT TCCGCCTGCC GGACCTCGGT CCGATCGGCG CCAATGGCTT GGCGAATACG
CGCGATTTCC TGACGCCCGT TGCTTCGTAC GAAGATCGCG AAGGCAGCTT CCGCGTGACC
TCCAAGTTTT GCGGCAAGCT GTGGGAAGGG GAGTACAACC ACTCGCCTCT CAACGTCGTC
GCGTGGCACG GGAATTACGC ACCATACAAG TACGATTTGG CAAACTTCAA TTGCATTAAC
TCCGTCACGT TCGATCATCC AGATCCATCG ATTTACACCG TGCTCACGGC GCCCTCGGAG
ATTCCGGGCA CAGCAAACTG CGACTTCGTG ATCTTCCCGC CACGCTGGAT GGTCGCCGAG
CACACTTTCC GGCCGCCATG GTTCCATCGC AACTTTATGA ACGAGTTCAT GGGCCTGATC
AAGGGCGAGT ACGACGCGAA GGCGGAGGGC TTTGTCCCCG GTGGCGCCAG CCTGCACAAC
TGCATGAGTG GCCACGGGCC TGATGCAGCG ACCTACGAGA AGGCGAGCCA CGCGGAGTTG
AAGCCGCAGT ATCTCGGCGA CACGCTAGCT TTCATGTTCG AAACGCGTTT TGTTTGCCGG
CCAACGAAAT ATGCGCTCGA AACAGCACAA CTGCAGCACG AGTACTACAC GTGCTGGCAG
GACCTAAAAA AACACTTTCG AAAACCCTAG
 
Protein sequence
MNDRYMSGFG NEFATEAEPG ALPKGQNSPQ KAPLGLYTEQ LSGTPFTAPR LANRRTWAYR 
IRPSVMHKPY EAAEQKLLRS TPFDEAPTPP NQMRWDPPPI PTEPTDFVDG LITIAGNGDS
TMHSGVAIHL YVANKSMNDR FFYDADGELL IVPQMGRLIF HTELGRIEAA PGEIVVIQRG
IKFRVELLEK QARGYICENY GAMFRLPDLG PIGANGLANT RDFLTPVASY EDREGSFRVT
SKFCGKLWEG EYNHSPLNVV AWHGNYAPYK YDLANFNCIN SVTFDHPDPS IYTVLTAPSE
IPGTANCDFV IFPPRWMVAE HTFRPPWFHR NFMNEFMGLI KGEYDAKAEG FVPGGASLHN
CMSGHGPDAA TYEKASHAEL KPQYLGDTLA FMFETRFVCR PTKYALETAQ LQHEYYTCWQ
DLKKHFRKP
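
The relationship between the gene sequence and the protein sequence can be spot-checked by translating a few codons. The following is a minimal sketch, not the tool used to produce this record. It assumes the codon-to-amino-acid assignments of NCBI translation table 11, which are the same as those of the standard code (the two tables differ in permitted start codons, not in these assignments). The helper name `translate` is hypothetical, and the input fragments are the first and last 30 bases of the gene sequence shown above.

```python
from itertools import product

# Codon table in the conventional TCAG ordering (first, second, third base).
# Assumption: translation table 11 shares these assignments with the standard code.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA string codon by codon; '*' marks a stop codon."""
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First 30 bases of the gene sequence in this record.
print(translate("ATGAACGATC GATACATGTC GGGTTTCGGC"))  # prints: MNDRYMSGFG

# Last 30 bases of the gene sequence, ending in the TAG stop codon.
print(translate("GACCTAAAAA AACACTTTCG AAAACCCTAG"))  # prints: DLKKHFRKP*
```

Both outputs agree with the first and last blocks of the protein sequence listed in this record, with the trailing `*` standing for the stop codon that is not part of the 429-residue protein.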