Gene Francci3_3020 details

Gene Information

Locus tag: Francci3_3020
Symbol:
ID: 3904373
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Frankia sp. CcI3
Kingdom: Bacteria
Replicon accession: NC_007777
Strand:
Start bp: 3585978
End bp: 3587555
Gene Length: 1578 bp
Protein Length: 525 aa
Translation table: 11
GC content: 71%
IMG OID: 637880340
Product: anthranilate synthase component I
Protein accession: YP_482106
Protein GI: 86741706
COG category: [E] Amino acid transport and metabolism
              [H] Coenzyme transport and metabolism
COG ID: [COG0147] Anthranilate/para-aminobenzoate synthases component I
TIGRFAM ID: [TIGR00564] anthranilate synthase component I, non-proteobacterial lineages
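
The coordinate and length fields above can be cross-checked with a little arithmetic. The sketch below assumes that Start bp and End bp are 1-based and inclusive, and that the final codon is a stop codon that contributes no residue. The record does not state either convention, so both are assumptions, and the variable names are illustrative only.

```python
# Values copied from the Gene Information fields of this record.
start_bp = 3585978
end_bp = 3587555

# Assumption: coordinates are 1-based and inclusive,
# so the span length is end - start + 1.
gene_length = end_bp - start_bp + 1

# Assumption: the last codon is a stop codon and encodes no amino acid.
codons, remainder = divmod(gene_length, 3)
protein_length = codons - 1

print(gene_length, remainder, protein_length)  # prints: 1578 0 525
```

Under those assumptions the result agrees with the listed Gene Length (1578 bp) and Protein Length (525 aa).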


Plasmid Coverage information

Num covering plasmid clones: 16
Plasmid unclonability p-value: 0.730737
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones:
Fosmid unclonability p-value: 0.0792915
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGACGACCG GGCAGATCAT CCCCGGCCGG GAGGAGTTCC GGGCCGCGGC GGCGGCGCAT 
CCGGTCGTCG CCGTCACCCG TCGGCTGCTC GCCGACGGGG AGACACCCGT CGGCGTCTAC
CGCAAGCTCG CCGGCGGCCC CGGGACGTTC CTGCTGGAAT CGGCGGAGCA TGGCGGGGTG
TGGTCGCGGT ATTCGTTCAT CGGTGTGCGG GCCGCGGCGA CGCTGACCGA GCGCGACGGG
CAGGCGGTGT GGGTGGACGG TGCGCCGCCT CCCGGAGTGC CGGTCGACGG TGACCCGCTC
GACATCCTGC GTGCGGTGGA GAGCACTCTG CAGGCCTCGA ACAGCCCACA GGCCCCTCCG
CTCATGGGTG GTCTGGTCGG CTATCTCGGC TACGACATCG TGCGCCGCAT CGAGCGGCTC
CCCGCGCACG CCGTCGACGA CCTCGGCCTT CCCGAGTTGC GCATGCTGCT CACCACCGAT
CTCGCCGTGC TGGACCATCG GGATGGTTCC TGCCTGCTGG TGGCCAACGT CTTCACCGGC
CGCGGGAATG GTGCCGGTGA CGCAGGCGCC ACCGGCACCA CCGGTGGGGA CGCGGACGCC
GCCGCACACC CGGCCGGGAT CGACCTGGAC GCCGCCTACG ACGACGCGGT GGCCCGGCTG
GAGGCCATGA CCAGCGATCT CGCGAAGTGG AACGAGCCGA CCGTGGCGAC CACGGCGGGC
ACCGCCGCGG GGGTCGGCGA CTTCGTCTCC GCCACGCCTC CCGGTGCCTT CCAGACCGCG
GTCGAGCGGG CCATCGAGGA GATCCGCGCC GGCGAGTGCT TCCAGATCGT CGTCTCGCAG
CGCTTCGAAC GGCGGACGAC CGCGGATGCG CTCGACGTCT ACCGGGTGCT GCGGACCTCG
AATCCGAGTC CCTATATGTA CCTGCTGCGT TTCGCCGACC ACGACGTCGT CGGATCCTCC
CCGGAGGCGC ACGTCAAGGT CACCGGCCGC CGGGCGTTGC TGCACCCCAT CGCGGGCAGC
CGGCCCCGGG GAGGCACTCC CGAACAGGAC GCCGAGCTTG CGGCGGAGCT TCTCGCGGAT
CCGAAGGAAC GCTCCGAGCA CGTGATGCTG GTCGACCTGG TCCGCAACGA CCTCGGGCGG
GTCTGCAACC CCGGTTCGGT CCGGGTCGTC GAGTTCGCGG CCATCGAACG GTTCTCGCAC
ATCATGCACA TTGTCTCCAC GGTGATCGGT GAGGTGGCTC CGGACCGCAG CGCCGTGGAC
GTGCTGGCCG CGACGTTTCC GGCGGGGACG CTCTCGGGGG CGCCCAAGGT GCGGGCGATG
GAGGTCATCG ACGAGCTGGA GCCGACCCGG CGGGGCCTGT ACGGGGGCGT GGTCGGCTAC
CTGGACTTCG GCGGTGATCT GGACACCGCG ATCGCGATCC GCACCACAGT GATGCGCGAC
GGTATGGCCT ACGTTCAGGC CGGAGCCGGC ATCGTCGCGG ATTCCGATCC CGACACCGAG
GATCTGGAAA GCCGCGCGAA GGCCGCCGCC GTCCTGCGCG CCGTCGAGGT CGCCGAGTCG
TTGCGACCGC CGTCATGA
 
Protein sequence
MTTGQIIPGR EEFRAAAAAH PVVAVTRRLL ADGETPVGVY RKLAGGPGTF LLESAEHGGV 
WSRYSFIGVR AAATLTERDG QAVWVDGAPP PGVPVDGDPL DILRAVESTL QASNSPQAPP
LMGGLVGYLG YDIVRRIERL PAHAVDDLGL PELRMLLTTD LAVLDHRDGS CLLVANVFTG
RGNGAGDAGA TGTTGGDADA AAHPAGIDLD AAYDDAVARL EAMTSDLAKW NEPTVATTAG
TAAGVGDFVS ATPPGAFQTA VERAIEEIRA GECFQIVVSQ RFERRTTADA LDVYRVLRTS
NPSPYMYLLR FADHDVVGSS PEAHVKVTGR RALLHPIAGS RPRGGTPEQD AELAAELLAD
PKERSEHVML VDLVRNDLGR VCNPGSVRVV EFAAIERFSH IMHIVSTVIG EVAPDRSAVD
VLAATFPAGT LSGAPKVRAM EVIDELEPTR RGLYGGVVGY LDFGGDLDTA IAIRTTVMRD
GMAYVQAGAG IVADSDPDTE DLESRAKAAA VLRAVEVAES LRPPS
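
The relationship between the two sequences above can be illustrated with a minimal translation sketch. It is not the database's own pipeline. The helper names (`clean`, `translate`, `gc_fraction`) are hypothetical, and the codon table is written out by hand. Translation table 11 uses the same codon-to-amino-acid assignments as the standard code and differs in which codons may act as start codons. This gene starts with ATG, so that difference is ignored here.

```python
# Codon table built from the conventional T, C, A, G ordering.
# Table 11 shares these amino-acid assignments with the standard code.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def clean(seq):
    """Remove the spaces and line breaks that group the sequence in tens."""
    return "".join(seq.split()).upper()


def translate(dna):
    """Translate codon by codon, stopping at the first stop codon.

    Only A, C, G and T are handled; any other symbol raises KeyError.
    """
    dna = clean(dna)
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[pos:pos + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def gc_fraction(dna):
    """Fraction of bases that are G or C."""
    dna = clean(dna)
    return (dna.count("G") + dna.count("C")) / len(dna)


# First line of the gene sequence, copied from this record.
first_line = "ATGACGACCG GGCAGATCAT CCCCGGCCGG GAGGAGTTCC GGGCCGCGGC GGCGGCGCAT"
print(translate(first_line))  # prints: MTTGQIIPGREEFRAAAAAH
```

The printed residues match the first 20 residues of the protein sequence. Passing the full gene sequence to `translate` and `gc_fraction` would allow the same comparison against the complete protein sequence and the listed GC content.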