Gene Francci3_1756 details

Gene Information

Locus tag: Francci3_1756
Symbol:
ID: 3906822
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Frankia sp. CcI3
Kingdom: Bacteria
Replicon accession: NC_007777
Strand:
Start bp: 2087130
End bp: 2088554
Gene Length: 1425 bp
Protein Length: 474 aa
Translation table: 11
GC content: 66%
IMG OID: 637879094
Product: sulfatase
Protein accession: YP_480861
Protein GI: 86740461
COG category: [P] Inorganic ion transport and metabolism
COG ID: [COG3119] Arylsulfatase A and related enzymes
TIGRFAM ID:


Plasmid Coverage information

Num covering plasmid clones: 22
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal

Fosmid Coverage information

Num covering fosmid clones: 10
Fosmid unclonability p-value: 0.242329
Fosmid Hitchhiker: No
Fosmid clonability: normal

Sequence

Gene sequence
ATGACGGCTG CGTCCCGGTA TGATCACATT ATCCTCGTCT CGATAGATAC TCTGCGCAGC 
GACTGCGTCG GGGCCAGCCC GTTCCCACTG TGGCCCGCGA AGTACCCAGG ACTGCGCGCC
CCCGCAACTC CAGTACTTGA CCGTTTGGTG GCCGAGGGCG CCTACTTCCC GAACACGATC
TCAGCCGCAC CCTACACCGC CGCGTCCCAC GGTGCGATCC TGACCGGCCG GTTCCCGCTG
CACAACGGTT TGTTCGAATT CTACAACGGC AGGCTGGCCG GGCCGTCGGT GTTCGGCCAT
GGCCGAACTG CGGGCAGGCG CACCGTGATG AAGGTCGACT TCCCAATCAT CCTGGGCCCG
GAGCTGGGTT TCACCAACGA CGTGGACGTC TACCTGCGCG AAGACGACGA CCGGTTCATC
GAGGAGGTTG TCGAGGCGGA CTCGAGCATG TCGTTTGCGC ACTTCGGCGG TGTCCACCTG
CCCTATGGGT TCCACAACCT GCGTTTCGGT GGCGAAGCCT ACCGGGCTAA GATCACCGAG
CTGGAGGCTG AGCTGCCCGA GGACTTGCCG ATACCCACCG ACCAGATTAC CGAGACCTTC
CGCGAGGCCG AGGACCACCA ACTGTTCCTG CGGTACAAGC GGATCACCAA TCATCTCTAC
GACACCGGAG CCTACGATCG TCTGTTCGAG CTATACCTGG AGGGCATCGA GTTCTTCCTG
CGCGAGCGGT TCGAGCCGTT CCTGGAGCGG CTGCGGGCAA GGCTCGAGGC GAGCGGGGCC
AGCATGCTCG TGGTGATTTT CGGCGACCAT GGGCATGAGC TGGACGCGGA GTCCTACGGC
CACCACAACT CCCTGTCCGA AGGGGTATTG CGGGTGCCGC TGATCTTCTG GGGCGACGGT
GTGGCTCCCG GGCTGCACGC GCACAGGGTG CGCACGGTCG ATATCGTACC GACCGTGCTG
GAGCTAGCAG ATATCTCCCC TGCGCCCGGC TCGCCCGGGT TCGACGGCGA GACGCTGGCG
CCGGTGGTGC GCGGCGAACG CGCGTTGGAC GGCCACGCTA TCGCGTTCAG CCAGACCTAC
GGCGCCGATA CCCGGGAGTT CGTGGCCTTC CAGCAGCGGC AGCTACGCGG TGAGAGCCAG
GAGCCGCTCA GGCACGTGCT GCTCGGCGAG AGTGTGTACC TGGGTGACCG CAGGCTCGTG
CGTATGCATC ACCGCTATGG CAAGCAATTC CGGATCGAGG CAGTTGAGGA GCACTGGGTC
GAGCGATTCG GCGACGACCT GGTCCCCCGG CTGGACCCCG GCGCGGAGAC CGCGGATCTG
GTGGCGGCGC TGGCGGGCTA CAACGCCGCG CACAGCCCCG CCGAGCCGGT GCCCGCCAGC
ACCGCAATTC GCGGCCAGTT GCGCAGCCTC GGTTACAACA TCTGA
 
Protein sequence
MTAASRYDHI ILVSIDTLRS DCVGASPFPL WPAKYPGLRA PATPVLDRLV AEGAYFPNTI 
SAAPYTAASH GAILTGRFPL HNGLFEFYNG RLAGPSVFGH GRTAGRRTVM KVDFPIILGP
ELGFTNDVDV YLREDDDRFI EEVVEADSSM SFAHFGGVHL PYGFHNLRFG GEAYRAKITE
LEAELPEDLP IPTDQITETF REAEDHQLFL RYKRITNHLY DTGAYDRLFE LYLEGIEFFL
RERFEPFLER LRARLEASGA SMLVVIFGDH GHELDAESYG HHNSLSEGVL RVPLIFWGDG
VAPGLHAHRV RTVDIVPTVL ELADISPAPG SPGFDGETLA PVVRGERALD GHAIAFSQTY
GADTREFVAF QQRQLRGESQ EPLRHVLLGE SVYLGDRRLV RMHHRYGKQF RIEAVEEHWV
ERFGDDLVPR LDPGAETADL VAALAGYNAA HSPAEPVPAS TAIRGQLRSL GYNI
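
Consistency check (illustrative)

The coordinates and lengths reported above can be cross-checked with a few lines of Python. This is a minimal sketch, not part of the original record: the function names are hypothetical, coordinates are assumed to be 1-based and inclusive, and the protein length is assumed to exclude a single terminal stop codon (the gene sequence above ends in TGA). The GC helper simply strips the whitespace used to group the sequence into blocks of ten; its result for this gene is not asserted here.

```python
def span_length(start_bp: int, end_bp: int) -> int:
    """Length of a feature given 1-based inclusive coordinates (assumed)."""
    return end_bp - start_bp + 1


def expected_protein_length(gene_length_bp: int) -> int:
    """Codon count minus one, assuming a single terminal stop codon."""
    if gene_length_bp % 3 != 0:
        raise ValueError("gene length is not a multiple of 3")
    return gene_length_bp // 3 - 1


def clean_sequence(block: str) -> str:
    """Remove spaces and line breaks from a blocked sequence listing."""
    return "".join(block.split()).upper()


def gc_fraction(block: str) -> float:
    """Fraction of G and C bases in a (possibly blocked) nucleotide sequence."""
    seq = clean_sequence(block)
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


# Values taken from the Gene Information section of this record.
gene_len = span_length(2087130, 2088554)
protein_len = expected_protein_length(gene_len)
print(gene_len, protein_len)
```

With the values from this record, the computed gene length matches the reported 1425 bp and the computed protein length matches the reported 474 aa. Passing the full gene sequence text to `gc_fraction` would give a value to compare against the reported GC content.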