Gene Franean1_4358 details


Gene Information

Locus tag: Franean1_4358
Symbol:
ID: 5672713
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Frankia sp. EAN1pec
Kingdom: Bacteria
Replicon accession: NC_009921
Strand:
Start bp: 5202607
End bp: 5204484
Gene Length: 1878 bp
Protein Length: 625 aa
Translation table: 11
GC content: 69%
IMG OID: 641243231
Product: sulfatase
Protein accession: YP_001508648
Protein GI: 158316140
COG category: [P] Inorganic ion transport and metabolism
COG ID: [COG3119] Arylsulfatase A and related enzymes
TIGRFAM ID:
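
The length fields above follow from the coordinates. The sketch below reproduces that arithmetic, assuming the start and end positions are 1-based and inclusive and that the final codon of the CDS is a stop codon (both assumptions are consistent with the listed values, but the record does not state them). The function names are illustrative only.

```python
def gene_length_bp(start_bp: int, end_bp: int) -> int:
    """Length of a feature given 1-based, inclusive coordinates (assumed)."""
    return end_bp - start_bp + 1


def protein_length_aa(cds_length_bp: int) -> int:
    """Residue count for a CDS whose last codon is a stop codon (assumed)."""
    if cds_length_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return cds_length_bp // 3 - 1


length = gene_length_bp(5202607, 5204484)
print(length)                     # 1878
print(protein_length_aa(length))  # 625
```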


Plasmid Coverage information

Num covering plasmid clones: 13
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 19
Fosmid unclonability p-value:
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
GTGGGGCTCC GAAATTCGGG TTTCGGAGCC CCACACGGCC CGCTCGACCG GTGGCCCCTC 
GGTCTGGGAT TCGAGCGCTA CTACGGGTTC CTGCAGGGCG ACACCAACCA CTGGACACCC
AACCTGGTGT CGGACAACCA GTACATCGAC CCGCCGGCAC GGCCCGAAGA CGGCCACCAT
CTCACCGAGG ATCTCGCGGA CACCGCCATC CGGATGGTGA CCGACCAGCA GCACGCTGCC
CCCGGTAAGC CGTTCCTCCT CTACTTCTCC CTGGGCGCGA TGCACGCACC CCACCAGGTC
GCGCCGGAGT GGGTGGATCC CTACCGTGGA CAGTTCGACC TGGGCTGGGA CCGTTGGCGC
GAGAACGTCT TCGCCCGTCA GGTCGAGGCC GGCGTCGTGC CGGGCGGAAC CGTCCTCACC
GAACGGCCGT CCTGGGTGGC GGGATGGAAC GGGCTGACGC CGGACGCCCG GCGCATGCAC
GCCCGGGCCC ACGAGGTGTA CGCAGGCTTC CTCACCCACG CCGACGCCCA GATCGGACGT
CTCCTCGACG CACTCGGACG CCTCGGCGCA CTCGACAACA CCATCGTGCT ACTGATGTCA
GACAACGGCG CCAGCGCCGA GGGCGGCCAG CACGGAACCT TCAACGAGCA CCGCTTCTCC
TCCCGGCTCC CGGAAACGGT CGAGAACAAC CTCGCCCATC TCGACGACTG GGGAGGTTCT
CGCACCTACA GCCACTACTC GTGGGGATGG GCGTGGGCGG GCAACACACC GCTGCGACTG
TGGAAGCGCT ACACCTGGCT CGGCGGCACC CGGGTGCCGC TGGTCGTCCA CTGGCCCGCC
GGAATCACTG CCCGAGGGGA GAACCGCGAC CAGTTCTGCC ACGCCGTCGA TGTGCTGCCC
ACCATTCTCG ACGCCTGCGG CATACCGGCA CCGGAGACGG TGGATGGCGT GTCCCAGCAG
CCGCTGGACG GCGCCAGCTT CACGGCGAGC TTCCACGACG CCGACGCGCC GGCCCCGCGA
TCCACCCAGT ACTTCGAGAT TCTCGGCTCC CGCTCGATCG TCAGTGGCCG GTGGAAGGCG
ACCACCGACC ATGTGTCGAA AGGCGTGGTC GACGAGGAGG AGCTCATGAG CGGTAGCCGT
GACTTCGTCG CCGACCACTG GTCGCTGTTC GACATCGAGC AGGACTTCTC CGAGGCGGTG
GACCTCTCCG CCGAGCACCC CGACGTCGTC CGCCGACTGC GGGAGCTGTG GCTCCTCGAG
GCCGGGCGCA ACAACGTCAT GCCGATGTCG GAGGGCCTCA CCGACCGAGT CGGAGCGATC
ATTCCCCGGG ACTACCCGAT CGGCAACCGA GCCGTGTTCC GGCCGGGCGG CTCGCCGATC
TCCGAGGAGG CACTCCCGAT GCTGCCCTTT GGTTTCCGGA TGTCCGTCGA CACCGAGGTG
GCACCCGAAG CCGAAGGGGT GCTCTTCGCC CTCGGAGACT GGAACGGCGG GTACGCGCTC
TACGCGGTCG GGAGCATGCT CCGTTTCACA TACTGCCCTG CCGGTGAGCC GGTCACGGTG
GCGGCGGCGC GTGCGCTCCC CCCGGGGCTC CACCGCCTGT CCGTCGCCTT TGAGCCCGGC
GGGAGCCCGG GGGGCAGGTT CACCCTCACC TGCGACGACG AGGTGCTCGG CGCTGCCACC
ACGGCCGTCG CGATGCCCTT CGCCCTACAG CACGGTGGCA CGCACCTCTG CCTGGGCCGC
GACCGCGGTT TCCCGGTATC CGAGGACTAC ACGTCTCCGT TCCCGTGGAA CGGAGTGATC
CACAGGATGG TCGTGGAGAC ACCCGGGTTC GACCGGCCAG GAACCACCGA GGTACGAGCA
GCTCTCCACG CGGACTGA
 
Protein sequence
MGLRNSGFGA PHGPLDRWPL GLGFERYYGF LQGDTNHWTP NLVSDNQYID PPARPEDGHH 
LTEDLADTAI RMVTDQQHAA PGKPFLLYFS LGAMHAPHQV APEWVDPYRG QFDLGWDRWR
ENVFARQVEA GVVPGGTVLT ERPSWVAGWN GLTPDARRMH ARAHEVYAGF LTHADAQIGR
LLDALGRLGA LDNTIVLLMS DNGASAEGGQ HGTFNEHRFS SRLPETVENN LAHLDDWGGS
RTYSHYSWGW AWAGNTPLRL WKRYTWLGGT RVPLVVHWPA GITARGENRD QFCHAVDVLP
TILDACGIPA PETVDGVSQQ PLDGASFTAS FHDADAPAPR STQYFEILGS RSIVSGRWKA
TTDHVSKGVV DEEELMSGSR DFVADHWSLF DIEQDFSEAV DLSAEHPDVV RRLRELWLLE
AGRNNVMPMS EGLTDRVGAI IPRDYPIGNR AVFRPGGSPI SEEALPMLPF GFRMSVDTEV
APEAEGVLFA LGDWNGGYAL YAVGSMLRFT YCPAGEPVTV AAARALPPGL HRLSVAFEPG
GSPGGRFTLT CDDEVLGAAT TAVAMPFALQ HGGTHLCLGR DRGFPVSEDY TSPFPWNGVI
HRMVVETPGF DRPGTTEVRA ALHAD
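
Both sequences are printed in space-separated blocks of ten characters, so they need to be joined before any analysis. The sketch below shows one way to do that and to compute a GC percentage that can be compared against the GC content field of this record. The helper names are hypothetical and not part of the source database. Note that the gene begins with GTG while the protein begins with M: under translation table 11, GTG is an accepted alternative start codon and is read as methionine at the initiation position.

```python
def clean_sequence(block_text: str) -> str:
    """Join a sequence printed in 10-character blocks into one string."""
    return "".join(block_text.split()).upper()


def gc_percent(dna: str) -> float:
    """Percentage of G and C bases in a DNA string."""
    if not dna:
        raise ValueError("empty sequence")
    gc_count = sum(1 for base in dna if base in "GC")
    return 100.0 * gc_count / len(dna)


# First printed line of the gene sequence in this record.
first_line = "GTGGGGCTCC GAAATTCGGG TTTCGGAGCC CCACACGGCC CGCTCGACCG GTGGCCCCTC"
dna = clean_sequence(first_line)
print(len(dna))          # 60
print(dna[:3])           # GTG
print(gc_percent(dna))   # GC percentage of this line only, not the whole gene
```

To check the whole gene, the same two functions can be applied to the full text of the gene sequence block.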