Gene Amuc_1033 details

Gene Information

Locus tag: Amuc_1033
Symbol:
ID: 6274082
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Akkermansia muciniphila ATCC BAA-835
Kingdom: Bacteria
Replicon accession: NC_010655
Strand:
Start bp: 1226293
End bp: 1228038
Gene Length: 1746 bp
Protein Length: 581 aa
Translation table: 11
GC content: 56%
IMG OID: 642613082
Product: sulfatase
Protein accession: YP_001877640
Protein GI: 187735528
COG category: [P] Inorganic ion transport and metabolism
COG ID: [COG3119] Arylsulfatase A and related enzymes
TIGRFAM ID:
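
The coordinates and lengths in the table are consistent with each other, which can be checked with a small sketch. It assumes the start and end positions are 1-based and inclusive, and that the CDS ends in a single stop codon that is not counted in the protein length. The function names are hypothetical and not part of any database API.

```python
# Values copied from the Gene Information table.
START_BP = 1226293
END_BP = 1228038


def gene_length(start, end):
    """Feature length, assuming 1-based inclusive coordinates."""
    return end - start + 1


def protein_length(cds_length_bp):
    """Residue count for a complete CDS: codons minus one stop codon (assumed)."""
    if cds_length_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return cds_length_bp // 3 - 1


length_bp = gene_length(START_BP, END_BP)
print(length_bp)                  # 1746
print(protein_length(length_bp))  # 581
```

Both results match the Gene Length and Protein Length fields reported for this record.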


Plasmid Coverage information

Num covering plasmid clones: 20
Plasmid unclonability p-value:
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 73
Fosmid unclonability p-value:
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGAACCGTC ATGCCGCCAC CGCCCTGATG CTTGCGGCCT GTTCCCTGTC CGCTTCCGCG 
GACCAGCCTC AAAAGCAGAC GCCGGACCAG CGCCCCAACA TCGTGGTCAT TGTCACCGAC
GACCATTCCT ACCAGACCCT GGGCACCTGT GAAAAGGATT CTCCCATGCC TTATCCGAAC
TTCCGCAAAC TGGCGGACGA AGGCATGGTC TTTGACCGGA GCTACTGCGC CAACTCCCTG
TGCGGACCTT CGCGGGCCTG TATTTACACT GGCCGCCATT CCCACATGAA CGGGTACCTC
TTCAACGAAC ATGCGGCTCC CTTTGACGGT TCCCAGCCCA CTTTCCCGAA AATGCTCCAA
AAGGCCGGTT ACCAGACGGC TATTGTCGGC AAGTGGCACC TGGAAGCCAT TCCGCCGGGC
GCCAAGGGAG ATACGTCCAA ATATGAATCC GACCCCACCG GATTCGATTA CTGGGAAATT
TTCCCCGGCC AGGGCAACTA TTTCAATCCG GATTTCATCA CTCCCGGCAA GGACGGCAAA
CGCGTGGTGA AAACGGAGCC CGGCTATGCC ACGGAACTGG TTACGCAAAA AAGCCTCAAA
TGGCTGGACC AGAGGGACAA GAACAAACCC TTCATGCTCG TCGTGGGCCA CAAGGCACCC
CACCGTTGCT GGTGCCCCTC CATTCAGAAT CTGGGCCGCG CCAAACAGTA TGCGGACGCC
ATTGACCCGC CCGCCAATCT GGAAGACGAT TTTGCAGACC GCCCGGAATT CCTGAAAATG
ACGGAACAAA CCCTGCTCAA CCATTTCAAC GTATGGTCTG ACGAACACCT GATCAAGGAG
GTGGTCCCCG AAGACATCCA GAAAATGCTT TCCTGCCCGG AATCCAAGAC CCTGCATACT
CAGTATGACT GGGAAATGCC GGAATGGGTG CGCATGGACC CGCAGCAGAA GGAAGCCTGG
TACAACTACC ACAAGGCCCG TACCGTACAG CTTGTCAAAG ATATTAAAAA CGGGAAAATC
AAAACGCAGC GCGACATTTT GCTGCGCCGC TGGCGCCATT ATATGGAAGA CTATCTGGGC
ACCGTTCTTT CCGTGGACGA AAGCATCGGC CAGATCATGG ACTATCTGAA ACAAAACGGT
CTGGACAGGA ATACGCTGGT GCTCTACTGC GGAGACCAGG GATTCTACAT GGGCGAACAC
GGCCTGTACG ACAAACGCTG GATTTTCGAA GAATCCTTCC GCATGCCCCT CATCATGAGA
TGGCCGGGCC ACATCAGGCC GGGCGTGCGC TCCTCCGCCA TGGTGCAGGA ACTGGATTAT
GCTCCCACTT TCTGCGACGT GGCCGGGGTA AATACCAAGG AAAATATGAA TACCTTCCAG
GGCCGCAGCC TCACTCCCCT GTTCAAGACC GGAGAACATC AGGATTTCAA AAACCGTTCT
CTTTACTACG CCTTTTACGA AAATCCGGGT GAACACAACG CTCCGCGCCA TGATGGCCTG
CGCACGGACC GCTACACGCT GTCCTATATC TGGACCAGCG ACGAATGGAT GCTCTTTGAC
AACCAGAAGG ACCCGGCCCA AATGCACAAC GTCATCAACA AACCGGAATA TGCGGAAACC
GTGAAAGAAC TCAAAGCCCT GTACGGCAAG CTCCGCAAAG ACTACCAGGT ACCGGAAGGC
TTCCCCGGGG CCACCGGCAA ACTGGCCGTC AAGCCGCAGT GGGACTGCGC TCCCTCCAGA
GATTGA
 
Protein sequence
MNRHAATALM LAACSLSASA DQPQKQTPDQ RPNIVVIVTD DHSYQTLGTC EKDSPMPYPN 
FRKLADEGMV FDRSYCANSL CGPSRACIYT GRHSHMNGYL FNEHAAPFDG SQPTFPKMLQ
KAGYQTAIVG KWHLEAIPPG AKGDTSKYES DPTGFDYWEI FPGQGNYFNP DFITPGKDGK
RVVKTEPGYA TELVTQKSLK WLDQRDKNKP FMLVVGHKAP HRCWCPSIQN LGRAKQYADA
IDPPANLEDD FADRPEFLKM TEQTLLNHFN VWSDEHLIKE VVPEDIQKML SCPESKTLHT
QYDWEMPEWV RMDPQQKEAW YNYHKARTVQ LVKDIKNGKI KTQRDILLRR WRHYMEDYLG
TVLSVDESIG QIMDYLKQNG LDRNTLVLYC GDQGFYMGEH GLYDKRWIFE ESFRMPLIMR
WPGHIRPGVR SSAMVQELDY APTFCDVAGV NTKENMNTFQ GRSLTPLFKT GEHQDFKNRS
LYYAFYENPG EHNAPRHDGL RTDRYTLSYI WTSDEWMLFD NQKDPAQMHN VINKPEYAET
VKELKALYGK LRKDYQVPEG FPGATGKLAV KPQWDCAPSR D
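
The relationship between the gene and protein sequences can be illustrated with a minimal translation sketch. It builds a codon table from the standard genetic code; to my understanding, translation table 11 assigns the same amino acids to codons as the standard code and differs only in which start codons it permits, so this is a reasonable stand-in for a gene that begins with ATG. The helpers `clean` and `translate` are hypothetical names, and only the first and last lines of the gene sequence are used as input here.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def clean(seq):
    """Remove the spaces and line breaks used for 10-base display blocks."""
    return "".join(seq.split()).upper()


def translate(dna):
    """Translate in frame 1, stopping at the first stop codon."""
    dna = clean(dna)
    residues = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        residues.append(aa)
    return "".join(residues)


first_line = "ATGAACCGTC ATGCCGCCAC CGCCCTGATG CTTGCGGCCT GTTCCCTGTC CGCTTCCGCG"
print(translate(first_line))  # MNRHAATALMLAACSLSASA
print(translate("GATTGA"))    # D
```

The first 60 bases yield the first 20 residues of the protein sequence, and the final six bases yield the terminal D followed by a TGA stop codon.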