Gene Caul_0237 details


Gene Information

Locus tag: Caul_0237
Symbol:
ID: 5897511
Type: CDS
Is gene spliced: No
Is pseudo gene: No
Organism name: Caulobacter sp. K31
Kingdom: Bacteria
Replicon accession: NC_010338
Strand:
Start bp: 260923
End bp: 262947
Gene Length: 2025 bp
Protein Length: 674 aa
Translation table: 11
GC content: 67%
IMG OID: 641560721
Product: sulfatase
Protein accession: YP_001681872
Protein GI: 167644209
COG category: [P] Inorganic ion transport and metabolism
COG ID: [COG3119] Arylsulfatase A and related enzymes
TIGRFAM ID:
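
The length fields in the record above can be cross-checked against the coordinates. The following is a minimal sketch, assuming the start and end positions are 1-based and inclusive and that the coding sequence ends with a single stop codon that is not counted in the protein length. The variable names are illustrative and not part of the record.

```python
# Coordinates copied from the record (assumed 1-based, inclusive).
start_bp = 260923
end_bp = 262947

# Inclusive coordinates: both end points belong to the gene.
gene_length = end_bp - start_bp + 1

# One codon is three bases; the final codon is assumed to be a stop codon.
protein_length = gene_length // 3 - 1

print(gene_length)     # 2025
print(protein_length)  # 674
```

Both values agree with the Gene Length and Protein Length fields of the record.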


Plasmid Coverage information

Num covering plasmid clones: 23
Plasmid unclonability p-value: 0.871196
Plasmid hitchhiking: No
Plasmid clonability: normal
 

Fosmid Coverage information

Num covering fosmid clones: 16
Fosmid unclonability p-value:
Fosmid Hitchhiker: No
Fosmid clonability: normal
 

Sequence

Gene sequence
ATGACCCAGG CGAGCCGGCC CAATATCCTG CTGATCACCT GCGACCAGTA TCGCTTTCCC 
CGGTTCTCCT ACGGGGCGGA CGCGGGGTTT AGCGAGCCGC TGAAGCGCAT CCTGGGTTTC
CAGCGCGAGG ACGACGCTCA AAACCCGTAC GCCCCGTACT TTCCCGGCTT GCTGGCCTTG
CGCCAGAACG CGGCGCTGTT GCGCAACCAC ACCATCGCCG CCAGCGCCTG CACGCCCAGT
CGGGCGGTGA TCTATACCGG CCAGTACGGG ACCAAGACCG GGGTCACCCA GACCGACGGC
CTGTTCAAGA GCGGCGACTC CTACAACTTT CCCTGGCTGG CCGCCGACGG TATCCCCACC
CTGGGAACCT GGATGCGCGA GGCCGGCTAC TCCACCCACT ATTTCGGCAA ATGGCACGTC
AGCAACCCGC CCGAGCACTC GCTGGACCGT TACGGCTTCG ACGACTGGGA GGAATCCTAT
CCCGAGCCGC ACGGCGCGGC GATCAACAAC CTGGGCGTCT ATCGCGACGC CGGCTTCACC
GACCAGGCCT GCGCCTTCAT CCGCCGCAAG GCCCTGGCCC TGAACTACAA CCGCGCCCAG
GCCGTCGAGC AGGCGCGGGA CCCCTACGCC GCCGGGCCGG ACGCCGATAA CATCCCGCCC
TGGTTCGCGG TCGCCTCGTT CACCAATCCC CACGACATCG CCACCTATCC GGCGGTGATC
GCCCAGGCTC TGCCGACGCC GGACAATTCC GGCACGCAGT CGATCTTCGG TCCGCTGACC
GTTCCGTTGC AGGGGCAGAA GACGCCGCCG CCGACCGCCG GCACGATCCA GATCGCGCTC
AATGCCCTGG GCTTTCCGCA GGACTGCGCC AAGCCGTCGC CCACCCAGAA CGAGTCCCTG
GCCGACAAGC CCAGCTGCCA GCGCGACTAC GCCTACAAGG TGGGCCTGGC CCTGAACGCC
AAGACCGGCT TCAACATCGT CAACACCGTC GGGTCCAAGC TGCACGACCA GTTCCCCAAT
CTCTCCGAGA CCCCGGACCT GGCGCGGCGG GCGGCGGTCC AGCAGGCGCT GAAGGGGACA
ATCCCCTTCC AGCTGAGCGA CGATCCGGAC GGCTACGCCC TGCAGTTCCT GCAGCTCTAT
GGCTGGCTGC ACGCCGTGGT CGACACCCAC GTGACGGCCG TGCTGAAGAC GCTGGAAGAG
ACGGGCCAGG CCGACAACAC CATCGTCATC TTCCTGGCCG ACCACGGCGA GTACGCGGCG
GCCCACGGCA TGATGATCGA GAAGTGGCAC ACGGCCTATC AGGAGGCCCT GCATGTGCCG
GTGGTCGTGC GCTTCCCGCC ATCGACGAAG GTGGTCGAGA ACGAACCCGG GACGGGGGAG
GGGCCGCTGG GCTTCACGCC GCGCCAGATC GACGCCCTGA CCAGCCATAT CGACATCCTG
CCCACCGTGC TGGGCCTGGC TGGGGTGACG CCCGATCAGC GGACGACGAT CGCCGAGCGC
CTGGGCCGGC ATCGCCCCAC GCCGCCCCTG CCGGGAGTTG ACCTGTCGGG CCTGCTGAAG
GGCGAGATCC ACGCGGTGAT CGAGCCGGAC GGCCGCGAGC GGCAGGGCGT ATTGTTCATC
ACCGACGACG AGATCACCGC CCCCTCGGCC TCGAACGATG ATCCCGCCAA CCTCAAGTGC
GACAAGGAGT TCGAGGTCTA CAGGCAGGTG GTCGAGACGG TGAACGATCA GCATCGGTTG
CTGAACCTGG CGCCAGGTTC GGTGCGCCAG CCCAACCACG TGCGGTGCGT GCGAACCCTG
CGCCACAAGC TCAGCCGCTA TTTCGACCCG TCAGGCGAAG CGGCGGAGGA GTGGGAGATG
TATGATCTCG AGCGCGATCC CAACGAGGCG GTGAACCTGG TGCGGGTGGC CTCGCCGCTG
ACCGCGCGAA CGGACCTGCC GTCGCCGTTC GTGACGGCCG AGGTGCAGGC GGAGGCGGAC
CAACTGGCGA AGCTGCTGGC GGAACTGGAA GCGCGGGATC TGTAA
 
Protein sequence
MTQASRPNIL LITCDQYRFP RFSYGADAGF SEPLKRILGF QREDDAQNPY APYFPGLLAL 
RQNAALLRNH TIAASACTPS RAVIYTGQYG TKTGVTQTDG LFKSGDSYNF PWLAADGIPT
LGTWMREAGY STHYFGKWHV SNPPEHSLDR YGFDDWEESY PEPHGAAINN LGVYRDAGFT
DQACAFIRRK ALALNYNRAQ AVEQARDPYA AGPDADNIPP WFAVASFTNP HDIATYPAVI
AQALPTPDNS GTQSIFGPLT VPLQGQKTPP PTAGTIQIAL NALGFPQDCA KPSPTQNESL
ADKPSCQRDY AYKVGLALNA KTGFNIVNTV GSKLHDQFPN LSETPDLARR AAVQQALKGT
IPFQLSDDPD GYALQFLQLY GWLHAVVDTH VTAVLKTLEE TGQADNTIVI FLADHGEYAA
AHGMMIEKWH TAYQEALHVP VVVRFPPSTK VVENEPGTGE GPLGFTPRQI DALTSHIDIL
PTVLGLAGVT PDQRTTIAER LGRHRPTPPL PGVDLSGLLK GEIHAVIEPD GRERQGVLFI
TDDEITAPSA SNDDPANLKC DKEFEVYRQV VETVNDQHRL LNLAPGSVRQ PNHVRCVRTL
RHKLSRYFDP SGEAAEEWEM YDLERDPNEA VNLVRVASPL TARTDLPSPF VTAEVQAEAD
QLAKLLAELE ARDL
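
The sequences above are printed in space-separated blocks of ten characters, so they need to be cleaned before any computation. The following is a minimal sketch of how the gene sequence could be cleaned, checked for GC content, and translated. It assumes plain Python with no bioinformatics library; the helper names `clean`, `gc_content`, and `translate` are hypothetical. The codon assignments are those shared by the standard genetic code and translation table 11, which differ in their permitted start codons rather than in amino acid assignments. The sketch does not handle alternative start codons or ambiguous bases.

```python
from itertools import product

# Amino acid assignments in TCAG codon order (shared by the standard code
# and translation table 11; "*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def clean(seq: str) -> str:
    """Remove the display spaces and line breaks, and upper-case the result."""
    return "".join(seq.split()).upper()


def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases in the sequence."""
    seq = clean(seq)
    return (seq.count("G") + seq.count("C")) / len(seq)


def translate(seq: str) -> str:
    """Translate codon by codon from the first base, stopping at a stop codon."""
    seq = clean(seq)
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]  # raises KeyError on non-ACGT bases
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First line of the gene sequence, copied from the record.
first_line = "ATGACCCAGG CGAGCCGGCC CAATATCCTG CTGATCACCT GCGACCAGTA TCGCTTTCCC"
print(translate(first_line))  # MTQASRPNILLITCDQYRFP
```

The translated fragment matches the first twenty residues of the protein sequence in the record. Passing the full gene sequence to `gc_content` would allow the reported GC content field to be compared in the same way.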