Re: [HACKERS] Custom compression methods - Mailing list pgsql-hackers

From Ildus Kurbangaliev
Subject Re: [HACKERS] Custom compression methods
Date
Msg-id 20180423121909.60b2b7f7@wp.localdomain
In response to Re: [HACKERS] Custom compression methods  (Alexander Korotkov <a.korotkov@postgrespro.ru>)
List pgsql-hackers
On Sun, 22 Apr 2018 16:21:31 +0300
Alexander Korotkov <a.korotkov@postgrespro.ru> wrote:

> On Fri, Apr 20, 2018 at 7:45 PM, Konstantin Knizhnik
> <k.knizhnik@postgrespro.ru> wrote:
> 
> > On 30.03.2018 19:50, Ildus Kurbangaliev wrote:
> >
> >> On Mon, 26 Mar 2018 20:38:25 +0300
> >> Ildus Kurbangaliev <i.kurbangaliev@postgrespro.ru> wrote:
> >>
> >> Attached rebased version of the patch. Fixed conflicts in
> >> pg_class.h.
> >>>
> >>> New rebased version due to conflicts in master. Also fixed a few
> >>> errors and removed the cmdrop method since it couldn't be tested.
> >>
> > It seems to be useful (and not so difficult) to use custom
> > compression methods also for WAL compression: replace direct calls of
> > pglz_compress in xloginsert.c
> 
> 
> I'm going to object to this point, and I have the following arguments
> for that:
> 
> 1) WAL compression is much more critical for durability than datatype
> compression.  Imagine a compression algorithm contains a bug which
> causes the decompress method to segfault.  In the case of datatype
> compression, that would cause a crash only on access to the affected
> value; the rest of the database would keep working, giving you
> a chance to localize the issue and investigate it.  In the case of
> WAL compression, recovery would cause a server crash.  That seems
> to be a much more serious disaster.  You wouldn't be able to get
> your database up and running, and the same would happen on the standby.
> 
> 2) The idea of custom compression methods is that some columns may
> have a specific data distribution, which could be handled better with
> a particular compression method and particular parameters.  With
> WAL compression you're dealing with the whole WAL stream containing
> all the values from the database cluster.  Moreover, if custom compression
> methods are defined for columns, then the WAL stream already contains
> values compressed in the most efficient way.  However, it might
> turn out that some compression method is better for WAL in the general
> case (there are benchmarks showing our pglz is not very good in
> comparison to the alternatives).  But in this case I would prefer to
> just switch our WAL to a different compression method one day.
> Thankfully we don't preserve WAL compatibility between major releases.
> 
> 3) This patch provides custom compression methods recorded in
> the catalog.  During recovery you don't have access to the system
> catalog, because it's not recovered yet, and can't fetch compression
> method metadata from there.  One possibility is to have a GUC
> which stores the shared module and function names for WAL compression.
> But that seems like quite a different mechanism from the one present
> in this patch.
> 
> Taking into account all of the above, I think we should give up on custom
> WAL compression methods.  Or, at least, consider them unrelated to this
> patch.

I agree with these points. I also think this should be done in another
patch. It's not hard to implement, but it would only make sense if there
were a few more builtin compression methods suitable for WAL compression.
A static array could then hold function pointers for direct calls.
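
A rough sketch of what such a static array might look like is below. All
names here (WalCompressionMethod, wal_compression_methods,
wal_compression_lookup, the codec signature) are made up for illustration
and are not part of the patch or of PostgreSQL. The toy RLE codec only
stands in for a real builtin method such as pglz.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical codec signature: returns the number of bytes written to dst,
 * or -1 if the input is invalid or the result does not fit in dst_cap.
 */
typedef int32_t (*wal_codec_fn) (const char *src, int32_t src_len,
								 char *dst, int32_t dst_cap);

typedef struct WalCompressionMethod
{
	const char *name;
	wal_codec_fn compress;
	wal_codec_fn decompress;
} WalCompressionMethod;

/* "none": plain copy, used for both directions. */
static int32_t
copy_codec(const char *src, int32_t src_len, char *dst, int32_t dst_cap)
{
	assert(src != NULL && dst != NULL);
	if (src_len < 0 || src_len > dst_cap)
		return -1;
	memcpy(dst, src, (size_t) src_len);
	return src_len;
}

/* Toy run-length encoder: emits (run length, byte) pairs, runs up to 255. */
static int32_t
rle_compress(const char *src, int32_t src_len, char *dst, int32_t dst_cap)
{
	int32_t		in = 0;
	int32_t		out = 0;

	while (in < src_len)
	{
		int32_t		run = 1;

		while (in + run < src_len && src[in + run] == src[in] && run < 255)
			run++;
		if (out + 2 > dst_cap)
			return -1;
		dst[out++] = (char) (unsigned char) run;
		dst[out++] = src[in];
		in += run;
	}
	return out;
}

/* Inverse of rle_compress; rejects malformed input instead of crashing. */
static int32_t
rle_decompress(const char *src, int32_t src_len, char *dst, int32_t dst_cap)
{
	int32_t		out = 0;

	if (src_len < 0 || src_len % 2 != 0)
		return -1;
	for (int32_t in = 0; in < src_len; in += 2)
	{
		int32_t		run = (unsigned char) src[in];

		if (run == 0 || out + run > dst_cap)
			return -1;
		memset(dst + out, src[in + 1], (size_t) run);
		out += run;
	}
	return out;
}

/* The method id would be stored in the WAL record, not in the catalog. */
enum
{
	WAL_COMPRESSION_NONE = 0,
	WAL_COMPRESSION_RLE,
	WAL_COMPRESSION_COUNT
};

static const WalCompressionMethod wal_compression_methods[WAL_COMPRESSION_COUNT] = {
	[WAL_COMPRESSION_NONE] = {"none", copy_codec, copy_codec},
	[WAL_COMPRESSION_RLE] = {"rle", rle_compress, rle_decompress},
};

/* Returns NULL for an unknown id so the caller can report an error. */
static const WalCompressionMethod *
wal_compression_lookup(int id)
{
	if (id < 0 || id >= WAL_COMPRESSION_COUNT)
		return NULL;
	return &wal_compression_methods[id];
}
```

Since the table is compiled in, recovery could resolve a method id from
a WAL record without touching the system catalog, which sidesteps
Alexander's point 3 for builtin methods.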

-- 
---
Ildus Kurbangaliev
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

