Re: [HACKERS] Custom compression methods - Mailing list pgsql-hackers
From | Ildus Kurbangaliev |
---|---|
Subject | Re: [HACKERS] Custom compression methods |
Date | |
Msg-id | 20180423121909.60b2b7f7@wp.localdomain |
In response to | Re: [HACKERS] Custom compression methods (Alexander Korotkov <a.korotkov@postgrespro.ru>) |
List | pgsql-hackers |
On Sun, 22 Apr 2018 16:21:31 +0300
Alexander Korotkov <a.korotkov@postgrespro.ru> wrote:

> On Fri, Apr 20, 2018 at 7:45 PM, Konstantin Knizhnik
> <k.knizhnik@postgrespro.ru> wrote:
>
> > On 30.03.2018 19:50, Ildus Kurbangaliev wrote:
> >
> > > On Mon, 26 Mar 2018 20:38:25 +0300
> > > Ildus Kurbangaliev <i.kurbangaliev@postgrespro.ru> wrote:
> > >
> > > > Attached rebased version of the patch. Fixed conflicts in
> > > > pg_class.h.
> > >
> > > New rebased version due to conflicts in master. Also fixed a few
> > > errors and removed the cmdrop method since it couldn't be tested.
> >
> > It seems to be useful (and not so difficult) to use custom
> > compression methods also for WAL compression: replace direct calls
> > of pglz_compress in xloginsert.c
>
> I'm going to object to this point, and I have the following
> arguments for that:
>
> 1) WAL compression is much more critical for durability than datatype
> compression. Imagine a compression algorithm contains a bug which
> causes the decompress method to segfault. In the case of datatype
> compression, that would cause a crash on access to the affected
> value, but the rest of the database would keep working, giving you
> a chance to localize the issue and investigate it. In the case of
> WAL compression, recovery would cause a server crash. That seems
> to be a much more serious disaster. You wouldn't be able to get
> your database up and running, and the same happens on the standby.
>
> 2) The idea of custom compression methods is that some columns may
> have a specific data distribution, which could be handled better
> with a particular compression method and particular parameters. In
> WAL compression you're dealing with the whole WAL stream containing
> all the values from the database cluster. Moreover, if custom
> compression methods are defined for columns, then in the WAL stream
> you have values already compressed in the most efficient way.
> However, it might turn out that some compression method is better
> for WAL in the general case (there are benchmarks showing our pglz
> is not very good in comparison to the alternatives). But in this
> case I would prefer to just switch our WAL to a different
> compression method one day. Thankfully we don't preserve WAL
> compatibility between major releases.
>
> 3) This patch provides custom compression methods recorded in
> the catalog. During recovery you don't have access to the system
> catalog, because it's not recovered yet, and can't fetch compression
> method metadata from there. One possibility is to have a GUC which
> stores the shared module and function names for WAL compression.
> But that seems like quite a different mechanism from the one present
> in this patch.
>
> Taking into account all of the above, I think we should give up on
> custom WAL compression methods. Or, at least, consider it unrelated
> to this patch.

I agree with these points. I also think this should be done in another
patch. It's not so hard to implement, but it would make sense if there
were a few more builtin compression methods suitable for WAL
compression. A static array could then contain function pointers for
direct calls.

--
---
Ildus Kurbangaliev
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
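
To illustrate the static-array idea mentioned in the reply, here is a minimal C sketch. Every name in it (WalCompressionId, WalCompressionMethod, wal_compression_methods, wal_compress, wal_decompress) is hypothetical and is not taken from PostgreSQL or from the patch. The toy run-length codec only stands in for a real builtin algorithm such as pglz. The point is that the table is compiled into the binary, so dispatch needs no catalog lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical identifiers for builtin methods; not PostgreSQL names. */
typedef enum WalCompressionId
{
    WAL_COMPRESSION_NONE = 0,
    WAL_COMPRESSION_RLE,        /* toy stand-in for a real algorithm */
    WAL_COMPRESSION_COUNT
} WalCompressionId;

/* Both directions share one signature: bytes written, or -1 on failure. */
typedef long (*wal_codec_fn) (const unsigned char *src, size_t srclen,
                              unsigned char *dst, size_t dstcap);

typedef struct WalCompressionMethod
{
    const char   *name;
    wal_codec_fn  compress;
    wal_codec_fn  decompress;
} WalCompressionMethod;

/* "none": plain copy, used for both compress and decompress. */
static long
copy_codec(const unsigned char *src, size_t srclen,
           unsigned char *dst, size_t dstcap)
{
    if (srclen > dstcap)
        return -1;
    memcpy(dst, src, srclen);
    return (long) srclen;
}

/* Toy RLE: emits (run length, byte) pairs, runs capped at 255. */
static long
rle_compress(const unsigned char *src, size_t srclen,
             unsigned char *dst, size_t dstcap)
{
    size_t in = 0, out = 0;

    while (in < srclen)
    {
        unsigned char byte = src[in];
        size_t run = 1;

        while (in + run < srclen && src[in + run] == byte && run < 255)
            run++;
        if (out + 2 > dstcap)
            return -1;
        dst[out++] = (unsigned char) run;
        dst[out++] = byte;
        in += run;
    }
    return (long) out;
}

/* Inverse of rle_compress; rejects malformed or oversized input. */
static long
rle_decompress(const unsigned char *src, size_t srclen,
               unsigned char *dst, size_t dstcap)
{
    size_t in = 0, out = 0;

    if (srclen % 2 != 0)
        return -1;
    for (; in < srclen; in += 2)
    {
        size_t run = src[in];

        if (run == 0 || out + run > dstcap)
            return -1;
        memset(dst + out, src[in + 1], run);
        out += run;
    }
    return (long) out;
}

/* The static array: indexed by method id, no catalog access needed. */
static const WalCompressionMethod wal_compression_methods[WAL_COMPRESSION_COUNT] = {
    [WAL_COMPRESSION_NONE] = {"none", copy_codec, copy_codec},
    [WAL_COMPRESSION_RLE] = {"rle", rle_compress, rle_decompress},
};

static long
wal_compress(WalCompressionId id, const unsigned char *src, size_t srclen,
             unsigned char *dst, size_t dstcap)
{
    if ((unsigned) id >= (unsigned) WAL_COMPRESSION_COUNT)
        return -1;              /* unknown method id */
    assert(wal_compression_methods[id].compress != NULL);
    return wal_compression_methods[id].compress(src, srclen, dst, dstcap);
}

static long
wal_decompress(WalCompressionId id, const unsigned char *src, size_t srclen,
               unsigned char *dst, size_t dstcap)
{
    if ((unsigned) id >= (unsigned) WAL_COMPRESSION_COUNT)
        return -1;              /* unknown method id */
    assert(wal_compression_methods[id].decompress != NULL);
    return wal_compression_methods[id].decompress(src, srclen, dst, dstcap);
}
```

A caller would pick the id from a setting and pass it to wal_compress() or wal_decompress(). Because the table is fixed at build time, this would sidestep the recovery-time catalog problem raised in point 3, at the cost of supporting only builtin methods.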