Re: Patch: Write Amplification Reduction Method (WARM) - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Patch: Write Amplification Reduction Method (WARM)
Date
Msg-id CA+TgmoY4MeTNEx13z24_dwLyjz+ZodO0kR2hTfU3Ws8OH5g=Hg@mail.gmail.com
In response to Re: Patch: Write Amplification Reduction Method (WARM)  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Wed, Mar 29, 2017 at 7:12 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> No as I agreed above, it won't double-compress, but still looks
> slightly risky to rely on different set of values passed to
> index_form_tuple and then compare them.

It assumes that the compressor is completely deterministic, which I'm
fairly sure is true today, but might be false in the future.  For example:

https://groups.google.com/forum/#!topic/snappy-compression/W8v_ydnEPuc

We've talked about using snappy as a compression algorithm before, and
if the above post is correct, an upgrade to the snappy library version
is an example of a change that would break the assumption in question.
I think it's generally true for almost any modern compression
algorithm (including pglz) that there are multiple compressed texts
that would decompress to the same uncompressed text.  Any algorithm is
required to promise that it will always produce one of the compressed
texts that decompress back to the original, but not necessarily that
it will always produce the same one.

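As another example of this, consider that zlib (gzip) has a variety of
options to control compression behavior, such as, most obviously, the
compression level (1 .. 9).  The same input compressed with different
settings can produce different compressed output, all of which
decompress to the same text.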
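
To make the point concrete, here is a minimal sketch using Python's
standard zlib module.  It is only an illustration of the general
property, not PostgreSQL's pglz and not code from the patch; the helper
names (compressed_forms, same_value) and the sample input are
hypothetical.

```python
import zlib


def compressed_forms(data: bytes, levels=(1, 6, 9)) -> dict:
    """Compress the same input at several zlib compression levels."""
    return {level: zlib.compress(data, level) for level in levels}


def same_value(a: bytes, b: bytes) -> bool:
    """Compare two zlib streams by what they decompress to,
    rather than by their compressed bytes."""
    return zlib.decompress(a) == zlib.decompress(b)


# Hypothetical sample input; any reasonably compressible bytes would do.
data = b"write amplification reduction method " * 64
forms = compressed_forms(data)

# Every compressed form decompresses back to the original input.
round_trips = all(zlib.decompress(c) == data for c in forms.values())

# The compressed bytes themselves differ between levels 1 and 9
# (the zlib stream header alone records a different level hint).
bytes_equal = forms[1] == forms[9]

# Comparing the decompressed values gives the answer we actually want.
values_equal = same_value(forms[1], forms[9])

print("round trips:", round_trips)
print("compressed bytes equal:", bytes_equal)
print("decompressed values equal:", values_equal)
```

A byte-wise comparison of the compressed forms would report these two
values as different even though they are the same datum, which is the
risk in comparing the output of two separate index_form_tuple calls
if the compressor ever stops being deterministic.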

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


