Re: Compress ReorderBuffer spill files using LZ4 - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: Compress ReorderBuffer spill files using LZ4
Date
Msg-id CAFiTN-sQ04eVFHSS61P_o_ni5YzanpUitMnesgRYS+jQq1OQXA@mail.gmail.com
In response to Re: Compress ReorderBuffer spill files using LZ4 (Alvaro Herrera <alvherre@alvh.no-ip.org>)
List pgsql-hackers
On Thu, Jun 6, 2024 at 7:54 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> On 2024-Jun-06, Amit Kapila wrote:
>
> > On Thu, Jun 6, 2024 at 4:28 PM Julien Tachoires <julmon@gmail.com> wrote:
> > >
> > > When the content of a large transaction (size exceeding
> > > logical_decoding_work_mem) and its sub-transactions has to be
> > > reordered during logical decoding, then, all the changes are written
> > > on disk in temporary files located in pg_replslot/<slot_name>.
> > > Decoding very large transactions by multiple replication slots can
> > > lead to disk space saturation and high I/O utilization.
>
> I like the general idea of compressing the output of logical decoding.
> It's not so clear to me that we only want to do so for spilling to disk;
> for instance, if the two nodes communicate over a slow network, it may
> even be beneficial to compress when streaming, so to this question:
>
> > Why can't one use 'streaming' option to send changes to the client
> > once it reaches the configured limit of 'logical_decoding_work_mem'?
>
> I would say that streaming doesn't necessarily have to mean we don't
> want compression, because for some users it might be beneficial.

+1

> I think a GUC would be a good idea.  Also, what if for whatever reason
> you want a different compression algorithm or different compression
> parameters?  Looking at the existing compression UI we offer in
> pg_basebackup, perhaps you could add something like this:
>
> compress_logical_decoding = none
> compress_logical_decoding = lz4:42
> compress_logical_decoding = spill-zstd:99
>
> "none" says to never use compression (perhaps should be the default),
> "lz4:42" says to use lz4 with parameters 42 on both spilling and
> streaming, and "spill-zstd:99" says to use Zstd with parameter 99 but
> only for spilling to disk.
>
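
To make the value syntax proposed above concrete, here is a minimal sketch of how such a setting could be parsed. It is only an illustration, not code from any patch: compress_logical_decoding is the name suggested in the quoted text and not an existing GUC, and the DecodingCompression type, the parse_compress_logical_decoding function, the set of accepted algorithms, and the reading of the parameter as a plain integer are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: only these two algorithms are accepted in this sketch.
ALGORITHMS = {"lz4", "zstd"}


@dataclass(frozen=True)
class DecodingCompression:
    """Hypothetical parsed form of a compress_logical_decoding value."""
    algorithm: Optional[str]  # None means no compression
    parameter: Optional[int]  # algorithm-specific parameter, if given
    spill: bool               # compress changes spilled to disk
    stream: bool              # compress changes streamed downstream


def parse_compress_logical_decoding(value: str) -> DecodingCompression:
    """Parse 'none', 'ALGO[:N]' or 'spill-ALGO[:N]' (hypothetical syntax)."""
    text = value.strip().lower()
    if text == "none":
        return DecodingCompression(None, None, False, False)

    # A "spill-" prefix restricts compression to spilling to disk.
    spill_only = text.startswith("spill-")
    if spill_only:
        text = text[len("spill-"):]

    name, sep, param = text.partition(":")
    if name not in ALGORITHMS:
        raise ValueError(f"unrecognized compression algorithm: {name!r}")

    parameter = None
    if sep:
        # Assumption: the parameter is a non-negative decimal integer.
        if not (param.isascii() and param.isdigit()):
            raise ValueError(f"invalid compression parameter: {param!r}")
        parameter = int(param)

    return DecodingCompression(name, parameter, True, not spill_only)
```

With this sketch, "lz4:42" enables compression for both spilling and streaming, while "spill-zstd:99" enables it only for spilling, matching the three examples in the quoted proposal.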

I think the compression option should be supported at the CREATE
SUBSCRIPTION level instead of being controlled by a GUC. This way, we
can decide on compression for each subscription individually rather
than applying it to all subscribers. It makes more sense for the
subscriber to control this, especially when we are planning to
compress the data sent downstream.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


