On Thu, Jun 6, 2024 at 7:54 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> On 2024-Jun-06, Amit Kapila wrote:
>
> > On Thu, Jun 6, 2024 at 4:28 PM Julien Tachoires <julmon@gmail.com> wrote:
> > >
> > > When the content of a large transaction (size exceeding
> > > logical_decoding_work_mem) and its sub-transactions has to be
> > > reordered during logical decoding, then, all the changes are written
> > > on disk in temporary files located in pg_replslot/<slot_name>.
> > > Decoding very large transactions by multiple replication slots can
> > > lead to disk space saturation and high I/O utilization.
>
> I like the general idea of compressing the output of logical decoding.
> It's not so clear to me that we only want to do so for spilling to disk;
> for instance, if the two nodes communicate over a slow network, it may
> even be beneficial to compress when streaming, so to this question:
>
> > Why can't one use 'streaming' option to send changes to the client
> > once it reaches the configured limit of 'logical_decoding_work_mem'?
>
> I would say that streaming doesn't necessarily have to mean we don't
> want compression, because for some users it might be beneficial.
+1
> I think a GUC would be a good idea. Also, what if for whatever reason
> you want a different compression algorithm or different compression
> parameters? Looking at the existing compression UI we offer in
> pg_basebackup, perhaps you could add something like this:
>
> compress_logical_decoding = none
> compress_logical_decoding = lz4:42
> compress_logical_decoding = spill-zstd:99
>
> "none" says to never use compression (perhaps should be the default),
> "lz4:42" says to use lz4 with parameters 42 on both spilling and
> streaming, and "spill-zstd:99" says to use Zstd with parameter 99 but
> only for spilling to disk.
>
I think the compression option should be supported at the CREATE
SUBSCRIPTION level instead of being controlled by a GUC. This way, we
can decide on compression for each subscription individually rather
than applying it to all subscribers. It makes more sense for the
subscriber to control this, especially when we are planning to
compress the data sent downstream.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com