Re: Compress ReorderBuffer spill files using LZ4 - Mailing list pgsql-hackers

From Julien Tachoires
Subject Re: Compress ReorderBuffer spill files using LZ4
Msg-id CAFEQCbEXtb3vPnwdbaSASx+tBLX8qvXsC08qvL_EMiczV2=vow@mail.gmail.com
In response to Re: Compress ReorderBuffer spill files using LZ4  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Compress ReorderBuffer spill files using LZ4
List pgsql-hackers
On Thu, Jun 6, 2024 at 04:13, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Jun 6, 2024 at 4:28 PM Julien Tachoires <julmon@gmail.com> wrote:
> >
> > When the content of a large transaction (size exceeding
> > logical_decoding_work_mem) and its sub-transactions has to be
> > reordered during logical decoding, then, all the changes are written
> > on disk in temporary files located in pg_replslot/<slot_name>.
> > Decoding very large transactions by multiple replication slots can
> > lead to disk space saturation and high I/O utilization.
> >
>
> Why can't one use 'streaming' option to send changes to the client
> once it reaches the configured limit of 'logical_decoding_work_mem'?

That's right: setting the subscription option 'streaming' to 'on' moves
the problem from the publisher to the subscribers. This patch tries to
improve the default case, where 'streaming' is set to 'off'.

> > 2. Do we want a GUC to switch compression on/off?
> >
>
> It depends on the overhead of decoding. Did you try to measure the
> decoding overhead of decompression when reading compressed files?

A quick benchmark run on my laptop shows roughly 1% overhead.

Table DDL:
CREATE TABLE t (i INTEGER PRIMARY KEY, t TEXT);

Data generated with:
INSERT INTO t SELECT i, 'Text number n°'||i::TEXT FROM
generate_series(1, 10000000) as i;

Restoration duration measured using timestamps of log messages:
"DEBUG:  restored XXXX/YYYY changes from disk"

HEAD: 25.54s, 25.94s, 25.516s, 26.267s, 26.11s / avg=25.874s
Patch: 26.872s, 26.311s, 25.753s, 26.003s, 25.843s / avg=26.156s
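
For anyone who wants to re-check the arithmetic, here is a minimal Python sketch that recomputes the two averages and the relative overhead from the five timings listed above. The helper name overhead_pct is only illustrative and is not part of the patch or of PostgreSQL; the overhead is assumed to be defined as the difference of the means divided by the HEAD mean.

```python
from statistics import mean

# Restore durations in seconds, copied from the figures above.
head = [25.54, 25.94, 25.516, 26.267, 26.11]
patch = [26.872, 26.311, 25.753, 26.003, 25.843]


def overhead_pct(baseline, candidate):
    """Relative slowdown of candidate vs. baseline, in percent of the baseline mean.

    Illustrative helper (hypothetical name), not taken from the patch.
    """
    base_avg = mean(baseline)
    return (mean(candidate) - base_avg) / base_avg * 100.0


head_avg = mean(head)
patch_avg = mean(patch)
overhead = overhead_pct(head, patch)

print(f"HEAD avg:  {head_avg:.3f}s")
print(f"Patch avg: {patch_avg:.3f}s")
print(f"Overhead:  {overhead:.2f}%")
```

With only five runs per side and overlapping ranges, the result is best read as "about 1%" rather than as a precise figure.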

Regards,

JT


