Compress ReorderBuffer spill files using LZ4 - Mailing list pgsql-hackers

From Julien Tachoires
Subject Compress ReorderBuffer spill files using LZ4
Msg-id CAFEQCbEQt-MkOFC8zzxEYovFGyHTzc4FgMs9SEwOSKSB8LaCVQ@mail.gmail.com
List pgsql-hackers
Hi,

When a large transaction (one whose size exceeds
logical_decoding_work_mem) and its sub-transactions have to be
reordered during logical decoding, all of their changes are written
to disk in temporary files located in pg_replslot/<slot_name>.
When several replication slots decode very large transactions, this
can lead to disk space saturation and high I/O utilization.

When PostgreSQL is compiled with LZ4 support (--with-lz4), this patch
enables compression and decompression of these temporary files. Each
transaction change that must be written to disk
(ReorderBufferDiskChange) is now compressed and encapsulated in a new
structure.

Three compression strategies are implemented:

1. LZ4 streaming compression is the preferred one and works
   efficiently for small individual changes.
2. LZ4 regular compression is used when a change is too large for
   the streaming API.
3. No compression: when compression fails, the change is stored
   uncompressed.
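
The fallback order above can be sketched as follows. This is only an
illustration and not the patch's code: it is written in Python and uses
the standard-library zlib module as a stand-in for LZ4, and the names
Strategy, STREAMING_LIMIT, SpillWriter and SpillReader, as well as the
threshold value, are hypothetical.

```python
import zlib
from enum import Enum


class Strategy(Enum):
    STREAMING = 1  # compression context shared across consecutive changes
    REGULAR = 2    # one-shot compression of a single large change
    NONE = 3       # change stored as-is


# Hypothetical threshold above which a change is treated as too large
# for the streaming path. The real limit is not stated in this mail.
STREAMING_LIMIT = 64 * 1024


class SpillWriter:
    """Picks a strategy per change; zlib stands in for LZ4 here."""

    def __init__(self):
        self._stream = zlib.compressobj()

    def encode(self, change: bytes):
        try:
            if len(change) <= STREAMING_LIMIT:
                # Sync flush so the reader can decode this change right away.
                out = self._stream.compress(change)
                out += self._stream.flush(zlib.Z_SYNC_FLUSH)
                return Strategy.STREAMING, out
            return Strategy.REGULAR, zlib.compress(change)
        except zlib.error:
            # Compression failed: fall back to storing the raw bytes.
            return Strategy.NONE, change


class SpillReader:
    """Decodes changes in the same order the writer produced them."""

    def __init__(self):
        self._stream = zlib.decompressobj()

    def decode(self, strategy: Strategy, payload: bytes) -> bytes:
        if strategy is Strategy.STREAMING:
            return self._stream.decompress(payload)
        if strategy is Strategy.REGULAR:
            return zlib.decompress(payload)
        return payload
```

Storing the chosen strategy next to each payload is what lets the reader
pick the matching decode path, which is presumably the role of the new
encapsulating structure mentioned above.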

When not using compression, the following case generates 1590MB of
spill files:

  CREATE TABLE t (i INTEGER PRIMARY KEY, t TEXT);
  INSERT INTO t
    SELECT i, 'Hello number n°'||i::TEXT
    FROM generate_series(1, 10000000) as i;

With LZ4 compression, the same case creates 653MB of spill files, a
58.9% reduction in disk space usage.
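
For reference, the reduction figure follows directly from the two sizes
quoted above. A quick check in Python (the function name is only
illustrative):

```python
def space_saved_percent(uncompressed_mb: float, compressed_mb: float) -> float:
    """Percentage of disk space saved relative to the uncompressed size."""
    return (1 - compressed_mb / uncompressed_mb) * 100


saved = space_saved_percent(1590, 653)
print(f"{saved:.1f}% less disk space")  # prints "58.9% less disk space"
```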

Open items:

1. The spill_bytes column from pg_stat_get_replication_slot() still
reports the uncompressed data size, not the compressed data size.
Should we expose the compressed data size when compression occurs?

2. Do we want a GUC to switch compression on/off?

Regards,

JT

