Re: Compress ReorderBuffer spill files using LZ4 - Mailing list pgsql-hackers
From: Tomas Vondra
Subject: Re: Compress ReorderBuffer spill files using LZ4
Msg-id: 87060757-4936-4e3b-8d49-418f71d07eb3@enterprisedb.com
In response to: Re: Compress ReorderBuffer spill files using LZ4 (Julien Tachoires <julmon@gmail.com>)
Responses:
  Re: Compress ReorderBuffer spill files using LZ4
  Re: Compress ReorderBuffer spill files using LZ4
List: pgsql-hackers
On 7/15/24 20:50, Julien Tachoires wrote:
> Hi,
>
> On Fri, Jun 7, 2024 at 06:18, Julien Tachoires <julmon@gmail.com> wrote:
>>
>> On Fri, Jun 7, 2024 at 05:59, Tomas Vondra
>> <tomas.vondra@enterprisedb.com> wrote:
>>>
>>> On 6/6/24 12:58, Julien Tachoires wrote:
>>>> ...
>>>>
>>>> When compiled with LZ4 support (--with-lz4), this patch enables data
>>>> compression/decompression of these temporary files. Each transaction
>>>> change that must be written on disk (ReorderBufferDiskChange) is now
>>>> compressed and encapsulated in a new structure.
>>>>
>>>
>>> I'm a bit confused, but why tie this to having lz4? Why shouldn't this
>>> be supported even for pglz, or whatever algorithms we add in the future?
>>
>> That's right, reworking this patch in that sense.
>
> Please find a new version of this patch adding support for LZ4, pglz
> and ZSTD. It introduces the new GUC logical_decoding_spill_compression
> which is used to set the compression method. In order to stay aligned
> with the other server-side GUCs related to compression methods
> (wal_compression, default_toast_compression), the compression level is
> not exposed to users.
>

Sounds reasonable. I wonder if it might be useful to allow specifying the
compression level in those places, but that's clearly not something this
patch needs to do.

> The last patch of this set is still WIP; it adds the machinery
> required for setting the compression method as a subscription option:
>
> CREATE SUBSCRIPTION ... WITH (spill_compression = ...);
>
> I think there is a major problem with this approach: the logical
> decoding context is tied to one replication slot, but multiple
> subscriptions can use the same replication slot. How should this work
> if 2 subscriptions want to use the same replication slot but different
> compression methods?
>

Do we really support multiple subscriptions sharing the same slot? I don't
think we do, but maybe I'm missing something.
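The envelope scheme described upthread (each serialized change compressed and wrapped in a new on-disk structure before spilling) can be sketched roughly as follows. This is Python with stdlib zlib as a stand-in for pglz/LZ4/ZSTD, and the header layout and function names are hypothetical, not the patch's actual code:

```python
import struct
import zlib

def spill_change(raw_change: bytes) -> bytes:
    """Compress one serialized change and prepend a small header
    recording the raw and compressed sizes, so the reader knows how
    much to read back and decompress. (Analogous in spirit to wrapping
    ReorderBufferDiskChange in a compressed envelope; layout is made up.)"""
    compressed = zlib.compress(raw_change)
    # Hypothetical header: 4-byte raw size + 4-byte compressed size.
    header = struct.pack("!II", len(raw_change), len(compressed))
    return header + compressed

def restore_change(spilled: bytes) -> bytes:
    """Invert spill_change: read the header, then decompress."""
    raw_size, comp_size = struct.unpack("!II", spilled[:8])
    raw = zlib.decompress(spilled[8:8 + comp_size])
    assert len(raw) == raw_size
    return raw
```

Storing both sizes up front is what lets the reader allocate the decompression buffer before touching the payload, which is the usual reason for this kind of framing.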
> At this point, compression is only available for the changes spilled
> on disk. It is still not clear to me if the compression of data
> transiting through the streaming protocol should be addressed by this
> patch set or by another one. Thoughts?
>

I'd stick to only compressing the data spilled to disk. It might be useful
to compress the streamed data too, but why shouldn't we compress the
regular (non-streamed) transactions too? Yeah, it's more efficient to
compress larger chunks, but we can fit quite large transactions into
logical_decoding_work_mem without spilling.

FWIW I'd expect that to be handled at the libpq level - there's already a
patch for that, but I haven't checked if it would handle this.

But maybe more importantly, I think compressing streamed data might need
to handle some sort of negotiation of the compression algorithm, which
seems fairly complex.

To conclude, I'd leave this out of scope for this patch.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
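The point above about chunk size (larger chunks compress better than many small ones) is easy to confirm with a quick, purely illustrative experiment using stdlib zlib; the payloads are made up:

```python
import zlib

# A batch of small serialized "changes" with the kind of redundancy
# real tuple data tends to have.
changes = [f"update tuple {i} with some repeated payload".encode()
           for i in range(1000)]

# Compressing each change individually pays per-message overhead and
# gives the compressor no cross-change redundancy to exploit...
per_change = sum(len(zlib.compress(c)) for c in changes)

# ...while compressing the whole batch at once (as when a spilled
# transaction is written in bulk) shares one dictionary across all of it.
batched = len(zlib.compress(b"".join(changes)))

print(per_change, batched)
```

On data like this the batched size comes out far smaller than the sum of the individually compressed changes, which is why per-message compression of a streamed protocol is less attractive than compressing whole spill files.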