Re: Proposal: Adding compression of temporary files - Mailing list pgsql-hackers

From Filip Janus
Subject Re: Proposal: Adding compression of temporary files
Date
Msg-id CAFjYY+JDSpOQwYAfTQQ43=BA=d32XfcAdaPVJgHheV9fQBbLWg@mail.gmail.com
In response to Re: Proposal: Adding compression of temporary files  (Filip Janus <fjanus@redhat.com>)
Responses Re: Proposal: Adding compression of temporary files
List pgsql-hackers
Hi,
Thank you, Tomas, for the thorough and detailed review!
I'm posting an updated patch set incorporating the changes from your review.

Changes applied from review:
- Simplified BufFileCreateTemp interface
- Improved error handling in BufFileLoadBuffer/BufFileDumpBuffer
- Unified compression header format (CompressHeader struct)
- Added tuplestore integration (compression when EXEC_FLAG_BACKWARD is not required)
- Various code cleanups and comment improvements
Additional change (not from review):
- Switched from static shared buffer to per-file allocation. The shared buffer 
   provided a negligible performance benefit while keeping memory allocated for the backend's lifetime.
Future work:
- Support for additional compression methods (gzip, zstd)
- Random access and seek operations with compression


    -Filip-


On Tue, Jan 13, 2026 at 2:34 PM Filip Janus <fjanus@redhat.com> wrote:
Hi, 
Yes, it needs to be rebased. I am working on it. I will post it here soon.
 

    -Filip-


On Tue, Jan 13, 2026 at 1:51 PM lakshmi <lakshmigcdac@gmail.com> wrote:
Hi all,
I tried to replicate the temporary file compression issue by applying the two patches shared in the thread on current PostgreSQL master.
Here is what I observed:
1) Patch 1: 0001-Add-transparent-compression-for-temporary-files.patch
Applying the first patch fails due to context mismatches.

The failures I see are in the following files:
src/backend/storage/file/buffile.c
src/backend/utils/misc/guc_tables.c
src/backend/utils/misc/postgresql.conf.sample

2) The second patch, 0002-Add-regression-tests-for-temporary-file-compression.patch, applies successfully without any issues.

Does this mean that the implementation patch needs to be rebased or otherwise adjusted for the current codebase? If so, what would be the recommended way to proceed? Could you please suggest how I should apply the implementation patch in this case?
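A three-way apply can often recover from exactly this kind of context mismatch, because the patch records the blob it was generated against. Here is a self-contained sketch in a throwaway repository; the file and patch names are illustrative, not the thread's actual patches:

```shell
# Demo: a plain `git apply` fails on drifted context, but `--3way`
# merges against the preimage blob recorded in the patch.
set -e
workdir=$(mktemp -d)
cd "$workdir"
git init -q
git config user.email demo@example.com
git config user.name demo

# Base version of a file (stand-in for buffile.c).
printf 'alpha\nbeta\n' > buffile.c
git add buffile.c
git commit -qm base

# Capture a change as a patch, then revert the working tree.
printf 'alpha\nbeta\ngamma\n' > buffile.c
git diff > compress.patch
git checkout -q -- buffile.c

# Simulate upstream drift that invalidates the patch's context lines.
printf 'ALPHA\nbeta\n' > buffile.c
git add buffile.c
git commit -qm drift

git apply compress.patch 2>/dev/null || echo "plain apply failed"
git apply --3way compress.patch 2>/dev/null && echo "3-way apply succeeded"
cat buffile.c
```

If the three-way merge hits real conflicts it leaves conflict markers to resolve by hand, which is still usually less work than re-deriving the patch; that said, when the author has announced a rebase, waiting for the refreshed patch set is the safer path.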


regards
lakshmi

On Tue, Jan 13, 2026 at 5:01 PM Filip Janus <fjanus@redhat.com> wrote:
Rebase after changes introduced in guc_tables.c

    -Filip-


On Tue, Aug 19, 2025 at 5:48 PM Filip Janus <fjanus@redhat.com> wrote:
Fix overlooked compiler warnings 

    -Filip-


On Mon, Aug 18, 2025 at 6:51 PM Filip Janus <fjanus@redhat.com> wrote:
I rebased the patch set and fixed the issue causing those failures.

    -Filip-


On Tue, Jun 17, 2025 at 4:49 PM Andres Freund <andres@anarazel.de> wrote:
Hi,

On 2025-04-25 23:54:00 +0200, Filip Janus wrote:
> The latest rebase.

This often seems to fail during tests:
https://cirrus-ci.com/github/postgresql-cfbot/postgresql/cf%2F5382

E.g.
https://api.cirrus-ci.com/v1/artifact/task/4667337632120832/testrun/build-32/testrun/recovery/027_stream_regress/log/regress_log_027_stream_regress

=== dumping /tmp/cirrus-ci-build/build-32/testrun/recovery/027_stream_regress/data/regression.diffs ===
diff -U3 /tmp/cirrus-ci-build/src/test/regress/expected/join_hash_pglz.out /tmp/cirrus-ci-build/build-32/testrun/recovery/027_stream_regress/data/results/join_hash_pglz.out
--- /tmp/cirrus-ci-build/src/test/regress/expected/join_hash_pglz.out   2025-05-26 05:04:40.686524215 +0000
+++ /tmp/cirrus-ci-build/build-32/testrun/recovery/027_stream_regress/data/results/join_hash_pglz.out   2025-05-26 05:15:00.534907680 +0000
@@ -594,11 +594,8 @@
 select count(*) from join_foo
   left join (select b1.id, b1.t from join_bar b1 join join_bar b2 using (id)) ss
   on join_foo.id < ss.id + 1 and join_foo.id > ss.id - 1;
- count
--------
-     3
-(1 row)
-
+ERROR:  could not read from temporary file: read only 8180 of 1572860 bytes
+CONTEXT:  parallel worker
 select final > 1 as multibatch
   from hash_join_batches(
 $$
@@ -606,11 +603,7 @@
     left join (select b1.id, b1.t from join_bar b1 join join_bar b2 using (id)) ss
     on join_foo.id < ss.id + 1 and join_foo.id > ss.id - 1;
 $$);
- multibatch
-------------
- t
-(1 row)
-
+ERROR:  current transaction is aborted, commands ignored until end of transaction block
 rollback to settings;
 -- single-batch with rescan, parallel-oblivious
 savepoint settings;


Greetings,

Andres

