Hi Floris,
On Jul 16, 2024, at 19:47, Floris Van Nee <florisvannee@Optiver.com> wrote:
Hi hackers,
I'm looking for some input on an issue I've observed. A common pattern
I've seen is to stage data in temporary tables before updating the
real tables. Something roughly like:
On session start:
CREATE TEMP TABLE temp_t1 (...) ON COMMIT DELETE ROWS;
On update:
BEGIN;
COPY temp_t1 FROM STDIN (FORMAT BINARY);
INSERT INTO t1 (SELECT * FROM temp_t1 ...) ON CONFLICT DO UPDATE SET ...;
-- potentially some other operations on temp table to put data into real table t1
COMMIT;
This pattern starts to break down under high concurrency. The "ON COMMIT
DELETE ROWS" clause performs a truncate at every commit, which is fairly
expensive and doesn't scale well. The degradation is especially noticeable
under the following circumstances:
- high max_connections setting
- high number of temp tables per session
- concurrent writers at fairly short intervals
The impact shows up both as reduced TPS on the primary and as an
overloaded WAL replay process on the replica (100% CPU even though not
much WAL is being generated).
A very simple pgbench example showcases the degradation (measured
with max_connections=2000 to show it clearly).
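For anyone who wants to reproduce something similar, a minimal sketch of such
a pgbench custom script could look like the following (the t1 schema, file
name, and row count are made up for illustration, and an INSERT from
generate_series stands in for the COPY FROM STDIN, which a pgbench script
cannot stream):

-- one-time setup outside the script (assumed schema):
--   CREATE TABLE t1 (id int PRIMARY KEY, val int);
-- temp_pattern.sql: pgbench keeps one session per client, so the temp table
-- is created on the first iteration; IF NOT EXISTS covers later iterations
CREATE TEMP TABLE IF NOT EXISTS temp_t1 (id int, val int) ON COMMIT DELETE ROWS;
BEGIN;
INSERT INTO temp_t1 SELECT g, g FROM generate_series(1, 100) g;
INSERT INTO t1 SELECT * FROM temp_t1
    ON CONFLICT (id) DO UPDATE SET val = EXCLUDED.val;
COMMIT;

Running something like "pgbench -n -c 100 -j 8 -T 60 -f temp_pattern.sql"
against a server configured with a high max_connections exercises the
commit-time truncate on every transaction.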
I also encountered a similar performance issue with temporary tables
and provided a patch in [1] that optimizes the truncate performance
during commit.
Additionally, is it possible to lower the lock level held while truncating
temporary tables?
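For reference, an explicit TRUNCATE takes an ACCESS EXCLUSIVE lock even on a
temporary table, and as far as I can tell the implicit ON COMMIT DELETE ROWS
truncate acquires the same lock level. A quick way to observe it, assuming the
temp_t1 table from the example above:

BEGIN;
TRUNCATE temp_t1;
-- expect a row for temp_t1 with mode = AccessExclusiveLock
SELECT relation::regclass AS rel, mode, granted
FROM pg_locks
WHERE relation = 'temp_t1'::regclass;
COMMIT;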
[1] https://www.postgresql.org/message-id/flat/tencent_924E990F0493010E2C8404A5D677C70C9707%40qq.com
Best Regards,
Fei Changhong