Dirty buffers with suppress_redundant_updates_trigger - Mailing list pgsql-general
From | Mike Noordermeer |
---|---|
Subject | Dirty buffers with suppress_redundant_updates_trigger |
Date | |
Msg-id | CAF0ozquGxjovEz7T5SgrRBnh_ummfVNqBV1mMyX3Ea15At6-JQ@mail.gmail.com Whole thread Raw |
Responses |
Re: Dirty buffers with suppress_redundant_updates_trigger
|
List | pgsql-general |
Hi, I am currently working on a data sync solution, where data is synced from external systems to Postgres. As I do not know what data changed in the source data store, I have opted for the following flow: - Load all data in a temp table - Update existing rows main table based on contents of temp table - Insert new rows main table - Delete obsolete rows main table Obviously, I want to minimize WAL logging and data writes. Therefore I have added suppress_redundant_updates_trigger to the main table. Nevertheless, I am still seeing pages being dirtied and written to the WAL, even if no rows changed. Is there anyone that could explain why this is happening? And does anyone know of better ways to minimize writes in such a 'data load/refresh' scenario, where data is the same in 99.9% of the cases? Small test case on Postgres 12.4 below, as you can see it dirties 8334 blocksk: testdatabase=# create table testtable (id int not null, name varchar(32), primary key(id)); CREATE TABLE testdatabase=# create temp table testtable_temp (id int not null, name varchar(32), primary key(id)); CREATE TABLE testdatabase=# insert into testtable (id, name) select i, md5(i::text) from generate_series(1,1000000) s(i); INSERT 0 1000000 testdatabase=# insert into testtable_temp select * from testtable; INSERT 0 1000000 testdatabase=# create trigger z_min_update before update on testtable for each row execute procedure suppress_redundant_updates_trigger(); CREATE TRIGGER testdatabase=# checkpoint; CHECKPOINT testdatabase=# explain (analyze, buffers) update testtable as d set name = s.name from testtable_temp as s where d.id = s.id; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------- Update on testtable d (cost=30315.56..69708.58 rows=591714 width=98) (actual time=4168.901..4168.903 rows=0 loops=1) Buffers: shared hit=1008337 dirtied=8334, local read=8334 dirtied=8334 written=8331, temp read=11144 written=11144 -> Hash Join (cost=30315.56..69708.58 rows=591714 width=98) (actual time=486.771..1429.637 rows=1000000 loops=1) Hash Cond: (d.id = s.id) Buffers: shared hit=8337, local read=8334 dirtied=8334 written=8331, temp read=11144 written=11144 -> Seq Scan on testtable d (cost=0.00..18334.00 rows=1000000 width=10) (actual time=0.026..157.729 rows=1000000 loops=1) Buffers: shared hit=8334 -> Hash (cost=14251.14..14251.14 rows=591714 width=92) (actual time=486.128..486.129 rows=1000000 loops=1) Buckets: 32768 Batches: 32 Memory Usage: 2583kB Buffers: local read=8334 dirtied=8334 written=8331, temp written=7549 -> Seq Scan on testtable_temp s (cost=0.00..14251.14 rows=591714 width=92) (actual time=0.035..226.642 rows=1000000 loops=1) Buffers: local read=8334 dirtied=8334 written=8331 Planning Time: 0.429 ms Trigger z_min_update: time=57.069 calls=1000000 Execution Time: 4174.785 ms (15 rows) Thanks, Mike
pgsql-general by date: