Dirty buffers with suppress_redundant_updates_trigger - Mailing list pgsql-general

From Mike Noordermeer
Subject Dirty buffers with suppress_redundant_updates_trigger
Date
Msg-id CAF0ozquGxjovEz7T5SgrRBnh_ummfVNqBV1mMyX3Ea15At6-JQ@mail.gmail.com
Whole thread Raw
Responses Re: Dirty buffers with suppress_redundant_updates_trigger
List pgsql-general
Hi,

I am currently working on a data sync solution, where data is synced
from external systems to Postgres. As I do not know what data changed
in the source data store, I have opted for the following flow:

- Load all data in a temp table
- Update existing rows main table based on contents of temp table
- Insert new rows main table
- Delete obsolete rows main table

Obviously, I want to minimize WAL logging and data writes. Therefore I
have added suppress_redundant_updates_trigger to the main table.

Nevertheless, I am still seeing pages being dirtied and written to the
WAL, even if no rows changed. Is there anyone that could explain why
this is happening? And does anyone know of better ways to minimize
writes in such a 'data load/refresh' scenario, where data is the same
in 99.9% of the cases?

Small test case on Postgres 12.4 below, as you can see it dirties 8334 blocksk:

testdatabase=# create table testtable (id int not null, name
varchar(32), primary key(id));
CREATE TABLE
testdatabase=# create temp table testtable_temp (id int not null, name
varchar(32), primary key(id));
CREATE TABLE
testdatabase=# insert into testtable (id, name) select i, md5(i::text)
from generate_series(1,1000000) s(i);
INSERT 0 1000000
testdatabase=# insert into testtable_temp select * from testtable;
INSERT 0 1000000
testdatabase=# create trigger z_min_update before update on testtable
for each row execute procedure suppress_redundant_updates_trigger();
CREATE TRIGGER
testdatabase=# checkpoint;
CHECKPOINT
testdatabase=# explain (analyze, buffers) update testtable as d set
name = s.name from testtable_temp as s where d.id = s.id;

QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------------
 Update on testtable d  (cost=30315.56..69708.58 rows=591714 width=98)
(actual time=4168.901..4168.903 rows=0 loops=1)
   Buffers: shared hit=1008337 dirtied=8334, local read=8334
dirtied=8334 written=8331, temp read=11144 written=11144
   ->  Hash Join  (cost=30315.56..69708.58 rows=591714 width=98)
(actual time=486.771..1429.637 rows=1000000 loops=1)
         Hash Cond: (d.id = s.id)
         Buffers: shared hit=8337, local read=8334 dirtied=8334
written=8331, temp read=11144 written=11144
         ->  Seq Scan on testtable d  (cost=0.00..18334.00
rows=1000000 width=10) (actual time=0.026..157.729 rows=1000000
loops=1)
               Buffers: shared hit=8334
         ->  Hash  (cost=14251.14..14251.14 rows=591714 width=92)
(actual time=486.128..486.129 rows=1000000 loops=1)
               Buckets: 32768  Batches: 32  Memory Usage: 2583kB
               Buffers: local read=8334 dirtied=8334 written=8331,
temp written=7549
               ->  Seq Scan on testtable_temp s  (cost=0.00..14251.14
rows=591714 width=92) (actual time=0.035..226.642 rows=1000000
loops=1)
                     Buffers: local read=8334 dirtied=8334 written=8331
 Planning Time: 0.429 ms
 Trigger z_min_update: time=57.069 calls=1000000
 Execution Time: 4174.785 ms
(15 rows)

Thanks,

Mike



pgsql-general by date:

Previous
From: Alexander Farber
Date:
Subject: Re: How to call JSONB_INSERT with integer as the new to-be-inserted value?
Next
From: Laurenz Albe
Date:
Subject: Re: Dirty buffers with suppress_redundant_updates_trigger