Re: row filtering for logical replication - Mailing list pgsql-hackers

From Euler Taveira
Subject Re: row filtering for logical replication
Date
Msg-id d65b0801-34bb-4db2-b0d4-8e39852c9f9d@www.fastmail.com
Whole thread Raw
In response to Re: row filtering for logical replication  (Peter Smith <smithpb2250@gmail.com>)
Responses Re: row filtering for logical replication  (Amit Kapila <amit.kapila16@gmail.com>)
Re: row filtering for logical replication  (Peter Smith <smithpb2250@gmail.com>)
List pgsql-hackers
On Tue, Aug 24, 2021, at 4:46 AM, Peter Smith wrote:
I have used debug logging to confirm that what Amit wrote [1] is
correct; the row-filter ExprState of *every* table's row_filter will
be invalidated (and so subsequently gets rebuilt) when the user
changes the PUBLICATION tables. This was a side-effect of the
rel_sync_cache_publication_cb which is freeing the cached ExprState
and setting the entry->replicate_valid = false; for *every* entry.

So yes, the ExprCache is getting rebuilt for some situations where it
is not strictly necessary to do so.
I'm afraid we are overenginnering this feature. We already have a cache
mechanism that was suggested (that shows a small improvement). As you said the
gain for this new improvement is zero or minimal (it depends on your logical
replication setup/maintenance).

1. Although the ExprState cache is effective, in practice the
performance improvement was not very much. My previous results [2]
showed only about 2sec saving for 100K calls to the
pgoutput_row_filter function. So I think eliminating just one or two
unnecessary calls in the get_rel_sync_entry is going to make zero
observable difference.

2. IMO it is safe to expect that the ALTER PUBLICATION is a rare
operation relative to the number of times that pgoutput_row_filter
will be called (the pgoutput_row_filter is quite a "hot" function
since it is called for every INSERT/UPDATE/DELETE). It will be orders
of magnitude difference 1:1000, 1:100000 etc.

~~

Anyway, I have implemented the suggested cache change because I agree
it is probably theoretically superior, even if in practice there is
almost no difference.
I didn't inspect your patch carefully but it seems you add another List to
control this new cache mechanism. I don't like it. IMO if we can use the data
structures that we have now, let's implement your idea; otherwise, -1 for this
new micro optimization.

[By the way, it took some time to extract what you changed. Since we're trading
patches, I personally appreciate if you can send a patch on the top of the
current one. I have some changes too and it is time consuming incorporating
changes in the main patch.]


--
Euler Taveira

pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: Tab completion for "create unlogged" a bit too lax?
Next
From: Fujii Masao
Date:
Subject: Re: Some leftovers of recent message cleanup?