Re: row filtering for logical replication - Mailing list pgsql-hackers

From Euler Taveira
Subject Re: row filtering for logical replication
Date
Msg-id 0c2464d4-65f4-4d91-aeb2-c5584c1350f5@www.fastmail.com
In response to Re: row filtering for logical replication  (Peter Smith <smithpb2250@gmail.com>)
Responses Re: row filtering for logical replication
Re: row filtering for logical replication
Re: row filtering for logical replication
List pgsql-hackers
On Sun, Aug 29, 2021, at 11:14 PM, Peter Smith wrote:
Here are the new v26* patches. This is a refactoring of the row-filter
caches to remove all the logic from the get_rel_sync_entry function
and delay it until if/when needed in the pgoutput_row_filter function.
This is now implemented per Amit's suggestion to move all the cache
code [1]. It is a replacement for the v25* patches.

The make check and TAP subscription tests are all OK. I have repeated
the performance tests [2] and those results are good too.

v26-0001 <--- v23 (base RF patch)
v26-0002 <--- ExprState cache mods (refactored row filter caching)
v26-0002 <--- ExprState cache extra debug logging (temp)
Peter, I'm still reviewing this new cache mechanism. I will provide feedback
as soon as I integrate it with this recent modification.

I'm attaching a new version that simply includes Houzj's review [1]. It is
based on v23.

There has been a discussion about which row the row filter should use. We
don't have a unanimous choice, so I think it is prudent to provide a way for
the user to change it. I suggested in a previous email [2] that a publication
option should be added. Hence, the row filter can be applied to the old tuple,
the new tuple, or both. This approach is simpler than using OLD/NEW references
(less code, and it avoids validation such as a NEW reference for DELETEs and an
OLD reference for INSERTs). I thought about a reasonable default value, and the
_new_ tuple seems to be a good one because (i) it is always available and
(ii) the user doesn't have to figure out that replication is broken due to a
column that is not part of the replica identity. I'm attaching a POC that
implements it. I'm still polishing it. Adding tests for multiple row filters
and integrating Peter's caching mechanism [3] are the next steps.
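
To make the old/new/both choice concrete, here is a minimal sketch of how such
an option could pick the tuple(s) the filter is evaluated against. It is not
taken from the attached POC: the names FilterOn and row_passes are
hypothetical, rows are modeled as plain dicts, and two behaviors are
assumptions made only for illustration: "both" requires every available tuple
to pass, and a tuple that is missing for a given operation is skipped in favor
of the one that exists.

```python
from enum import Enum
from typing import Callable, Optional

Row = dict  # illustration only: a row is a column-name -> value mapping


class FilterOn(Enum):
    """Hypothetical publication option values (not from the patch)."""
    OLD = "old"
    NEW = "new"
    BOTH = "both"


def row_passes(filter_on: FilterOn,
               predicate: Callable[[Row], bool],
               old_row: Optional[Row],
               new_row: Optional[Row]) -> bool:
    """Return True if the change should be published."""
    selected = []
    if filter_on in (FilterOn.OLD, FilterOn.BOTH):
        selected.append(old_row)
    if filter_on in (FilterOn.NEW, FilterOn.BOTH):
        selected.append(new_row)

    # Drop tuples the operation does not provide (e.g. no old tuple on INSERT).
    available = [row for row in selected if row is not None]

    # Assumption: if none of the selected tuples exists, fall back to
    # whichever tuple the operation does provide.
    if not available:
        available = [row for row in (old_row, new_row) if row is not None]

    # Assumption: with "both", every available tuple must satisfy the filter.
    return all(predicate(row) for row in available)
```

For an UPDATE that changes a from 5 to 20 under a filter equivalent to
(a > 10), this sketch publishes the change when the option is "new" and skips
it when the option is "old" or "both".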



--
Euler Taveira
