Home > mailing lists

Re: row filtering for logical replication - Mailing list pgsql-hackers

From	Peter Smith
Subject	Re: row filtering for logical replication
Date	September 14, 2021 03:08:58
Msg-id	CAHut+Pt6+=w7_r=CHBCS+yZXk5V+tnrzHLi3b2ZOVP1LHL2W9w@mail.gmail.com Whole thread
In response to	Re: row filtering for logical replication (Amit Kapila <amit.kapila16@gmail.com>)
Responses	Re: row filtering for logical replication
List	pgsql-hackers

Tree view

I have attached a POC row-filter validation patch implemented using a
parse-tree 'walker' function.

PSA the incremental patch v28-0003.

v28-0001 --> v28-0001 (same as before - base patch)
v28-0002 --> v28-0002 (same as before - replica identity validation patch)
v28-0003 (NEW POC PATCH using "walker" validation)

This kind of 'walker' validation has been proposed/recommended already
several times up-thread. [1][2][3].

For this POC patch, I have removed all the existing
EXPR_KIND_PUBLICATION_WHERE parser errors. I am not 100% sure this is
the best idea (see below), but for now, the parser errors are
temporarily #if 0 in the code. I will clean up this patch and re-post
later when there is some feedback/consensus on how to proceed.

1. PROS

1.1 Using a 'walker' validator allows the row filter expression
validation to be 'opt-in' instead of 'opt-out' checking logic. This
may be considered *safer* because now we can have a very
controlled/restricted set of allowed nodes - e.g. only allow simple
(Var op Const) expressions. This eliminates the risk that some
unforeseen dangerous loophole could be exploited.

1.2 It is convenient to have all the row-filter validation errors in
one place, instead of being scattered across the parser code based on
EXPR_KIND_PUBLICATION_WHERE. Indeed, there seems some confusion
already caused by the existing scattering of row-filter validation
(patch 0001). For example, I found some of the new "aggregate
functions are not allowed" errors are not even reachable because they
are shielded by the earlier "functions are not allowed" error.

2. CONS

2.1 Error messages thrown from the parser can include the character
location of the problem. Actually, this is also possible using the
'walker' (I have done it locally) but it requires passing the
ParseState into the walker code - something I thought seemed a bit
unusual, so I did not include that in this 0003 POC patch.

Perhaps a hybrid validation is preferred. e.g. retain some/all of the
parser validation errors from the 0001 patch, but also keep the walker
validation as a 'catch-all' to trap anything unforeseen that may slip
through the parsing. Or perhaps this 'walker' validator is fine as the
only validator and all the current parser errors for
EXPR_KIND_PUBLICATION_WHERE can just be permanently removed.

I am not sure what is the best approach, so I am hoping for some
feedback and/or review comments.

------
[1] https://www.postgresql.org/message-id/33c033f7-be44-e241-5fdf-da1b328c288d%40enterprisedb.com
[2] https://www.postgresql.org/message-id/CAA4eK1Jumuio6jZK8AVQd6z7gpDsZydQhK6d%3DMUARxk3nS7%2BPw%40mail.gmail.com
[3] https://www.postgresql.org/message-id/CAA4eK1JL2q%2BHENgiCf1HLRU7nD9jCcttB9sEqV1tech4mMv_0A%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia

Attachment

pgsql-hackers by date:

From: Andres Freund
Date: 14 September 2021, 02:57:07
Subject: Re: postgres.h included from relcache.h - but removing it breaks pg_upgrade

From: Amit Kapila
Date: 14 September 2021, 03:09:17
Subject: Re: PG Docs - CREATE SUBSCRIPTION option list order

Re: row filtering for logical replication - Mailing list pgsql-hackers

Attachment

Previous

Next