Re: row filtering for logical replication - Mailing list pgsql-hackers

From Ajin Cherian
Subject Re: row filtering for logical replication
Date
Msg-id CAFPTHDamz7R=eP1thsYuAGjhywU8y9dOOg1KUAjXC30Jr+C6xg@mail.gmail.com
Whole thread Raw
In response to Re: row filtering for logical replication  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: row filtering for logical replication
Re: row filtering for logical replication
Re: row filtering for logical replication
Re: row filtering for logical replication
Re: row filtering for logical replication
List pgsql-hackers
Attaching a new patchset v41 which includes changes by both Peter and myself.

Patches v40-0005 and v40-0006 have been merged to create patch
v41-0005 which reduces the patches to 6 again.
This patch-set contains changes addressing the following review comments:

On Mon, Nov 15, 2021 at 5:48 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> What I meant was that with this new code we have regressed the old
> behavior. Basically, imagine a case where no filter was given for any
> of the tables. Then after the patch, we will remove all the old tables
> whereas before the patch it will remove the oldrels only when they are
> not specified as part of new rels. If you agree with this, then we can
> retain the old behavior and for the new tables, we can always override
> the where clause for a SET variant of command.

Fixed and modified the behaviour to match with what the schema patch
implemented.

On Mon, Nov 15, 2021 at 9:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> 2.
> + * The OptWhereClause (row-filter) must be stored here
> + * but it is valid only for tables. If the ColId was
> + * mistakenly not a table this will be detected later
> + * in preprocess_pubobj_list() and an error thrown.
>
> /error thrown/error is thrown

Fixed.
:
> 6. In rowfilter_expr_checker(), the expression tree is traversed
> twice, can't we traverse it once to detect all non-allowed stuff? It
> can be sometimes costly to traverse the tree multiple times especially
> when the expression is complex and it doesn't seem acceptable to do so
> unless there is some genuine reason for the same.
>

Fixed.

On Tue, Nov 16, 2021 at 7:24 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> doc/src/sgml/ref/create_publication.sgml
> (1) improve comment
> + /* Set up a pstate to parse with */
>
> "pstate" is the variable name, better to use "ParseState".

Fixed.

> src/test/subscription/t/025_row_filter.pl
> (2) rename TAP test 025 to 026
> I suggest that the t/025_row_filter.pl TAP test should be renamed to
> 026 now because 025 is being used by some schema TAP test.
>

Fixed

On Tue, Nov 16, 2021 at 7:50 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> ---
> >If you choose to do the initial table synchronization, only data that satisfies
> >the row filters is sent.
>
> I think this comment is not correct, I think the correct statement
> would be "only data that satisfies the row filters is pulled by the
> subscriber"

Fixed

>
> I think this message is not correct, because for update also we can
> not have filters on the non-key attribute right?  Even w.r.t the first
> patch also if the non update non key toast columns are there we can
> not apply filters on those.  So this comment seems misleading to me.
>

Fixed

>
> -    Oid            relid = RelationGetRelid(targetrel->relation);
> ..
> +    relid = RelationGetRelid(targetrel);
> +
>
> Why this change is required, I mean instead of fetching the relid
> during the variable declaration why do we need to do it separately
> now?
>

Fixed

> +    if (expr == NULL)
> +        ereport(ERROR,
> +                (errcode(ERRCODE_CANNOT_COERCE),
> +                 errmsg("row filter returns type %s that cannot be
> coerced to the expected type %s",
>
> Instead of "coerced to" can we use "cast to"?  That will be in sync
> with other simmilar kind od user exposed error message.
> ----

Fixed

>
> I can see the caller of this function is already switching to
> CacheMemoryContext, so what is the point in doing it again here?
> Maybe if called is expected to do show we can Asssert on the
> CurrentMemoryContext.
>

Fixed.

On Thu, Nov 18, 2021 at 9:36 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> (2) missing test case
> It seems that the current tests are not testing the
> multiple-row-filter case (n_filters > 1) in the following code in
> pgoutput_row_filter_init():
>
>     rfnode = n_filters > 1 ? makeBoolExpr(OR_EXPR, rfnodes, -1) :
> linitial(rfnodes);
>
> I think a test needs to be added similar to the customers+countries
> example that Tomas gave (where there is a single subscription to
> multiple publications of the same table, each of which has a
> row-filter).

Test case added.

On Fri, Nov 19, 2021 at 4:15 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> I notice that in the 0001 patch, it adds a "relid" member to the
> PublicationRelInfo struct:
>
> src/include/catalog/pg_publication.h
>
>  typedef struct PublicationRelInfo
>  {
> +  Oid relid;
>     Relation relation;
> +  Node     *whereClause;
>  } PublicationRelInfo;
>
> It appears that this new member is not actually required, as the relid
> can be simply obtained from the existing "relation" member - using the
> RelationGetRelid() macro.

Fixed.

On Mon, Nov 22, 2021 at 12:44 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> Another thing I noticed was in the 0004 patch, list_free_deep() should
> be used instead of list_free() in the following code block, otherwise
> the rfnodes themselves (allocated by stringToNode()) are not freed:
>
> src/backend/replication/pgoutput/pgoutput.c
>
> + if (rfnodes)
> + {
> + list_free(rfnodes);
> + rfnodes = NIL;
> + }

Fixed.

We will be addressing the rest of the comments in the next patch.

regards,
Ajin Cherian
Fujitsu Australia

Attachment

pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: logical decoding/replication: new functions pg_ls_logicaldir and pg_ls_replslotdir
Next
From: Amit Kapila
Date:
Subject: Re: row filtering for logical replication