RE: row filtering for logical replication - Mailing list pgsql-hackers

From houzj.fnst@fujitsu.com
Subject RE: row filtering for logical replication
Date
Msg-id OS0PR01MB571648124C5BAE8B795CEDD9942D9@OS0PR01MB5716.jpnprd01.prod.outlook.com
Whole thread Raw
In response to Re: row filtering for logical replication  (Peter Smith <smithpb2250@gmail.com>)
Responses Re: row filtering for logical replication  (Amit Kapila <amit.kapila16@gmail.com>)
Re: row filtering for logical replication  (Peter Smith <smithpb2250@gmail.com>)
List pgsql-hackers
On Monday, February 7, 2022 3:51 PM Peter Smith <smithpb2250@gmail.com> wrote:
> 
> Hi - I did a review of the v77 patches merged with Amit's v77 diff patch [1].
> 
> (Maybe this is equivalent to reviewing v78)
> 
> Below are my review comments:

Thanks for the comments!

> ======
> 
> 1. doc/src/sgml/ref/create_publication.sgml - CREATE PUBLICATION
> 
> +   The <literal>WHERE</literal> clause allows simple expressions that
> don't have
> +   user-defined functions, operators, collations, non-immutable built-in
> +   functions, or references to system columns.
> +  </para>
> 
> That seems slightly ambiguous for operators and collations. It's only
> the USER-DEFINED ones we don't support.
> 
> Perhaps it should be worded like:
> 
> "allows simple expressions that don't have user-defined functions,
> user-defined operators, user-defined collations, non-immutable
> built-in functions..."

Changed.

> 
> 2. src/backend/catalog/pg_publication.c - GetTopMostAncestorInPublication
> 
> +Oid
> +GetTopMostAncestorInPublication(Oid puboid, List *ancestors)
> +{
> + ListCell   *lc;
> + Oid topmost_relid = InvalidOid;
> +
> + /*
> + * Find the "topmost" ancestor that is in this publication.
> + */
> + foreach(lc, ancestors)
> + {
> + Oid ancestor = lfirst_oid(lc);
> + List    *apubids = GetRelationPublications(ancestor);
> + List    *aschemaPubids = NIL;
> +
> + if (list_member_oid(apubids, puboid))
> + topmost_relid = ancestor;
> + else
> + {
> + aschemaPubids = GetSchemaPublications(get_rel_namespace(ancestor));
> + if (list_member_oid(aschemaPubids, puboid))
> + topmost_relid = ancestor;
> + }
> +
> + list_free(apubids);
> + list_free(aschemaPubids);
> + }
> +
> + return topmost_relid;
> +}
> 
> Wouldn't it be better for the aschemaPubids to be declared and freed
> inside the else block?

I personally think the current code is clean and the code was borrowed from
Greg's comment[1]. So, I didn't change this.

[1] https://www.postgresql.org/message-id/CAJcOf-c2%2BWbjeP7NhwgcAEtsn9KdDnhrsowheafbZ9%2BQU9C8SQ%40mail.gmail.com

> 
> 3. src/backend/commands/publicationcmds.c - contain_invalid_rfcolumn
> 
> + if (pubviaroot && relation->rd_rel->relispartition)
> + {
> + publish_as_relid = GetTopMostAncestorInPublication(pubid, ancestors);
> +
> + if (publish_as_relid == InvalidOid)
> + publish_as_relid = relid;
> + }
> 
> Consider using the macro code for the InvalidOid check. e.g.
> 
> if (!OidIsValid(publish_as_relid)
> publish_as_relid = relid;
> 

Changed.

> 
> 4. src/backend/commands/publicationcmds.c - IsRowFilterSimpleExpr (Tests)
> 
> + switch (nodeTag(node))
> + {
> + case T_ArrayExpr:
> + case T_BooleanTest:
> + case T_BoolExpr:
> + case T_CaseExpr:
> + case T_CaseTestExpr:
> + case T_CoalesceExpr:
> + case T_CollateExpr:
> + case T_Const:
> + case T_FuncExpr:
> + case T_MinMaxExpr:
> + case T_NullTest:
> + case T_RelabelType:
> + case T_XmlExpr:
> + return true;
> + default:
> + return false;
> + }
> 
> I think there are several missing regression tests.
> 
> 4a. There is a new message that says "User-defined collations are not
> allowed." but I never saw any test case for it.
> 
> 4b. There is also the RelabelType which seems to have no test case.
> Amit previously provided [2] some SQL which would give an unexpected
> error, so I guess that should be a new regression test case. e.g.
> create table t1(c1 int, c2 varchar(100));
> create publication pub1 for table t1 where (c2 < 'john');

I added some tests to cover these nodes.

> 
> 5. src/backend/commands/publicationcmds.c - IsRowFilterSimpleExpr
> (Simple?)
> 
> +/*
> + * Is this a simple Node permitted within a row filter expression?
> + */
> +static bool
> +IsRowFilterSimpleExpr(Node *node)
> +{
> 
> A lot has changed in this area recently and I feel that there is
> something not quite 100% right with the naming and/or logic in this
> expression validation. IMO there are several functions that seem to
> depend too much on each other in special ways...
> 
> IIUC the "walker" logic now seems to be something like this:
> a) Check for special cases of the supported nodes
> b) Then check for supported (simple?) nodes (i.e.
> IsRowFilterSimpleExpr is now acting as a "catch-all" after the special
> case checks)
> c) Then check for some unsupported node embedded within a supported
> node (i.e. call expr_allowed_in_node)
> d) If any of a,b,c was bad then give an error.
> 
> To achieve that logic the T_FuncExpr was added to the
> "IsRowFilterSimpleExpr". Meanwhile, other nodes like
> T_ScalarArrayOpExpr and T_NullIfExpr now are removed from
> IsRowFilterSimpleExpr - I don't quite know why these got removed but
> perhaps there is implicit knowledge that those node kinds were already
> checked by the "walker" before the IsRowFilterSimpleExpr function ever
> gets called.
> 
> So, although I trust that everything is working OK,  I don't think
> IsRowFilterSimpleExpr is really just about simple nodes anymore. It is
> harder to see why some supported nodes are in there, and some
> supported nodes are not. It seems tightly entwined with the logic of
> check_simple_rowfilter_expr_walker; i.e. there seem to be assumptions
> about exactly when it will be called and what was checked before and
> what will be checked after calling it.
> 
> IMO probably all the nodes we are supporting should be in the
> IsRowFilterSimpleExpr just for completeness (e.g. put T_NullIfExpr and
> T_ScalarArrayOpExpr back in there...), and maybe the function should
> be renamed (IsRowFilterAllowedNode?), and probably there need to be
> more comments describing the validation logic (e.g. the a/b/c/d logic
> I mentioned above).

I adjusted these codes by moving all the move back all the nodes handled in
IsRowFilterSimpleExpr back to check_simple_rowfilter_expr_walker() and change
the handling to switch..case.

> 
> 6. src/backend/commands/publicationcmds.c - IsRowFilterSimpleExpr (T_List)
> 
> (From Amit's patch)
> 
> @@ -395,6 +397,7 @@ IsRowFilterSimpleExpr(Node *node)
>   case T_NullTest:
>   case T_RelabelType:
>   case T_XmlExpr:
> + case T_List:
>   return true;
>   default:
>   return false;
> 
> 
> The case T_List should be moved to be alphabetical the same as all the
> other cases.

I reordered these referring to the order as they are defined in nodes.h.

> 
> 7. src/backend/commands/publicationcmds.c -
> contain_mutable_or_ud_functions_checker
> 
> +/* check_functions_in_node callback */
> +static bool
> +contain_mutable_or_ud_functions_checker(Oid func_id, void *context)
> 
> "ud" seems a strange name. Maybe better to name this function
> "contain_mutable_or_user_functions_checker" ?
> 

Changed.

> 
> 8. src/backend/commands/publicationcmds.c - expr_allowed_in_node
> (comment)
> 
> (From Amit's patch)
> 
> @@ -410,6 +413,37 @@ contain_mutable_or_ud_functions_checker(Oid
> func_id, void *context)
>  }
> 
>  /*
> + * Check, if the node contains any unallowed object in node. See
> + * check_simple_rowfilter_expr_walker.
> + *
> + * Returns the error detail meesage in errdetail_msg for unallowed
> expressions.
> + */
> +static bool
> +expr_allowed_in_node(Node *node, ParseState *pstate, char
> **errdetail_msg)
> 
> Remove the comma: "Check, if ..." --> "Check if ..."
> Typo: "meesage" --> "message"
> 

Changed.

> 
> 9. src/backend/commands/publicationcmds.c - expr_allowed_in_node (else)
> 
> (From Amit's patch)
> 
> + if (exprType(node) >= FirstNormalObjectId)
> + *errdetail_msg = _("User-defined types are not allowed.");
> + if (check_functions_in_node(node,
> contain_mutable_or_ud_functions_checker,
> + (void*) pstate))
> + *errdetail_msg = _("User-defined or built-in mutable functions are
> not allowed.");
> + else if (exprCollation(node) >= FirstNormalObjectId)
> + *errdetail_msg = _("User-defined collations are not allowed.");
> + else if (exprInputCollation(node) >= FirstNormalObjectId)
> + *errdetail_msg = _("User-defined collations are not allowed.");
> 
> Is that correct - isn't there a missing "else" on the 2nd "if"?
> 

Changed.

> 
> 10. src/backend/commands/publicationcmds.c - expr_allowed_in_node (bool)
> 
> (From Amit's patch)
> 
> +static bool
> +expr_allowed_in_node(Node *node, ParseState *pstate, char
> **errdetail_msg)
> 
> Why is this a boolean function? It can never return false (??)
> 

Changed.

> 
> 11. src/backend/commands/publicationcmds.c -
> check_simple_rowfilter_expr_walker (else)
> 
> (From Amit's patch)
> 
> @@ -500,12 +519,18 @@ check_simple_rowfilter_expr_walker(Node *node,
> ParseState *pstate)
>   }
>   }
>   }
> - else if (!IsRowFilterSimpleExpr(node))
> + else if (IsRowFilterSimpleExpr(node))
> + {
> + }
> + else
>   {
>   elog(DEBUG3, "row filter contains an unexpected expression
> component: %s", nodeToString(node));
>   errdetail_msg = _("Expressions only allow columns, constants,
> built-in operators, built-in data types, built-in collations and
> immutable built-in functions.");
>   }
> 
> Why introduce a new code block that does nothing?
> 

Changed it to switch ... case which don’t have this problem.

> 
> 12. src/backend/replication/pgoutput/pgoutput.c - get_rel_sync_entry
> 
> + /*
> + * Initialize the row filter after getting the final publish_as_relid
> + * as we only evaluate the row filter of the relation which we publish
> + * change as.
> + */
> + pgoutput_row_filter_init(data, active_publications, entry);
> 
> The comment "which we publish change as" seems strangely worded.
> 
> Perhaps it should be:
> "... only evaluate the row filter of the relation which being published."

Changed.

> 
> 13. src/backend/utils/cache/relcache.c - RelationBuildPublicationDesc
> (release)
> 
> + /*
> + * Check if all columns referenced in the filter expression are part of
> + * the REPLICA IDENTITY index or not.
> + *
> + * If the publication is FOR ALL TABLES then it means the table has no
> + * row filters and we can skip the validation.
> + */
> + if (!pubform->puballtables &&
> + (pubform->pubupdate || pubform->pubdelete) &&
> + contain_invalid_rfcolumn(pubid, relation, ancestors,
> + pubform->pubviaroot))
> + {
> + if (pubform->pubupdate)
> + pubdesc->rf_valid_for_update = false;
> + if (pubform->pubdelete)
> + pubdesc->rf_valid_for_delete = false;
> + }
> 
>   ReleaseSysCache(tup);
> 
> This change has the effect of moving the location of the
> "ReleaseSysCache(tup);" to much lower in the code but I think there is
> no point to move it for the Row Filter patch, so it should be left
> where it was before.

The newly added code here refers to the 'pubform' which comes from the ' tup',
So I think we should release the tuple after these codes.

> 
> 14. src/backend/utils/cache/relcache.c - RelationBuildPublicationDesc
> (if refactor)
> 
> - if (pubactions->pubinsert && pubactions->pubupdate &&
> - pubactions->pubdelete && pubactions->pubtruncate)
> + if (pubdesc->pubactions.pubinsert && pubdesc->pubactions.pubupdate
> &&
> + pubdesc->pubactions.pubdelete && pubdesc->pubactions.pubtruncate
> &&
> + !pubdesc->rf_valid_for_update && !pubdesc->rf_valid_for_delete)
>   break;
> 
> I felt that the "rf_valid_for_update" and "rf_valid_for_delete" should
> be checked first in that if condition. It is probably more optimal to
> move them because then it can bail out early. All those other
> pubaction flags are more likely to be true most of the time (because
> that is the default case).

I don't have a strong opinion on this, I feel it's fine to put the newly added
check at the end as it doesn't bring notable performance impact.

> 
> 15. src/bin/psql/describe.c - SQL format
> 
> @@ -2898,12 +2902,12 @@ describeOneTableDetails(const char
> *schemaname,
>   else
>   {
>   printfPQExpBuffer(&buf,
> -   "SELECT pubname\n"
> +   "SELECT pubname, NULL\n"
>     "FROM pg_catalog.pg_publication p\n"
>     "JOIN pg_catalog.pg_publication_rel pr ON p.oid = pr.prpubid\n"
>     "WHERE pr.prrelid = '%s'\n"
>     "UNION ALL\n"
> -   "SELECT pubname\n"
> +   "SELECT pubname, NULL\n"
>     "FROM pg_catalog.pg_publication p\n"
> 
> I thought it may be better to reformat to put the NULL columns on a
> different line for consistent format with the other SQL just above
> this one. e.g.
> 
>   printfPQExpBuffer(&buf,
>     "SELECT pubname\n"
> +   " , NULL\n"

Changed.

Attach the V79 patch set which addressed the above comments and adjust some
comments related to expression check.

Best regards,
Hou zj

Attachment

pgsql-hackers by date:

Previous
From: Julien Rouhaud
Date:
Subject: Add tag/category to the commitfest app
Next
From: Tom Lane
Date:
Subject: Re: [RFC] building postgres with meson - perl embedding