Re: row filtering for logical replication - Mailing list pgsql-hackers

From Euler Taveira
Subject Re: row filtering for logical replication
Date
Msg-id 3ddceeeb-c6b6-419f-8d11-4f61eb3c3d46@www.fastmail.com
Whole thread Raw
In response to Re: row filtering for logical replication  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
List pgsql-hackers
On Sun, Jul 11, 2021, at 8:09 PM, Tomas Vondra wrote:
I took a look at this patch, which seems to be in CF since 2018. I have 
only some basic comments and observations at this point:
Tomas, thanks for reviewing this patch again.

1) alter_publication.sgml

I think "expression is executed" sounds a bit strange, perhaps 
"evaluated" would be better?
Fixed.

2) create_publication.sgml

Why is the patch changing publish_via_partition_root docs? That seems 
like a rather unrelated bit.
Removed. I will submit a separate patch for this.

    The <literal>WHERE</literal> clause should probably contain only
    columns that are part of the primary key or be covered by
    <literal>REPLICA ...

I'm not sure what exactly is this trying to say. What does "should 
probably ..." mean in practice for the users? Does that mean something 
bad will happen for other columns, or what? I'm sure this wording will 
be quite confusing for users.
Reading again it seems "probably" is confusing. Let's remove it.

It may also be unclear whether the condition is evaluated on the old or 
new row, so perhaps add an example illustrating that & more detailed 
comment, or something. E.g. what will happen with

UPDATE departments SET active = false WHERE active;
Yeah. I avoided to mention this internal detail about old/new row but it seems
better to be clear. How about the following paragraph?

  <para>
   The <literal>WHERE</literal> clause should contain only columns that are
   part of the primary key or be covered  by <literal>REPLICA
   IDENTITY</literal> otherwise, <command>DELETE</command> operations will not
   be replicated. That's because old row is used and it only contains primary
   key or columns that are part of the <literal>REPLICA IDENTITY</literal>; the
   remaining columns are <literal>NULL</literal>. For <command>INSERT</command>
   and <command>UPDATE</command> operations, any column might be used in the
   <literal>WHERE</literal> clause. New row is used and it contains all
   columns. A <literal>NULL</literal> value causes the expression to evaluate
   to false; avoid using columns without not-null constraints in the
   <literal>WHERE</literal> clause. The <literal>WHERE</literal> clause does
   not allow functions and user-defined operators.
  </para>

3) publication_add_relation

Does this need to build the parse state even for whereClause == NULL?
No. Fixed.

4) AlterPublicationTables

I wonder if this new reworked code might have issues with subscriptions 
containing many tables, but I haven't tried.
This piece of code is already complicated. Amit complained about it too [1].
Are you envisioning any specific issue (other than open thousands of relations,
do some stuff, and close them all)? IMO the open/close relation should be
postponed for as long as possible.

5) OpenTableList

I really dislike that the list can have two different node types 
(Relation and PublicationTable). In principle we don't actually need the 
extra flag, we can simply check the node type directly by IsA() and act 
based on that. However, I think it'd be better to just use a single node 
type from all places.
Amit complained about having a runtime test for ALTER PUBLICATION ... DROP
TABLE in case user provides a WHERE clause [2]. I did that way (runtime test)
because it simplified the code. I would tend to avoid moving grammar task into
a runtime, that's why I agreed to change it. I didn't like the multi-node
argument handling for OpenTableList() (mainly because of the extra argument in
the function signature) but with your suggestion (IsA()) maybe it is
acceptable. What do you think? I included IsA() in v19.

I don't see why not to set whereClause every time, I don't think the 
extra if saves anything, it's just a bit more complex.
See runtime test in [2].


5) CloseTableList

The comment about node types seems pointless, this function has no flag 
and the element type does not matter.
Fixed.

6) parse_agg.c

    ... are not allowed in publication WHERE expressions

I think all similar cases use "WHERE conditions" instead.
No. Policy, index, statistics, partition, column generation use expressions.
COPY and trigger use conditions. It is also referred as expression in the
synopsis.

7) transformExprRecurse

The check at the beginning seems rather awkward / misplaced - it's way 
too specific for this location (there are no other p_expr_kind 
references in this function). Wouldn't transformFuncCall (or maybe 
ParseFuncOrColumn) be a more appropriate place?
Probably. I have to try the multiple possibilities to make sure it forbids all
cases.

Initially I was wondering why not to allow function calls in WHERE 
conditions, but I see that was discussed in the past as problematic. But 
that reminds me that I don't see any docs describing what expressions 
are allowed in WHERE conditions - maybe we should explicitly list what 
expressions are allowed?
I started to investigate how to safely allow built-in functions. There is a
long discussion about using functions in a logical decoding context. As I said
during the last CF for v14, I prefer this to be a separate feature. I realized
that I mentioned that functions and user-defined operators are not allowed in
the commit message but forgot to mention it in the documentation.

8) pgoutput.c

I have not reviewed this in detail yet, but there seems to be something 
wrong because `make check-world` fails in subscription/010_truncate.pl 
after hitting an assert  (backtrace attached) during "START_REPLICATION 
SLOT" in get_rel_sync_entry in this code:
That's because I didn't copy the TupleDesc in CacheMemoryContext. Greg pointed
it too in a previous email [3]. The new patch (v19) includes a fix for it.



--
Euler Taveira

pgsql-hackers by date:

Previous
From: "Euler Taveira"
Date:
Subject: Re: row filtering for logical replication
Next
From: Tom Lane
Date:
Subject: Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails