Re: row filtering for logical replication - Mailing list pgsql-hackers
From | Peter Smith |
---|---|
Subject | Re: row filtering for logical replication |
Date | |
Msg-id | CAHut+PsjWbBHTT1dS=ji=PcziQU1+mYqh6G0gFG6AGTCMhTN_g@mail.gmail.com Whole thread Raw |
In response to | Re: row filtering for logical replication ("Euler Taveira" <euler@eulerto.com>) |
Responses |
Re: row filtering for logical replication
Re: row filtering for logical replication |
List | pgsql-hackers |
On Mon, Jul 12, 2021 at 5:39 AM Euler Taveira <euler@eulerto.com> wrote: > > On Fri, Jul 2, 2021, at 4:29 AM, Peter Smith wrote: > > Hi. > > I have been looking at the latest patch set (v16). Below are my review > comments and some patches. > > Peter, thanks for your detailed review. Comments are inline. > Hi Euler, Thanks for addressing my previous review comments. I have reviewed the latest v18 patch. Below are some more review comments and patches. (The patches 0003,0004 are just examples of what is mentioned in my comments; The patches 0001,0002 are there only to try to keep cfbot green). ////////// 1. Commit comment - wording "When a publication is defined or modified, rows that don't satisfy a WHERE clause may be optionally filtered out." => I think this means to say: "Rows that don't satisfy an optional WHERE clause will be filtered out." ------ 2. Commit comment - wording "The row filter is per table, which allows different row filters to be defined for different tables." => I think all that is the same as just saying: "The row filter is per table." ------ 3. PG docs - independent improvement You wrote (ref [1] point 3): "I agree it can be confusing. BTW, CREATE PUBLICATION does not mention that the root partitioned table is used. We should improve that sentence too." I agree, but that PG docs improvement is independent of your RowFilter patch; please make another thread for that idea. ------ 4. doc/src/sgml/ref/create_publication.sgml - independent improvement @@ -131,9 +135,9 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable> on its partitions) contained in the publication will be published using the identity and schema of the partitioned table rather than that of the individual partitions that are actually changed; the - latter is the default. Enabling this allows the changes to be - replicated into a non-partitioned table or a partitioned table - consisting of a different set of partitions. + latter is the default (<literal>false</literal>). Enabling this + allows the changes to be replicated into a non-partitioned table or a + partitioned table consisting of a different set of partitions. </para> I think that Tomas wrote (ref [2] point 2) that this change seems unrelated to your RowFilter patch. I agree; I liked the change, but IMO you need to propose this one in another thread too. ------ 5. doc/src/sgml/ref/create_subscription.sgml - wording @@ -102,7 +102,16 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl <para> Specifies whether the existing data in the publications that are being subscribed to should be copied once the replication starts. - The default is <literal>true</literal>. + The default is <literal>true</literal>. If any table in the + publications has a <literal>WHERE</literal> clause, rows that do not + satisfy the <replaceable class="parameter">expression</replaceable> + will not be copied. If the subscription has several publications in + which a table has been published with different + <literal>WHERE</literal> clauses, rows must satisfy all expressions + to be copied. If any table in the publications has a + <literal>WHERE</literal> clause, data synchronization does not use it + if the subscriber is a <productname>PostgreSQL</productname> version + before 15. I felt that the sentence: "If any table in the publications has a <literal>WHERE</literal> clause, data synchronization does not use it if the subscriber is a <productname>PostgreSQL</productname> version before 15." Could be expressed more simply like: "If the subscriber is a <productname>PostgreSQL</productname> version before 15 then any row filtering is ignored." ------ 6. src/backend/commands/publicationcmds.c - wrong function comment @@ -585,6 +611,9 @@ OpenTableList(List *tables) /* * Close all relations in the list. + * + * Publication node can have a different list element, hence, pub_drop_table + * indicates if it has a Relation (true) or PublicationTable (false). */ static void CloseTableList(List *rels) => The 2nd parameter does not exist in v18, so that comment about pub_drop_table seems to be a cut/paste error from the OpenTableList. ------ src/backend/replication/logical/tablesync.c - bug ? @@ -829,16 +883,23 @@ copy_table(Relation rel) relmapentry = logicalrep_rel_open(lrel.remoteid, NoLock); Assert(rel == relmapentry->localrel); + /* List of columns for COPY */ + attnamelist = make_copy_attnamelist(relmapentry); + /* Start copy on the publisher. */ => I did not understand the above call to make_copy_attnamelist. The result seems unused before it is overwritten later in this same function (??) ------ 7. src/backend/replication/logical/tablesync.c - fetch_remote_table_info enhancement + /* Get relation qual */ + if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000) + { + resetStringInfo(&cmd); + appendStringInfo(&cmd, + "SELECT pg_get_expr(prqual, prrelid) " + " FROM pg_publication p " + " INNER JOIN pg_publication_rel pr " + " ON (p.oid = pr.prpubid) " + " WHERE pr.prrelid = %u " + " AND p.pubname IN (", lrel->remoteid); => I think a small improvement is possible in this SQL. If we change that to "SELECT DISTINCT pg_get_expr(prqual, prrelid)"... then it avoids the copy SQL from having multiple WHERE clauses which are all identical. This could happen when subscribed to multiple publications which had the same filter for the same table. I attached a tmp POC patch for this change and it works as expected. For example, I subscribe to 3 publications, but 2 of them have the same filter for the table. BEFORE COPY (SELECT key, value, data FROM public.test WHERE (key > 0) AND (key > 1000) AND (key > 1000)) TO STDOUT AFTER COPY (SELECT key, value, data FROM public.test WHERE (key > 0) AND (key > 1000) ) TO STDOUT ------ 8. src/backend/replication/pgoutput/pgoutput.c - qual member is redundant @@ -99,6 +108,9 @@ typedef struct RelationSyncEntry bool replicate_valid; PublicationActions pubactions; + List *qual; /* row filter */ + List *exprstate; /* ExprState for row filter */ + TupleTableSlot *scantuple; /* tuple table slot for row filter */ => Now that the exprstate is introduced I think that the other member "qual" is redundant, so it can be removed. FYI - I attached a tmp patch with all the qual references deleted and everything is fine. ------ 9. src/backend/replication/pgoutput/pgoutput.c - comment typo? + /* + * Cache ExprState using CacheMemoryContext. This is the same code as + * ExecPrepareExpr() but it is not used because it doesn't use an EState. + * It should probably be another function in the executor to handle the + * execution outside a normal Plan tree context. + */ => typo: it/that ? I think it ought to say "This is the same code as ExecPrepareExpr() but that is not used because"... ------ 10. src/backend/replication/pgoutput/pgoutput.c - redundant debug logging? + /* Evaluates row filter */ + result = pgoutput_row_filter_exec_expr(exprstate, ecxt); + + elog(DEBUG3, "row filter %smatched", result ? "" : "not "); The above debug logging is really only a repeat (with different wording) of the same information already being logged inside the pgoutput_row_filter_exec_expr function isn't it? Consider removing the redundant logging. e.g. This is already getting logged by pgoutput_row_filter_exec_expr: elog(DEBUG3, "row filter evaluates to %s (isnull: %s)", DatumGetBool(ret) ? "true" : "false", isnull ? "true" : "false"); ------ [1] https://www.postgresql.org/message-id/532a18d8-ce90-4444-8570-8a9fcf09f329%40www.fastmail.com [2] https://www.postgresql.org/message-id/849ee491-bba3-c0ae-cc25-4fce1c03f105%40enterprisedb.com [3] https://www.postgresql.org/message-id/532a18d8-ce90-4444-8570-8a9fcf09f329%40www.fastmail.com Kind Regards, Peter Smith. Fujitsu Australia
Attachment
pgsql-hackers by date: