Re: row filtering for logical replication - Mailing list pgsql-hackers
From | Peter Smith |
---|---|
Subject | Re: row filtering for logical replication |
Date | |
Msg-id | CAHut+PuLoZuHD_A=n8GshC84Nc=8guReDsTmV1RFsCYojssD8Q@mail.gmail.com Whole thread Raw |
In response to | Re: row filtering for logical replication (Peter Smith <smithpb2250@gmail.com>) |
List | pgsql-hackers |
Hi Euler, Greg noticed that your patch set was missing any implementation of the psql tab auto-complete for the new row filter WHERE syntax. So I have added a POC patch for this missing feature. Unfortunately, there is an existing HEAD problem overlapping with this exact same code. I reported this already in another thread [1]. So there are 2 patches attached here: 0001 - Fixes the other reported problem (I hope this may be pushed soon) 0002 - Adds the tab-completion code for your row filter WHERE's ------ [1] https://www.postgresql.org/message-id/CAHut+Ps-vkmnWAShWSRVCB3gx8aM=bFoDqWgBNTzofK0q1LpwA@mail.gmail.com Kind Regards, Peter Smith. Fujitsu Australia On Tue, Jul 13, 2021 at 1:25 PM Peter Smith <smithpb2250@gmail.com> wrote: > > On Mon, Jul 12, 2021 at 5:39 AM Euler Taveira <euler@eulerto.com> wrote: > > > > On Fri, Jul 2, 2021, at 4:29 AM, Peter Smith wrote: > > > > Hi. > > > > I have been looking at the latest patch set (v16). Below are my review > > comments and some patches. > > > > Peter, thanks for your detailed review. Comments are inline. > > > > Hi Euler, > > Thanks for addressing my previous review comments. > > I have reviewed the latest v18 patch. Below are some more review > comments and patches. > > (The patches 0003,0004 are just examples of what is mentioned in my > comments; The patches 0001,0002 are there only to try to keep cfbot > green). > > ////////// > > 1. Commit comment - wording > > "When a publication is defined or modified, rows that don't satisfy a > WHERE clause may be > optionally filtered out." > > => > > I think this means to say: "Rows that don't satisfy an optional WHERE > clause will be filtered out." > > ------ > > 2. Commit comment - wording > > "The row filter is per table, which allows different row filters to be > defined for different tables." > > => > > I think all that is the same as just saying: "The row filter is per table." > > ------ > > 3. PG docs - independent improvement > > You wrote (ref [1] point 3): > > "I agree it can be confusing. BTW, CREATE PUBLICATION does not mention that the > root partitioned table is used. We should improve that sentence too." > > I agree, but that PG docs improvement is independent of your RowFilter > patch; please make another thread for that idea. > > ------ > > 4. doc/src/sgml/ref/create_publication.sgml - independent improvement > > @@ -131,9 +135,9 @@ CREATE PUBLICATION <replaceable > class="parameter">name</replaceable> > on its partitions) contained in the publication will be published > using the identity and schema of the partitioned table rather than > that of the individual partitions that are actually changed; the > - latter is the default. Enabling this allows the changes to be > - replicated into a non-partitioned table or a partitioned table > - consisting of a different set of partitions. > + latter is the default (<literal>false</literal>). Enabling this > + allows the changes to be replicated into a non-partitioned table or a > + partitioned table consisting of a different set of partitions. > </para> > > I think that Tomas wrote (ref [2] point 2) that this change seems > unrelated to your RowFilter patch. > > I agree; I liked the change, but IMO you need to propose this one in > another thread too. > > ------ > > 5. doc/src/sgml/ref/create_subscription.sgml - wording > > @@ -102,7 +102,16 @@ CREATE SUBSCRIPTION <replaceable > class="parameter">subscription_name</replaceabl > <para> > Specifies whether the existing data in the publications that are > being subscribed to should be copied once the replication starts. > - The default is <literal>true</literal>. > + The default is <literal>true</literal>. If any table in the > + publications has a <literal>WHERE</literal> clause, rows that do not > + satisfy the <replaceable class="parameter">expression</replaceable> > + will not be copied. If the subscription has several publications in > + which a table has been published with different > + <literal>WHERE</literal> clauses, rows must satisfy all expressions > + to be copied. If any table in the publications has a > + <literal>WHERE</literal> clause, data synchronization does not use it > + if the subscriber is a <productname>PostgreSQL</productname> version > + before 15. > > I felt that the sentence: "If any table in the publications has a > <literal>WHERE</literal> clause, data synchronization does not use it > if the subscriber is a <productname>PostgreSQL</productname> version > before 15." > > Could be expressed more simply like: "If the subscriber is a > <productname>PostgreSQL</productname> version before 15 then any row > filtering is ignored." > > ------ > > 6. src/backend/commands/publicationcmds.c - wrong function comment > > @@ -585,6 +611,9 @@ OpenTableList(List *tables) > > /* > * Close all relations in the list. > + * > + * Publication node can have a different list element, hence, pub_drop_table > + * indicates if it has a Relation (true) or PublicationTable (false). > */ > static void > CloseTableList(List *rels) > > => > > The 2nd parameter does not exist in v18, so that comment about > pub_drop_table seems to be a cut/paste error from the OpenTableList. > > ------ > > src/backend/replication/logical/tablesync.c - bug ? > > @@ -829,16 +883,23 @@ copy_table(Relation rel) > relmapentry = logicalrep_rel_open(lrel.remoteid, NoLock); > Assert(rel == relmapentry->localrel); > > + /* List of columns for COPY */ > + attnamelist = make_copy_attnamelist(relmapentry); > + > /* Start copy on the publisher. */ > => > > I did not understand the above call to make_copy_attnamelist. The > result seems unused before it is overwritten later in this same > function (??) > > ------ > > 7. src/backend/replication/logical/tablesync.c - > fetch_remote_table_info enhancement > > + /* Get relation qual */ > + if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000) > + { > + resetStringInfo(&cmd); > + appendStringInfo(&cmd, > + "SELECT pg_get_expr(prqual, prrelid) " > + " FROM pg_publication p " > + " INNER JOIN pg_publication_rel pr " > + " ON (p.oid = pr.prpubid) " > + " WHERE pr.prrelid = %u " > + " AND p.pubname IN (", lrel->remoteid); > > => > > I think a small improvement is possible in this SQL. > > If we change that to "SELECT DISTINCT pg_get_expr(prqual, prrelid)"... > then it avoids the copy SQL from having multiple WHERE clauses which > are all identical. This could happen when subscribed to multiple > publications which had the same filter for the same table. > > I attached a tmp POC patch for this change and it works as expected. > For example, I subscribe to 3 publications, but 2 of them have the > same filter for the table. > > BEFORE > COPY (SELECT key, value, data FROM public.test WHERE (key > 0) AND > (key > 1000) AND (key > 1000)) TO STDOUT > > AFTER > COPY (SELECT key, value, data FROM public.test WHERE (key > 0) AND > (key > 1000) ) TO STDOUT > > ------ > > 8. src/backend/replication/pgoutput/pgoutput.c - qual member is redundant > > @@ -99,6 +108,9 @@ typedef struct RelationSyncEntry > > bool replicate_valid; > PublicationActions pubactions; > + List *qual; /* row filter */ > + List *exprstate; /* ExprState for row filter */ > + TupleTableSlot *scantuple; /* tuple table slot for row filter */ > > => > > Now that the exprstate is introduced I think that the other member > "qual" is redundant, so it can be removed. > > FYI - I attached a tmp patch with all the qual references deleted and > everything is fine. > > ------ > > 9. src/backend/replication/pgoutput/pgoutput.c - comment typo? > > + /* > + * Cache ExprState using CacheMemoryContext. This is the same code as > + * ExecPrepareExpr() but it is not used because it doesn't use an EState. > + * It should probably be another function in the executor to handle the > + * execution outside a normal Plan tree context. > + */ > > => > > typo: it/that ? > > I think it ought to say "This is the same code as ExecPrepareExpr() > but that is not used because"... > > ------ > > 10. src/backend/replication/pgoutput/pgoutput.c - redundant debug logging? > > + /* Evaluates row filter */ > + result = pgoutput_row_filter_exec_expr(exprstate, ecxt); > + > + elog(DEBUG3, "row filter %smatched", result ? "" : "not "); > > The above debug logging is really only a repeat (with different > wording) of the same information already being logged inside the > pgoutput_row_filter_exec_expr function isn't it? Consider removing the > redundant logging. > > e.g. This is already getting logged by pgoutput_row_filter_exec_expr: > > elog(DEBUG3, "row filter evaluates to %s (isnull: %s)", > DatumGetBool(ret) ? "true" : "false", > isnull ? "true" : "false"); > > > ------ > [1] https://www.postgresql.org/message-id/532a18d8-ce90-4444-8570-8a9fcf09f329%40www.fastmail.com > [2] https://www.postgresql.org/message-id/849ee491-bba3-c0ae-cc25-4fce1c03f105%40enterprisedb.com > [3] https://www.postgresql.org/message-id/532a18d8-ce90-4444-8570-8a9fcf09f329%40www.fastmail.com > > Kind Regards, > Peter Smith. > Fujitsu Australia
Attachment
pgsql-hackers by date: