Re: PG DOCS - logical replication filtering - Mailing list pgsql-hackers

From Euler Taveira
Subject Re: PG DOCS - logical replication filtering
Date
Msg-id 1c78ebd4-b38d-4b5d-a6ea-d583efe87d97@www.fastmail.com
Whole thread Raw
In response to Re: PG DOCS - logical replication filtering  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: PG DOCS - logical replication filtering  (Amit Kapila <amit.kapila16@gmail.com>)
Re: PG DOCS - logical replication filtering  (Peter Smith <smithpb2250@gmail.com>)
List pgsql-hackers
On Mon, Apr 11, 2022, at 7:40 AM, Amit Kapila wrote:
On Mon, Apr 11, 2022 at 12:39 PM Peter Smith <smithpb2250@gmail.com> wrote:
>
> On Fri, Apr 8, 2022 at 4:12 PM Peter Smith <smithpb2250@gmail.com> wrote:
>
> OK. Added in v7 [1]
>

Thanks, this looks mostly good to me. I didn't like the new section
added for partitioned tables examples, so I removed it and added some
explanation of the tests. I have slightly changed a few other lines. I
am planning to commit the attached tomorrow unless there are more
comments.
I have a few comments.

> If a publication publishes UPDATE and/or DELETE operations ...
>

If we are talking about operations, use lowercase like I suggested in the
previous version. See the publish parameter [1]. If we are talking about
commands, the word "operations" should be removed or replaced by "commands".
The "and/or" isn't required, "or" implies the same. If you prefer "operations",
my suggestion is to adjust the last sentence that says "only INSERT" to "only
<italic>insert</italic> operation".

> If the old row satisfies the row filter expression (it was sent to the
> subscriber) but the new row doesn't, then from a data consistency perspective
> the old row should be removed from the subscriber.
>

I suggested small additions to this sentence. We should at least add a comma
after "then" and "perspective". The same applies to the next paragraph too.

Regarding the "Summary", it is redundant as I said before. We already described
all 4 cases. I vote to remove it. However, if we would go with a table, I
suggest to improve the formatting: add borders, "old row" and "new row" should
be titles, "no match" and "match" should be represented by symbols ("" and "X",
for example), and "Case X" column should be removed because this extra column
adds nothing.

Regarding the "Partitioned Tables", I suggested to remove the bullets. We
generally have paragraphs with a few sentences. I tend to avoid short
paragraphs. I also didn't like the 2 bullets using almost the same words. In
the previous version, I suggested something like

If the publication contains a partitioned table, the parameter
publish_via_partition_root determines which row filter expression is used. If
the parameter publish_via_partition_root is true, the row filter expression
associated with the partitioned table is used. Otherwise, the row filter
expression associated with the individual partition is used.

> will be copied. (see Section 31.3.6 for details).

There is an extra period after "copied" that should be removed. The other
option is to remove the parentheses and have another sentence for "See ...".

> those expressions get OR'ed together

I prefer plain English here. This part of the sentence is also redundant with
the rest of the sentence so I suggested to remove it in the previous version.

rows that satisfy any of the row filter expressions is replicated.

instead of

those expressions get OR'ed together, so that rows satisfying any of the
expressions will be replicated.

I also didn't use a different paragraph (like I suggested in the previous
version) because we are talking about the same thing.

The bullets in the example sounds strange, that's why I suggested removing it.
We can even combine the 3 sentences into one paragraph.

> The PSQL command \dRp+ shows the row filter expressions (if defined) for each
> table of the publications.

Well, we don't use PSQL (upppercase) in the documentation. I suggested a
different sentence:

The psql shows the row filter expressions (if defined) for each table.

> The PSQL command \d shows what publications the table is a member of, as well
> as that table's row filter expression (if defined) in those publications.

It is not logical replication business to explain about psql behavior. If, for
some reason, someone decided to change it, this section will contain obsolete
information. The psql output is fine, the explanation is not. That's why I
suggested this modification.

> Only the rows satisfying the t1 WHERE clause of publication p1 are
> replicated.

Again, no bullets. This sentence is useful *before* the psql output. We're
presenting the results. Let's follow the pattern, describe the action and show
the results.

> The UPDATE replicates the change as normal.

This sentence should be *before* the psql output (see my previous version).

Regarding the new examples (for partitioned tables), shouldn't we move the
parent / child definitions to the beginning of the Examples section? It seems
confusing use the same code snippet to show repeated table definitions
(publisher and subscriber). I checked fast and after a few seconds I realized
that the example is not wrong but the database name has a small difference (one
letter "s" x "p"). The publication and subscription definitions are fine there.

I think reusing the same tables and publication introduces complexity.
Shouldn't we just use different tables and publication to provide an "easy"
example? It would avoid DROP PUBLICATION, ALTER SUBSCRIPTION and TRUNCATE.

> Do the inserts same as before.

We should indicate the node (publisher) to be clear.



--
Euler Taveira

pgsql-hackers by date:

Previous
From: Zheng Li
Date:
Subject: Re: Support logical replication of DDLs
Next
From: Zheng Li
Date:
Subject: Re: Support logical replication of DDLs