Home > mailing lists

Re: PG DOCS - logical replication filtering - Mailing list pgsql-hackers

From	Peter Smith
Subject	Re: PG DOCS - logical replication filtering
Date	April 8, 2022 09:12:07
Msg-id	CAHut+PvDFWGUOBugYMtcXhAiViZu+Q6P-kxw2+U83VOGx0Osdg@mail.gmail.com Whole thread Raw
In response to	Re: PG DOCS - logical replication filtering (Amit Kapila <amit.kapila16@gmail.com>)
Responses	Re: PG DOCS - logical replication filtering (Amit Kapila <amit.kapila16@gmail.com>) Re: PG DOCS - logical replication filtering (Peter Smith <smithpb2250@gmail.com>)
List	pgsql-hackers

Tree view

On Wed, Apr 6, 2022 at 8:58 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Mar 24, 2022 at 11:48 AM Peter Smith <smithpb2250@gmail.com> wrote:
> >
>
> Review comments:
> ===============
> 1.
> +    The <literal>WHERE</literal> clause expression is evaluated with the same
> +    role used for the replication connection. i.e. the role specified in the
> +    <literal>CONNECTION</literal> clause of the <xref
> linkend="sql-createsubscription"/>.
>
> Can we use () around i.e. sentence? It looks bit odd to me now.
> The <literal>WHERE</literal> clause expression is evaluated with the
> same role used for the replication connection (i.e., the role
> specified in the <literal>CONNECTION</literal> clause of the <xref
> linkend="sql-createsubscription"/>).

OK. Modified in v6 [1].

> 2.
> +   <para>
> +    Whenever an <command>UPDATE</command> is processed, the row filter
> +    expression is evaluated for both the old and new row (i.e. before
> +    and after the data is updated).
>
> Can we write the below part slightly differently?
> Before:
> (i.e. before and after the data is updated).
> After:
> (i.e the data before and after the update).

OK. Modified in v6 [1].

> 3.
> +   <para>
> +    Whenever an <command>UPDATE</command> is processed, the row filter
> +    expression is evaluated for both the old and new row (i.e. before
> +    and after the data is updated).
> +   </para>
> +
> +   <para>
> +    If both evaluations are <literal>true</literal>, it replicates the
> +    <command>UPDATE</command> change.
> +   </para>
> +
> +   <para>
> +    If both evaluations are <literal>false</literal>, it doesn't replicate
> +    the change.
> +   </para>
>
> I think we can combine these three separate paragraphs.

The first sentence is the explanation, then there are the 3 separate
“If …” cases mentioned. It doesn’t seem quite right to group just the
first 2 “if…” cases into one paragraph, while leaving the 3rd one
separated. OTOH combining everything into one big paragraph seems
worse. Please confirm if you still want this changed.

> 4.
> +   <para>
> +    If the publication contains a partitioned table, the publication parameter
> +    <literal>publish_via_partition_root</literal> determines which row filter
> +    is used.
> +    <itemizedlist>
> +
> +     <listitem>
> +      <para>
> +       If <literal>publish_via_partition_root</literal> is
> <literal>false</literal>
> +       (default), each <emphasis>partition's</emphasis> row filter is used.
> +      </para>
> +     </listitem>
> +
> +     <listitem>
> +      <para>
> +       If <literal>publish_via_partition_root</literal> is
> <literal>true</literal>,
> +       the <emphasis>root partitioned table's</emphasis> row filter is used.
> +      </para>
> +     </listitem>
> +
> +    </itemizedlist>
> +   </para>
>
> I think we can combine this into single para as Euler had in his version.

We can do it, but I am not sure if your review was looking at the
rendered HTML or just looking at the SGML text? IMO using bullets here
ended up being more readable (it is also consistent with other bullet
usages on this page). Please confirm if you still want this changed.

> 5.
> +   <note>
> +    <para>
> +     Publication <literal>publish</literal> operations are ignored
> when copying pre-existing table data.
> +    </para>
> +   </note>
> +
> +   <note>
> +    <para>
> +     If the subscriber is in a release prior to 15, copy pre-existing data
> +     doesn't use row filters even if they are defined in the publication.
> +     This is because old releases can only copy the entire table data.
> +    </para>
> +   </note>
>
> I don't see the need for the first <note> here, the second one seems
> to convey it.

Well, the 2nd note is only about compatibility with older versions
doing the subscribe. But the 1st note is not version-specific at all.
It is saying that the COPY does not take the “publish” option into
account. If you know of some other docs already mentioning this subtle
behaviour of the COPY then I can remove this note and just
cross-reference to the other place. But I did not know anywhere this
is already mentioned, so that is why I wrote the note about it.

> 6.
> +   <para>
> +    Create some tables to be used in the following examples.
> +<programlisting>
> +testpub=# CREATE TABLE t1(a int, b int, c text, primary key(a,c));
> +CREATE TABLE
> +testpub=# CREATE TABLE t2(d int primary key, e int, f int);
> +CREATE TABLE
> +testpub=# CREATE TABLE t3(g int primary key, h int, i int);
> +CREATE TABLE
> +</programlisting>
> +   </para>
> +
> +   <para>
> +    Create some publications.
> +   </para>
> +   <para>
> +    - notice publication <literal>p1</literal> has 1 table with a row filter.
> +   </para>
> +   <para>
> +    - notice publication <literal>p2</literal> has 2 tables, one without a row
> +    filter, and one with a row filter.
> +   </para>
> +   <para>
> +    - notice publication <literal>p3</literal> has 2 tables, both
> with row filters.
> +<programlisting>
> +testpub=# CREATE PUBLICATION p1 FOR TABLE t1 WHERE (a > 5 AND c = 'NSW');
> +CREATE PUBLICATION
> +testpub=# CREATE PUBLICATION p2 FOR TABLE t1, t2 WHERE (e = 99);
> +CREATE PUBLICATION
> +testpub=# CREATE PUBLICATION p3 FOR TABLE t2 WHERE (d = 10), t3 WHERE (g = 10);
> +CREATE PUBLICATION
> +</programlisting>
> +   </para>
>
> I think it is better to use the corresponding content from Euler's version.

OK. Modified in v6 [1]. I changed the primary key syntax to be the
same as Euler had. I also moved all the 'notice' parts to below the
corresponding example and modified the text.

>
> 7.
> +   <para>
> +    The PSQL command <command>\d</command> shows what publications the table is
> +    a member of, as well as that table's row filter expression (if defined) in
> +    those publications.
> +   </para>
> +   <para>
> +    - notice table <literal>t1</literal> is a member of 2 publications, but
> +    has a row filter only in <literal>p1</literal>.
> +   </para>
> +   <para>
> +    - notice table <literal>t2</literal> is a member of 2 publications, and
> +    has a different row filter in each of them.
>
> This looks unnecessary to me. Let's remove this part.

I thought something is needed to explain/demonstrate how the user can
know the different row filters for all the publications that the same
table is a member of. Otherwise, the user has to guess (??) what
publications are using their table and then use \dRp+ to dig at all
those publications to find the row filters.

I can remove all this part from the Examples, but I think at least the
\d should still be mentioned somewhere. IMO I should put that "PSQL
commands"  section back (which existed in an earlier version) and just
add a sentence about this. Then this examples part can be removed.
What do you think?

> 8.
> +   <para>
> +    - notice that only the rows satisfying the <literal>t1 WHERE</literal>
> +    clause of publication <literal>p1</literal> are replicated.
>
> Again, it is better to use Euler's version for this and at the place,
> he had in his version. Similarly, adjust other notices if any like
> this one.

OK. Modified in v6 [1]. Every “notice” has now been moved to follow
the associated example (how Euler had them)


> 9. I suggest adding an example for partition tables showing how the
> row filter is used based on the 'publish_via_partition_root'
> parameter.

OK - I am working on this and will add it in a future patch version.

------
[1] https://www.postgresql.org/message-id/CAHut%2BPvyxMedYY-jHaT9YSfEPHv0jU2-CZ8F_nPvhuP0b955og%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia

pgsql-hackers by date:

From: Peter Smith
Date: 08 April 2022, 08:53:38
Subject: Re: PG DOCS - logical replication filtering

From: Laetitia Avrot
Date: 08 April 2022, 09:36:13
Subject: Re: pg_dump new feature: exporting functions only. Bad or good idea ?

Re: PG DOCS - logical replication filtering - Mailing list pgsql-hackers

Previous

Next