Re: PG DOCS - logical replication filtering - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: PG DOCS - logical replication filtering
Date
Msg-id CAA4eK1JPyVoc1dUjeqbPd9D0_uYxWyyx-8fcsrgiZ5Tpr9OAuw@mail.gmail.com
Whole thread Raw
In response to Re: PG DOCS - logical replication filtering  (Peter Smith <smithpb2250@gmail.com>)
Responses Re: PG DOCS - logical replication filtering
List pgsql-hackers
On Fri, Apr 8, 2022 at 11:42 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> On Wed, Apr 6, 2022 at 8:58 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
>
> > 3.
> > +   <para>
> > +    Whenever an <command>UPDATE</command> is processed, the row filter
> > +    expression is evaluated for both the old and new row (i.e. before
> > +    and after the data is updated).
> > +   </para>
> > +
> > +   <para>
> > +    If both evaluations are <literal>true</literal>, it replicates the
> > +    <command>UPDATE</command> change.
> > +   </para>
> > +
> > +   <para>
> > +    If both evaluations are <literal>false</literal>, it doesn't replicate
> > +    the change.
> > +   </para>
> >
> > I think we can combine these three separate paragraphs.
>
> The first sentence is the explanation, then there are the 3 separate
> “If …” cases mentioned. It doesn’t seem quite right to group just the
> first 2 “if…” cases into one paragraph, while leaving the 3rd one
> separated. OTOH combining everything into one big paragraph seems
> worse. Please confirm if you still want this changed.
>

Yeah, I think we should have something like what Euler's version had
and maybe keep a summary section from the current patch.

> > 4.
> > +   <para>
> > +    If the publication contains a partitioned table, the publication parameter
> > +    <literal>publish_via_partition_root</literal> determines which row filter
> > +    is used.
> > +    <itemizedlist>
> > +
> > +     <listitem>
> > +      <para>
> > +       If <literal>publish_via_partition_root</literal> is
> > <literal>false</literal>
> > +       (default), each <emphasis>partition's</emphasis> row filter is used.
> > +      </para>
> > +     </listitem>
> > +
> > +     <listitem>
> > +      <para>
> > +       If <literal>publish_via_partition_root</literal> is
> > <literal>true</literal>,
> > +       the <emphasis>root partitioned table's</emphasis> row filter is used.
> > +      </para>
> > +     </listitem>
> > +
> > +    </itemizedlist>
> > +   </para>
> >
> > I think we can combine this into single para as Euler had in his version.
>
> We can do it, but I am not sure if your review was looking at the
> rendered HTML or just looking at the SGML text? IMO using bullets here
> ended up being more readable (it is also consistent with other bullet
> usages on this page). Please confirm if you still want this changed.
>

Fair enough. We can keep this part as it is.

> > 5.
> > +   <note>
> > +    <para>
> > +     Publication <literal>publish</literal> operations are ignored
> > when copying pre-existing table data.
> > +    </para>
> > +   </note>
> > +
> > +   <note>
> > +    <para>
> > +     If the subscriber is in a release prior to 15, copy pre-existing data
> > +     doesn't use row filters even if they are defined in the publication.
> > +     This is because old releases can only copy the entire table data.
> > +    </para>
> > +   </note>
> >
> > I don't see the need for the first <note> here, the second one seems
> > to convey it.
>
> Well, the 2nd note is only about compatibility with older versions
> doing the subscribe. But the 1st note is not version-specific at all.
> It is saying that the COPY does not take the “publish” option into
> account. If you know of some other docs already mentioning this subtle
> behaviour of the COPY then I can remove this note and just
> cross-reference to the other place. But I did not know anywhere this
> is already mentioned, so that is why I wrote the note about it.
>

I don't see the need to say about general initial sync (COPY) behavior
here. It is already defined at [1]. If we want to enhance, we can do
that as a separate patch to make changes where the initial sync is
explained. I am not sure that is required though.

> >
> > 7.
> > +   <para>
> > +    The PSQL command <command>\d</command> shows what publications the table is
> > +    a member of, as well as that table's row filter expression (if defined) in
> > +    those publications.
> > +   </para>
> > +   <para>
> > +    - notice table <literal>t1</literal> is a member of 2 publications, but
> > +    has a row filter only in <literal>p1</literal>.
> > +   </para>
> > +   <para>
> > +    - notice table <literal>t2</literal> is a member of 2 publications, and
> > +    has a different row filter in each of them.
> >
> > This looks unnecessary to me. Let's remove this part.
>
> I thought something is needed to explain/demonstrate how the user can
> know the different row filters for all the publications that the same
> table is a member of. Otherwise, the user has to guess (??) what
> publications are using their table and then use \dRp+ to dig at all
> those publications to find the row filters.
>
> I can remove all this part from the Examples, but I think at least the
> \d should still be mentioned somewhere. IMO I should put that "PSQL
> commands"  section back (which existed in an earlier version) and just
> add a sentence about this. Then this examples part can be removed.
> What do you think?
>

I think the way it is changed in the current patch by moving that
explanation down seems okay to me.

I feel in the initial "Row Filters" and "Row Filter Rules" sections,
we don't need to have separate paragraphs. I think the same is pointed
out by Alvaro as well.

--
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: Amit Langote
Date:
Subject: Re: generic plans and "initial" pruning
Next
From: Zhihong Yu
Date:
Subject: Re: generic plans and "initial" pruning