Re: Column Filtering in Logical Replication - Mailing list pgsql-hackers

From Peter Smith
Subject Re: Column Filtering in Logical Replication
Date
Msg-id CAHut+Ps2RgRGbjj90NGK3xuHgHOP5zD_fuRXMhGNVhLW0T_+zA@mail.gmail.com
Whole thread Raw
In response to Re: Column Filtering in Logical Replication  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Column Filtering in Logical Replication
List pgsql-hackers
On Thu, Sep 1, 2022 at 7:53 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Aug 26, 2022 at 7:33 AM Peter Smith <smithpb2250@gmail.com> wrote:
> >
>
> Few comments on both the patches:
> v4-0001*
> =========
> 1.
> Furthermore, if the table uses
> +   <literal>REPLICA IDENTITY FULL</literal>, specifying a column list is not
> +   allowed (it will cause publication errors for <command>UPDATE</command> or
> +   <command>DELETE</command> operations).
>
> This line sounds a bit unclear to me. From this like it appears that
> the following operation is not allowed:
>
> postgres=# create table t1(c1 int, c2 int, c3 int);
> CREATE TABLE
> postgres=# Alter Table t1 replica identity full;
> ALTER TABLE
> postgres=# create publication pub1 for table t1(c1);
> CREATE PUBLICATION
>
> However, what is not allowed is the following:
> postgres=# delete from t1;
> ERROR:  cannot delete from table "t1"
> DETAIL:  Column list used by the publication does not cover the
> replica identity.
>
> I am not sure if we really need this line but if so then please try to
> make it more clear. I think the similar text is present in 0002 patch
> which should be modified accordingly.
>

The "Furthermore…" sentence came from the commit message [1]. But I
agree it seems redundant/ambiguous, so I have removed it (and removed
the same in patch 0002).


> V4-0002*
> =========
> 2.
> However, if a
> +   <firstterm>column list</firstterm> is specified then only the columns named
> +   in the list will be replicated. This means the subscriber-side table only
> +   needs to have those columns named by the column list. A user might choose to
> +   use column lists for behavioral, security or performance reasons.
> +  </para>
> +
> +  <sect2 id="logical-replication-col-list-rules">
> +   <title>Column List Rules</title>
> +
> +   <para>
> +    A column list is specified per table following the table name, and enclosed
> +    by parentheses. See <xref linkend="sql-createpublication"/> for details.
> +   </para>
> +
> +   <para>
> +    When a column list is specified, only the named columns are replicated.
> +    The list order is not important.
>
> It seems like "When a column list is specified, only the named columns
> are replicated." is almost a duplicate of the line in the first para.
> So, I think we can remove it. And if we do so then the second line
> could be changed to something like: "While specifying column list, the
> order of columns is not important."
>

Modified as suggested.

> 3. It seems information about initial table synchronization is
> missing. We copy only columns specified in the column list. Also, it
> would be good to add a Note similar to Row Filter to indicate that
> this list won't be used by pre-15 publishers.
>

Done as suggested. Added a new "Initial Data Synchronization" section
with content similar to that of the Row Filters section.

~~~

Thanks for your review comments.

PSA v5* patches where all the above have been addressed.

------
[1] https://github.com/postgres/postgres/commit/923def9a533a7d986acfb524139d8b9e5466d0a5

Kind Regards,
Peter Smith.
Fujitsu Australia

Attachment

pgsql-hackers by date:

Previous
From: John Naylor
Date:
Subject: Re: [PATCH] Optimize json_lex_string by batching character copying
Next
From: Kyotaro Horiguchi
Date:
Subject: Re: test_decoding assertion failure for the loss of top-sub transaction relationship