Re: BUG #18267: Logical replication bug: data is not synchronized after Alter Publication. - Mailing list pgsql-bugs

From Dilip Kumar
Subject Re: BUG #18267: Logical replication bug: data is not synchronized after Alter Publication.
Date
Msg-id CAFiTN-t4a4PMW0RAG+HXbZxVVAkXoni196mjYGuJZS+Pjgqz8g@mail.gmail.com
In response to BUG #18267: Logical replication bug: data is not synchronized after Alter Publication.  (PG Bug reporting form <noreply@postgresql.org>)
List pgsql-bugs
On Wed, Jan 3, 2024 at 9:51 PM PG Bug reporting form
<noreply@postgresql.org> wrote:
>
> The following bug has been logged on the website:
>
> Bug reference:      18267
> Logged by:          song yutao
> Email address:      sytoptimisprime@163.com
> PostgreSQL version: 15.5
> Operating system:   Linux
> Description:
>
> Hi hackers, I found when insert plenty of data into a table, and add the
> table to publication (through Alter Publication) meanwhile, it's likely that
> the incremental data cannot be synchronized to the subscriber. Here is my
> test method:
>
> 1. On publisher and subscriber, create table for test:
> CREATE TABLE tab_1 (a int);
>
> 2. Setup logical replication:
> on publisher:
>      SELECT pg_create_logical_replication_slot('slot1', 'pgoutput', false, false);
>      CREATE PUBLICATION tap_pub;
> on subscriber:
>      CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr' PUBLICATION tap_pub
>      WITH (enabled = true, create_slot = false, slot_name='slot1')
>
> 3. Perform Insert:
>      for (my $i = 1; $i <= 1000; $i++) {
>          $node_publisher->safe_psql('postgres', "INSERT INTO tab_1 SELECT
> generate_series(1, 1000)");
>      }
>      Each transaction contains 1000 insertion, and 1000 transactions are in
> total.
>
> 4. When performing step 3, add table tab_1  to publication.
>      ALTER PUBLICATION tap_pub ADD TABLE tab_1
>      ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION

I could not reproduce this issue.  Can you tell me exactly which data
were missing for you?  After you add a table to the publication and
refresh the subscription, the table should be identified as part of
the publication.  When the first commit containing changes for that
table is sent, the table's sync state will be found to be not yet
READY, which triggers a sync worker, and via that worker the
subscriber should be able to get all the previous data for that table.
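
To help narrow down which data went missing, something like the
following could be run after step 4.  This is only a rough sketch: it
assumes the names from the report (tap_sub, tab_1) and is not part of
the original test.  The row counts are only comparable once the
inserts have finished and the subscriber has caught up.

```sql
-- On the subscriber: per-table sync state for subscription tap_sub.
-- srsubstate: i = initialize, d = data is being copied,
--             f = finished table copy, s = synchronized, r = ready.
SELECT sr.srrelid::regclass AS table_name, sr.srsubstate
FROM pg_subscription_rel sr
JOIN pg_subscription s ON s.oid = sr.srsubid
WHERE s.subname = 'tap_sub';

-- On publisher and subscriber: compare totals once the state is 'r'.
SELECT count(*) FROM tab_1;
```

If tab_1 reaches state 'r' and the counts still differ, that would
point at changes skipped during decoding rather than at the initial
table sync.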

> The root cause of the problem is as follows:
> pgoutput relies on the invalidation mechanism to validate publications. When
> walsender decoding an Alter Publication transaction, catalog caches are
> invalidated at once. Furthermore, since pg_publication_rel is modified,
> snapshot changes are added to all transactions currently being decoded. For
> other transactions, catalog caches have been invalidated. However, it is
> likely that the snapshot changes have not yet been decoded. In pgoutput
> implementation, these transactions query the system table pg_publication_rel
> to determine whether to publish changes made in transactions. In this case,
> catalog tuples are not found because snapshot has not been updated. As a
> result, changes in transactions are considered not to be published, and
> subsequent data cannot be synchronized.
>
> I think it's necessary to add invalidations to other transactions after
> adding a snapshot change to them.
> Therefore, I submitted a patch for this bug.

I think you missed attaching the patch, as Amit also pointed out.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


