Re: Add an option to skip loading missing publication to avoid logical replication failure - Mailing list pgsql-hackers

From	Xuneng Zhou
Subject	Re: Add an option to skip loading missing publication to avoid logical replication failure
Date	May 2 13:44:31
Msg-id	CABPTF7XH8Uh+K-x3RMt6fOkK3xwSD2YVQehCfp_hb1TS0abe+w@mail.gmail.com
In response to	Re: Add an option to skip loading missing publication to avoid logical replication failure (vignesh C <vignesh21@gmail.com>)
List	pgsql-hackers

Yeah, thanks for the clarification. I have a basic understanding of it now. My question is whether this is considered a bug or a design defect in the codebase. If so, should we prevent it from occurring in general, not just in this specific test?

vignesh C <vignesh21@gmail.com> wrote:

We have three processes involved in this scenario:
1. A walsender process on the publisher, responsible for decoding and
   sending WAL changes.
2. An apply worker process on the subscriber, which applies the changes.
3. A session executing the ALTER SUBSCRIPTION command.

Due to the asynchronous nature of these processes, the ALTER
SUBSCRIPTION command may not be immediately observed by the apply
worker. Meanwhile, the walsender may process and decode an INSERT
statement.

If the insert targets a table (e.g., tab_3) that does not belong to
the current publication (pub1), the walsender silently skips
replicating the record and advances its decoding position. This
position is sent in a keepalive message to the subscriber, and since
there are no pending transactions to flush, the apply worker reports
it as the latest received LSN.

Later, when the apply worker eventually detects the subscription
change, it restarts, but by then the insert has already been skipped
and is no longer eligible for replay, as the table was not part of the
publication (pub1) at the time of decoding.

This race condition arises because the three processes run
independently and may progress at different speeds due to CPU
scheduling or system load.
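
To make the ordering concrete, here is a minimal toy model of the sequence
described above. It is a sketch, not PostgreSQL code: the Walsender class,
the simulate() helper, the second publication name (pub2), the table tab_1,
and the LSN values are all assumptions made up for illustration. The only
behavior it tries to mirror is that a filtered record still advances the
decoding position, and that the restarted apply worker resumes from the LSN
it last reported.

```python
from dataclasses import dataclass


@dataclass
class Walsender:
    """Toy walsender: filters WAL records by the publications it started with."""
    publications: dict   # publication name -> set of table names (hypothetical)
    subscribed: set      # publication names requested at start-up
    position: int = 0    # last decoded LSN

    def decode(self, wal):
        """Decode records past `position`; return (changes to send, keepalive LSN)."""
        published = set().union(*(self.publications[p] for p in self.subscribed))
        changes = []
        for lsn, table in wal:
            if lsn <= self.position:
                continue
            if table in published:
                changes.append((lsn, table))
            # The position advances even when the record is filtered out.
            self.position = lsn
        return changes, self.position


def simulate(restart_before_decode):
    """Return (applied changes, confirmed LSN) for one ordering of events."""
    publications = {"pub1": {"tab_1"}, "pub2": {"tab_3"}}  # assumed layout
    wal = [(100, "tab_1"), (200, "tab_3")]                 # assumed LSNs
    applied = []
    confirmed_lsn = 0

    # The ALTER SUBSCRIPTION session has already asked for both publications.
    wanted = {"pub1", "pub2"}
    # The apply worker may still be running with the old publication list.
    active = wanted if restart_before_decode else {"pub1"}

    sender = Walsender(publications, active, confirmed_lsn)
    changes, keepalive_lsn = sender.decode(wal)
    applied.extend(changes)
    confirmed_lsn = keepalive_lsn  # nothing pending, so this LSN is reported

    if active != wanted:
        # The worker finally notices the change and restarts; streaming
        # resumes from the LSN it already reported as received.
        sender = Walsender(publications, wanted, confirmed_lsn)
        changes, keepalive_lsn = sender.decode(wal)
        applied.extend(changes)
        confirmed_lsn = keepalive_lsn

    return applied, confirmed_lsn
```

Calling simulate(False) models the race: the tab_3 insert is decoded while
only pub1 is active, so it never reaches the subscriber even after the
restart. Calling simulate(True) models the ordering in which the apply
worker picks up the change first, and both inserts are applied.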
Thoughts?

Regards,
Vignesh
