Re: Add an option to skip loading missing publication to avoid logical replication failure - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: Add an option to skip loading missing publication to avoid logical replication failure
Msg-id CAFiTN-vt6y2VrhA0aHiR-DxbEtESB5uMX4xSZFZnsDpOvLwdRQ@mail.gmail.com
In response to Re: Add an option to skip loading missing publication to avoid logical replication failure  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Add an option to skip loading missing publication to avoid logical replication failure
List pgsql-hackers
On Mon, Mar 10, 2025 at 9:33 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Tue, Mar 4, 2025 at 6:54 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On further thinking, I felt the use of the publications_updated variable
> > is not required; we can use publications_valid itself, which will be set
> > if the publication system table is invalidated. Here is a patch for
> > the same.
> >
>
> The patch relies on the fact that whenever a publication's data is
> invalidated, it will also invalidate all the RelSyncEntries as per
> publication_invalidation_cb. But note that we are discussing removing
> that inefficiency in the thread [1]. So, we should try to rebuild the
> entry when we have skipped the required publication previously.
>
> Apart from this, please consider updating the docs, as mentioned in my
> response to Sawada-San's email.

I'm not sure I fully understand it, but based on your previous email and Vignesh's initial email, IIUC the issue occurs when a publication is created after a certain LSN. When ALTER SUBSCRIPTION ... SET PUBLICATION is executed, the subscriber workers restart and request changes starting from restart_lsn, which is earlier in the WAL than the LSN at which the publication was created. This leads to an error. The fix addresses this by skipping the changes between the subscriber's restart_lsn and the LSN at which the publication was created, and that behavior looks fine to me.
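
To check that we are describing the same scenario, here is a toy Python model of it. This is only a sketch and not PostgreSQL code: WalRecord, decode_from, and skip_missing are hypothetical names, LSNs are plain integers, and "publication exists" is reduced to whether its creation record has already been seen in the WAL.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WalRecord:
    lsn: int       # position in the toy WAL
    kind: str      # "insert" or "create_publication"
    payload: str   # row data, or the publication name


def decode_from(wal, restart_lsn, publication, skip_missing=True):
    """Return the changes a restarted worker would receive (toy model)."""
    publication_exists = False
    sent = []
    for rec in sorted(wal, key=lambda r: r.lsn):
        if rec.kind == "create_publication":
            # From this LSN onward the publication is visible to decoding.
            if rec.payload == publication:
                publication_exists = True
            continue
        if rec.lsn < restart_lsn:
            continue  # before the point the subscriber asked to start from
        if not publication_exists:
            if skip_missing:
                continue  # assumed new behavior: skip instead of failing
            raise LookupError(f'publication "{publication}" does not exist')
        sent.append(rec.payload)
    return sent


wal = [
    WalRecord(100, "insert", "row-a"),
    WalRecord(200, "create_publication", "pub1"),
    WalRecord(300, "insert", "row-b"),
]
print(decode_from(wal, restart_lsn=50, publication="pub1"))
```

With restart_lsn at 50 and the publication created at 200, the change at LSN 100 is skipped and only the change at LSN 300 is sent. Passing skip_missing=False models the failure instead: the same call raises at LSN 100.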

> BTW, I am planning to commit this only on HEAD as this is a behavior
> change. Please let me know if you guys think otherwise.

To me this looks like a bug fix that should be backported, no? Am I missing something?

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
