Re: Add an option to skip loading missing publication to avoid logical replication failure - Mailing list pgsql-hackers

From vignesh C
Subject Re: Add an option to skip loading missing publication to avoid logical replication failure
Date
Msg-id CALDaNm2gfsxQL7VrurNnJffCBcuyrZQ1LXDOhf9FY57n8=8cVw@mail.gmail.com
Whole thread Raw
In response to Re: Add an option to skip loading missing publication to avoid logical replication failure  (Dilip Kumar <dilipbalaut@gmail.com>)
Responses Re: Add an option to skip loading missing publication to avoid logical replication failure
List pgsql-hackers
On Thu, 13 Mar 2025 at 09:18, Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> Thanks looks good to me.
>
> While looking at the patch, I have a few comments/questions
>
> + if (pub)
> + result = lappend(result, pub);
> + else
> + {
> + /*
> + * When executing 'ALTER SUBSCRIPTION ... SET PUBLICATION', the
> + * apply worker continues using the existing replication slot and
> + * origin after restarting. If the replication origin is not
> + * updated before the restart, the WAL start location may point to
> + * a position before the specified publication exists, causing
> + * persistent apply worker restarts and errors.
> + *
> + * This ensures that the publication is skipped if it does not
> + * exist and is loaded when the corresponding WAL record is
> + * encountered.
> + */
> + ereport(WARNING,
> + errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> + errmsg("skipped loading publication: %s", pubname),
> + errhint("If the publication already exists, ignore it as it will be
> loaded upon reaching the corresponding WAL record; otherwise, create
> it."));
> + }
>
> This comment focuses on a specific use case regarding the problem with
> 'ALTER SUBSCRIPTION ... SET PUBLICATION,' but in reality, we are
> addressing a more general case where the user is trying to SET
> PUBLICATION or even CREATE SUBSCRIPTION, and some publications are
> missing. Wouldn't it be better to rephrase the comment?

How about a comment something like below:
/*
* In 'ALTER SUBSCRIPTION ... ADD/SET PUBLICATION' and
* 'CREATE SUBSCRIPTION', if the replication origin is not updated
* before the worker exits, the WAL start location might point to a
* position before the publication's WAL record. This can lead to
* persistent apply worker restarts and errors.
*
* Additionally, dropping a subscription's publication should not
* disrupt logical replication.
*
* This ensures that a missing publication is skipped and loaded
* when its corresponding WAL record is encountered.
*/
ereport(WARNING,
errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("skipped loading publication: %s", pubname),
errhint("If the publication is missing, create and refresh it.
Otherwise, wait for the slot to reach the WAL record, then refresh"));

> 2. + errhint("If the publication already exists, ignore it as it will
> be loaded upon reaching the corresponding WAL record; otherwise,
> create it."));
>
> Is this hint correct? This is a question rather than a comment: When
> we reach a particular WAL where the publication was created, will the
> publication automatically load, or does the user need to REFRESH the
> publications?

Users need to refresh the publication in case the relation is not
already added to pg_subscription_rel and apply incremental changes.
How about an error hint like:
"If the publication is missing, create and refresh it. Otherwise, wait
for the slot to reach the WAL record for the created publication, then
refresh"

Regards,
Vignesh



pgsql-hackers by date:

Previous
From: David Rowley
Date:
Subject: Re: [PoC] Reducing planning time when tables have many partitions
Next
From: Peter Smith
Date:
Subject: Re: Commitfest Manager for March