On Fri, Feb 23, 2024 at 4:45 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
>
> On Fri, Feb 23, 2024 at 09:46:00AM +0000, Zhijie Hou (Fujitsu) wrote:
> >
> > Besides, I'd like to clarify and discuss the behavior of standby_slot_names once.
> >
> > As it stands in the patch, If the slots specified in standby_slot_names are
> > dropped or invalidated, the logical walsender will issue a WARNING and continue
> > to replicate the changes. Another option for this could be to have the
> > walsender pause until the slot in standby_slot_names is re-created or becomes
> > valid again. Does anyone else have an opinion on this matter ?
>
> Good point, I'd vote for: the only reasons not to wait are:
>
> - slots mentioned in standby_slot_names exist and valid and do catch up
> or
> - standby_slot_names is empty
>
> The reason is that setting standby_slot_names to a non empty value means that
> one wants the walsender to wait until the standby catchup. The way to remove this
> intentional behavior should be by changing the standby_slot_names value (not the
> existence or the state of the slot(s) it points too).
>
It seems we already do wait for the case when there is an inactive
slot as per the below code [1] in the patch. So, probably waiting in
other cases is also okay and also as this parameter is a SIGHUP
parameter, users should be easily able to change its value if
required. Do you think it is a good idea to mention this in docs as
well?
I think it is important to raise WARNING as the patch is doing in all
the cases where the slot is not being processed so that users can be
notified and they can take the required action.
[1] -
else if (XLogRecPtrIsInvalid(slot->data.restart_lsn) ||
+ slot->data.restart_lsn < wait_for_lsn)
+ {
+ bool inactive = (slot->active_pid == 0);
+
+ SpinLockRelease(&slot->mutex);
+
+ /* Log warning if no active_pid for this physical slot */
+ if (inactive)
+ ereport(WARNING,
+ errmsg("replication slot \"%s\" specified in parameter %s does not
have active_pid",
+ name, "standby_slot_names"),
+ errdetail("Logical replication is waiting on the standby associated
with \"%s\".",
+ name),
+ errhint("Consider starting standby associated with \"%s\" or amend
standby_slot_names.",
+ name));
+
+ /* Continue if the current slot hasn't caught up. */
+ continue;
--
With Regards,
Amit Kapila.