Re: POC: enable logical decoding when wal_level = 'replica' without a server restart - Mailing list pgsql-hackers

From shveta malik
Subject Re: POC: enable logical decoding when wal_level = 'replica' without a server restart
Date
Msg-id CAJpy0uCQ60e93+7d+sDObSG426uv5Ohy+sJoxr-z=shVM+EHwg@mail.gmail.com
Whole thread Raw
In response to Re: POC: enable logical decoding when wal_level = 'replica' without a server restart  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: POC: enable logical decoding when wal_level = 'replica' without a server restart
List pgsql-hackers
On Wed, Nov 19, 2025 at 11:04 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Nov 19, 2025 at 1:21 AM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Wed, Nov 19, 2025 at 1:52 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Mon, Nov 17, 2025 at 10:49 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Tue, Nov 18, 2025 at 1:02 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > >
> > > > > On Sun, Nov 16, 2025 at 9:51 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >
> > > > > >
> > > > > > In v26-0002-FIXUP-remove-status_change_allowed-flag, by using
> > > > > > status_change_inprogress, we ensure that no backend is allowed to
> > > > > > toggle the logical_wal/decoding status till startup process marks the
> > > > > > recovery state as recovery_done. I am trying to think what problem
> > > > > > this part of design prevents. I have considered the following
> > > > > > scenarios:
> > > > > >
> > > > > > Scenario-1:
> > > > > > 1. Startup process enables logical_wal and logical_decoding. Writes
> > > > > > WAL record for it
> > > > > > 2. Backend disables logical_decoding, writes WAL for it, and disables
> > > > > > logical_wal.
> > > > > > 3. Startup process sets recovery_done and allows wal_writes
> > > > > >
> > > > > > Say, instead of using status_change_inprogress to prevent doing
> > > > > > step-2, if we had used recovery_in_progress kind of flag then how is
> > > > > > it possible for backends to create any problem for the current node or
> > > > > > cascaded standbys? I think the only way a problem can happen is if we
> > > > > > write the WAL to disable_logical decoding after any backend could have
> > > > > > written a non-logical WAL information record. Can that happen if we
> > > > > > use the recovery_in_progress flag to prevent disable of logical_wal?
> > > > > > If so, how?
> > > > >
> > > > > The main idea of holding status_change_inprogress until the recovery
> > > > > end is to prevent concurrent toggling the logical decoding status. In
> > > > > your scenario, IIUC backends cannot write any WAL yet at step-2 since
> > > > > it's allowed at step-3. It would end up with a FATAL error actually.
> > > > > One alternative is to make processes call LocalSetXLogInsertAllowed()
> > > > > so that they can write WAL even during recovery, but I don't use it as
> > > > > I'm concerned that it could lead to other problems. On the other hand,
> > > > > we cannot let the backend to disable logical_decoding and logical_wal
> > > > > without WAL warite at step-2 because otherwise the cascaded standby
> > > > > won't disable logical decoding.
> > > > >
> > > >
> > > > Why can't we postpone disabling logical WAL, decoding to the next
> > > > cycle of checkpointer when RecoveryInProgress() is true without
> > > > relying on status_change_inprogress? So, this will lead to a window
> > > > where there are no logical slots but still the effective_wal_level is
> > > > logical. However, this could be true even without considering this
> > > > problem because the checkpointer can take some time to disable the
> > > > logical WAL and decoding.
> > > >
> > > > The other problematic case to consider is during promotion, the
> > > > startup has marked logical decoding as disabled but not yet marked
> > > > recovery-done. Then the backend created a slot and returned without
> > > > marking logical decoding as enabled due to relying on
> > > > RecoveryInProgress(). Then the start-up marked Recovery-Done. Now we
> > > > have a logical slot present, but logical decoding is disabled. I think
> > > > we can simply disallow the creation of a logical slot in this window
> > > > (where effective_wal_level is 'replica' and RecoveryInProgress() is
> > > > true).
> > >
> > > It sounds reasonable. Backends are already prohibited from creating
> > > logical slots when effective_wal_level is 'replica' and
> > > RecoveryInProgress() is true, so it should not be a problem.
> > >
> > > > If the above is feasible and sounds reasonable, then we don't even
> > > > need the status_change_inprogress flag, at least not during the
> > > > start-up flow.
> > >
> > > I've updated the patch based on the above suggestion. I believe we
> > > still need the status_change_inprogress flag when not in recovery but
> > > in the new version I don't use the flag during end-of-recovery action.
> > >
> >
> > Thanks for the patch.
> >
> > I had a look at the new changes. It appears that the startup process
> > now marks the state as pending_disable rather than directly disabling
> > logical decoding itself. In this setup, I think the situation Amit
> > described will no longer occur— the case where the startup process
> > disables logical decoding, and during the brief window before
> > recovery-done is set, a slot-creation attempt is issued and gets
> > blocked.
> >
> > Currently, in the scenario where the primary has one slot and the
> > standby has zero slots, if I promote the standby and then try to
> > create another slot in parallel (after
> > UpdateLogicalDecodingStatusEndOfRecovery but before marking
> > RECOVERY_STATE_DONE), the slot creation proceeds (instead of being
> > blocked) and eventually hangs in DecodingContextFindStartpoint()—it
> > sleeps in read_local_xlog_page_guts() until the promotion completes.
> >
> > IIUC, Amit’s original idea was to disable logical decoding within the
> > startup process itself rather than doing it lazily and block the
> > slot-creation in parallel.
>
> Indeed. I misunderstood the idea. The attached new version patch
> should implement the idea correctly.
>

Thanks. I will review.

On reconsideration, it looks like the 'status_change_inprogress' flag
can be removed entirely.

The scenario where EnsureLogicalDecodingEnabled() is trying to enable
logical decoding while DisableLogicalDecodingIfNecessary() is
attempting to disable it cannot occur in practice. This is because a
drop-logical-slot operation (for the only slot present) would fail in
parallel if the slot is already held by a backend in
EnsureLogicalDecodingEnabled().

Considering the reverse scenario where
DisableLogicalDecodingIfNecessary() is disabling logical decoding
while EnsureLogicalDecodingEnabled() is trying to enable it — the
disable flow involves the following steps:
--Check for the existence of logical slots.
--Disable system-wide logical decoding.
--Write the WAL.
--Disable xlog_logical_info.
--Emit a PROCSIGNAL_BARRIER for other processes to observe the new
value.

It seems feasible to make the first four steps atomic by only
releasing the LogicalDecodingControlLock after step 4. In this case,
even if a slot creation occurs in parallel between the lock release
(after step 4) and EmitProcSignalBarrier(), there would be no risk of
ending up with incorrect values. Thoughts?

thanks
Shveta



pgsql-hackers by date:

Previous
From: Akshay Joshi
Date:
Subject: Re: [PATCH] Add pg_get_database_ddl() function to reconstruct CREATE DATABASE statement
Next
From: Akshay Joshi
Date:
Subject: Re: [PATCH] Add pg_get_policy_ddl() function to reconstruct CREATE POLICY statement