Re: POC: enable logical decoding when wal_level = 'replica' without a server restart - Mailing list pgsql-hackers
From | shveta malik |
---|---|
Subject | Re: POC: enable logical decoding when wal_level = 'replica' without a server restart |
Date | |
Msg-id | CAJpy0uCcqLirRUtfr6tHP37XsfckcNWfjtxfD1x+ZNVOTxn6kw@mail.gmail.com |
In response to | Re: POC: enable logical decoding when wal_level = 'replica' without a server restart (Masahiko Sawada <sawada.mshk@gmail.com>) |
Responses |
Re: POC: enable logical decoding when wal_level = 'replica' without a server restart
|
List | pgsql-hackers |
On Fri, Sep 26, 2025 at 12:46 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Sep 25, 2025 at 4:57 AM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Tue, Sep 23, 2025 at 3:28 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > >
> > > I've attached the updated patch. It incorporates all comments I got so
> > > far and implements to lazily disable logical decoding. It's used only
> > > when the process tries to disable logical decoding during process
> > > exit.
> > >
> >
> > I am resuming the review now. I agree with the discussion of lazily
> > disabling logical decoding on ERROR or process-exit for temp-slot.
> >
> > Few initial comments:
>
> Thank you for the comments!
>
> >
> > 1)
> > I see that on standby too, during proc-exit, we set 'pending_disable'.
> > But it never resets it, as DisableLogicalDecodingIfNecessary is no-op
> > on standby. And thus the checkpoint keeps on attempting to reset it
> > every time. Do we even need to set it on standby?
> >
> > Logfile has repeated: 'start completing pending logical decoding
> > disable request'
>
> Ugh, I missed that part. I think that standbys should not delegate the
> deactivation to the checkpointer unless the deactivation is actually
> required.
>
> > 2)
> > + ereport(LOG,
> > + (errmsg("skip disabling logical decoding as during process exit")));
> >
> > 'as' not needed.
>
> I've fixed the above two points and attached the new version patch.
>

Thanks.

1)
In the current implementation, if a promotion is in progress
(delay_status_change = true) and a process exits during that time
(causing a temporary slot to be released), we may end up setting
pending_disable on the standby. As a result, the checkpointer has to
wait for the transition to complete before it can proceed with
disabling logical decoding (if needed).

a) This means the checkpoint may be delayed further, depending on how
long it takes for all processes to respond to ProcSignalBarrier().

b) Additionally, consider the case where the promotion fails midway
(after UpdateLogicalDecodingStatusEndOfRecovery). If the checkpointer
still sees RecoveryInProgress and delay_status_change as true, could
it end up waiting indefinitely for the transition to complete? In my
testing, when promotion fails and the startup process exits, it
usually causes the rest of the processes, including the checkpointer,
to terminate as well. So it seems that a dangling pending_disable
state may not actually occur on standby in practice.

I believe scenario (b) can't really happen, but I still wanted to
check with you. I am not sure if (a) is a real concern; what's your
take on it?

2)
As per the discussion in [1], there was a proposal to lazily disable
decoding in both the ERROR and proc-exit scenarios. But I see it
implemented only in the proc-exit scenario. Are we planning to do it
for ERROR as well?

[1]: https://www.postgresql.org/message-id/CAA4eK1JVNbb-OT1PO%3DiOFG1KA__Q83n8cLZoDjF2yA1rZyvCnA%40mail.gmail.com

thanks
Shveta
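
To make the ordering concern in point 1 easier to follow, here is a toy model of how a checkpointer-like step might react to the flags named in the message. It is only a sketch under assumptions: the struct, enum, and function names are hypothetical and are not taken from the patch, the slot counter is an invented stand-in for "if needed", and the real code coordinates through shared memory and ProcSignalBarrier(), none of which is modeled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the shared state discussed above; not the patch's struct. */
typedef struct ToyDecodingCtl
{
    bool recovery_in_progress;  /* models RecoveryInProgress() */
    bool delay_status_change;   /* promotion transition still under way */
    bool pending_disable;       /* disable request deferred at process exit */
    bool decoding_enabled;      /* current logical decoding status */
    int  logical_slots;         /* assumed proxy for "disable is needed" */
} ToyDecodingCtl;

typedef enum ToyResult
{
    TOY_NOTHING_PENDING,    /* no deferred request to complete */
    TOY_MUST_WAIT,          /* transition in progress, cannot proceed yet */
    TOY_NOOP_ON_STANDBY,    /* disabling is a no-op on a standby */
    TOY_STILL_NEEDED,       /* request consumed, decoding stays enabled */
    TOY_DISABLED            /* request consumed, decoding disabled */
} ToyResult;

/* One checkpointer attempt at completing a pending disable request. */
static ToyResult
toy_checkpointer_step(ToyDecodingCtl *ctl)
{
    assert(ctl != NULL);

    if (!ctl->pending_disable)
        return TOY_NOTHING_PENDING;

    /* Cases (a) and (b): the request cannot complete during the transition. */
    if (ctl->recovery_in_progress && ctl->delay_status_change)
        return TOY_MUST_WAIT;

    /*
     * The thread notes that disabling is a no-op on a standby. Whether the
     * flag is cleared here is not stated, so this toy leaves it untouched.
     */
    if (ctl->recovery_in_progress)
        return TOY_NOOP_ON_STANDBY;

    /* Promoted (or primary): consume the request. */
    ctl->pending_disable = false;
    if (ctl->logical_slots > 0)
        return TOY_STILL_NEEDED;

    ctl->decoding_enabled = false;
    return TOY_DISABLED;
}
```

In this model, calling toy_checkpointer_step() repeatedly while both recovery_in_progress and delay_status_change stay true returns TOY_MUST_WAIT every time, which is the indefinite wait asked about in (b). Once both flags are cleared, the next call consumes the request. The model says nothing about how long that takes in practice, which is the open question in (a).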