Re: POC: enable logical decoding when wal_level = 'replica' without a server restart - Mailing list pgsql-hackers
| From | Masahiko Sawada |
|---|---|
| Subject | Re: POC: enable logical decoding when wal_level = 'replica' without a server restart |
| Date | |
| Msg-id | CAD21AoAJXLDX_2Z=w2T-=e5pJ5LSeSZCSBfdPfwBwRayoZDQig@mail.gmail.com Whole thread Raw |
| In response to | Re: POC: enable logical decoding when wal_level = 'replica' without a server restart (Amit Kapila <amit.kapila16@gmail.com>) |
| Responses |
Re: POC: enable logical decoding when wal_level = 'replica' without a server restart
|
| List | pgsql-hackers |
On Thu, Nov 13, 2025 at 1:56 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Wed, Nov 12, 2025 at 3:36 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > I've attached the updated version patch. I addressed all comments I > > got so far, and made some cosmetic changes. > > > > Few comments: > 1. BTW, to accomplish what we do in > UpdateLogicalDecodingStatusEndOfRecovery(), can't we follow a logic > similar to what we do in EnsureLogicalDecodingEnabled()? > > Basically, enable xlog_logical_info and mark SharedRecoveryState = > RECOVERY_STATE_DONE under one spinlock. After the spinlock is > released, wait for other backends to reflect the xlog_logical_info via > WaitForProcSignalBarrier(EmitProcSignalBarrier(PROCSIGNAL_BARRIER_UPDATE_XLOG_LOGICAL_INFO)) > and then enable logical decoding by setting logical_decoding_enabled > and writing a new WAL record. > > Can't this avoid the special handling required in the patch for the > status_change_allowed flag? This flag seems to be primarily required > because recovery_done action and the change of wal_level action are > being performed separately during promotion. If it is feasible to > perform them together, then we don't need an additional state that > says logical decoding status_change_allowed. > > The reason I am trying to explore an alternative to the current way to > handle promotion is that it can eliminate one of the complex parts of > the patch. But, if that is not feasible then we can live with the > current approach. Thank you for the suggestion! How should it work in cases where we need to disable logical decoding at the end of recovery? We accept WAL writes as soon as we mark SharedRecoveryState = RECOVERY_STATE_DONE, so we need to disable logical decoding with a STATUS_CHANGE WAL write beforehand. And then we disable xlog_logical_info and update SharedRecoveryState under one spinlock. Probably we need to do these operations while status_change_inprogress being true to prevent concurrent logical decoding status updates, and clear the flag after recovery fully completes. While it seems to work, we need to verify the side-effects due to moving xlog_logical_info to XLogCtlData since every process needs to retrieve the flag at process startup time. After thinking of this idea more, I think we can simplify the idea more; we do all end-of-recovery operations (updating both xlog_logical_info and logical_decoding_eanbled, WAL writes, and synchronization) while status_change_inprogress being true, and after recovery fully completes we clear the inprogress flag. It's probably okay to perform these operations in any order since WAL writes are still not permitted, and we can prevent concurrent logical decoding status updates until we clear the inprogress flag, which happens after marking SharedRecoveryState = RECOVERY_STATE_DONE. I've drafted this idea. Please see the 0002 patch. > > 2. > We don't need to > + * wait for running transactions to finish as we don't accept any writes > + * yet. We need the wait even if we've not updated the status above as the > + * status have been turned on and off during recovery, having running > + * processes have different status on their local caches. > > In the first sentence, the comment says, we don't need to wait, and in > the second, it says we need the wait... Isn't it confusing? What is > the difference between both waits, if any? Can we improve this > comment? > > IIUC, this comment means that we should wait for all other backends to > reflect the new state for xlog_logical_info and > logical_decoding_enabled. But the way it is written makes it > confusing. Agreed, updated the comments. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
Attachment
pgsql-hackers by date: