Re: POC: enable logical decoding when wal_level = 'replica' without a server restart - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: POC: enable logical decoding when wal_level = 'replica' without a server restart |
Date | |
Msg-id | CAA4eK1JeRrZESnD7qrAErss33tBLM=rbmycCAp52c066at6dxw@mail.gmail.com |
In response to | Re: POC: enable logical decoding when wal_level = 'replica' without a server restart (Masahiko Sawada <sawada.mshk@gmail.com>) |
List | pgsql-hackers |
On Fri, Oct 3, 2025 at 12:43 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Oct 1, 2025 at 10:48 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > The other point to consider is that during promotion after
> > UpdateLogicalDecodingStatusEndOfRecovery(), we have multiple things
> > that seem to be necessary to perform before backends are allowed to
> > write. For example, refer to the comment: "If any of the critical GUCs
> > have changed, log them before we allow backends to write WAL." I
> > think the key thing is that before we set state DB_IN_PRODUCTION in
> > ControlFile and mark SharedRecoveryState as RECOVERY_STATE_DONE,
> > backends shouldn't be allowed to write WAL. If we want to take an
> > exception for writing a WAL during slot_creation before the
> > RECOVERY_STATE_DONE is set, we should analyze and explain in comments
> > why it is okay to take this exception.
>
> Agreed.
>
> As the discussion is becoming more complex, let me summarize our
> discussion about the delay_status_change flag and lazy behavior.
>

Thanks for the summary.

> The delay_status_change flag was created to handle a specific timing
> issue: there's a brief window where backend processes can
> enable/disable logical decoding but cannot write the STATUS_CHANGE
> record. This occurs because after the startup process updates the
> logical decoding status (in
> UpdateLogicalDecodingStatusEndOfRecovery()), backend processes cannot
> write WAL records until the startup process sets SharedRecoveryState to
> RECOVERY_STATE_DONE. The idea is to delay any logical decoding status
> changes during this window until WAL writing is permitted system-wide.
> An alternative idea being discussed is to allow an exception for
> STATUS_CHANGE records, letting them be written even during this
> window. While this alternative is simpler and technically feasible, it
> could be risky as it breaks the general rule that 'backends cannot
> write WAL records until recovery completes.'
>
> When the process exits or raises an ERROR, the process needs to clean
> up temporary and ephemeral slots, which might disable logical
> decoding. This deactivation process may involve waiting - either for
> concurrent activation/deactivation processes to finish or due to the
> delay_status_change flag (if implemented). However, waiting during
> user-level cleanup (in before_shmem_exit callbacks) isn't ideal since
> the process blocks all interrupts. To address this, we introduced lazy
> behavior, which delegates the deactivation process to the checkpointer,
> allowing it to disable logical decoding asynchronously. This way, the
> deactivation during user-level cleanup only needs to disable logical
> decoding in shared memory and send signals.
>
> While we've discussed that if we don't use the idea of the
> delay_status_change flag we don't need the lazy behavior either, I find
> that we still need lazy behavior to handle waits during concurrent
> status changes. Moreover, since we need lazy behavior anyway, the
> benefits of implementing the exception-based approach seem limited.
>

Right, it would be ideal if we could follow the same idea (either in a
lazy way or by introducing a wait while dropping a slot) to disable
decoding in all cases. However, during user operations (say slot_drop
or subscription drop), doing it lazily has the downside that it can take
some time before we can change the effective_wal_level and disable
decoding, especially when the checkpointer (or whichever other
background process we choose to do this work) is already busy with other
things. Dropping the last logical slot shouldn't be frequent, so it
should be okay to take such an exception to keep the code simple. As it
is difficult to predict all kinds of use cases, I think we can keep a
note in the code that we can improve it by waiting during drop for most
cases, if there is a workload that is impacted by lazy disabling of the
decoding.

--
With Regards,
Amit Kapila.
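
For readers following the thread, the lazy disabling discussed above can be pictured with a small stand-alone model. This is only a sketch under stated assumptions, not code from the patch: the struct, its fields, and the three functions are hypothetical names invented for illustration, shared memory is reduced to a plain struct, and signalling the checkpointer is reduced to a counter. It shows the trade-off raised in the reply: dropping the last slot never waits, but the effective WAL level only changes once the checkpointer gets around to the request.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for shared-memory state; not the patch's struct. */
typedef struct DecodingCtl
{
    int  n_logical_slots;    /* logical slots that currently exist */
    bool decoding_enabled;   /* flag a backend may flip immediately */
    bool effective_logical;  /* models effective_wal_level = logical */
    bool disable_pending;    /* request left for the checkpointer */
    int  wakeups_sent;       /* models signals sent to the checkpointer */
} DecodingCtl;

/* Backend: creating a logical slot enables decoding and cancels any
 * disable request that has not been processed yet (a simplification). */
void
create_logical_slot(DecodingCtl *ctl)
{
    ctl->n_logical_slots++;
    ctl->decoding_enabled = true;
    ctl->effective_logical = true;
    ctl->disable_pending = false;
}

/* Backend: dropping a slot never waits. When the last slot goes away it
 * only flips the in-memory flag, records a request, and "signals". */
void
drop_logical_slot(DecodingCtl *ctl)
{
    assert(ctl->n_logical_slots > 0);

    if (--ctl->n_logical_slots == 0)
    {
        ctl->decoding_enabled = false;
        ctl->disable_pending = true;
        ctl->wakeups_sent++;
    }
}

/* Checkpointer: completes the deactivation asynchronously. Returns true
 * only if it actually lowered the effective level. */
bool
checkpointer_process_request(DecodingCtl *ctl)
{
    if (!ctl->disable_pending)
        return false;

    ctl->disable_pending = false;

    /* Re-check: a slot may have been created since the request. */
    if (ctl->n_logical_slots > 0)
        return false;

    ctl->effective_logical = false;
    return true;
}
```

In this model, the state between drop_logical_slot() and checkpointer_process_request() is exactly the delay described above: decoding_enabled is already false while effective_logical is still true. A busy checkpointer lengthens that interval, which is the case the proposed note in the code would cover.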