Re: POC: enable logical decoding when wal_level = 'replica' without a server restart - Mailing list pgsql-hackers
From | shveta malik |
---|---|
Subject | Re: POC: enable logical decoding when wal_level = 'replica' without a server restart |
Date | |
Msg-id | CAJpy0uDYUrRVAD0FZcFdv5BRGRGRaLLOhsT3wsWVn=EsQ9YfKQ@mail.gmail.com |
In response to | Re: POC: enable logical decoding when wal_level = 'replica' without a server restart (Masahiko Sawada <sawada.mshk@gmail.com>) |
Responses | Re: POC: enable logical decoding when wal_level = 'replica' without a server restart |
List | pgsql-hackers |
On Fri, Oct 17, 2025 at 11:09 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Fri, Oct 17, 2025 at 5:07 AM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Fri, Oct 17, 2025 at 12:47 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Thu, Oct 16, 2025 at 9:07 PM shveta malik <shveta.malik@gmail.com> wrote:
> > > >
> > > > On Fri, Oct 17, 2025 at 8:55 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >
> > > > > On Thu, Oct 16, 2025 at 11:10 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > > >
> > > > > > On Thu, Oct 16, 2025 at 1:41 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > > > >
> > > > > >
> > > > > > Using PMSIGNAL_BACKGROUND_WORKER_CHANGE sounds mis-using since the
> > > > > > slotsync worker is not a background worker nor logical decoding
> > > > > > activation is not related to bgworkers.
> > > > > >
> > > > > > An alternative idea is to launch the slotsync worker if wal_level
> > > > > > value on the standby is >=replica, that is, always launch it on the
> > > > > > standby if sync_replication_slots is on. Even with and without the
> > > > > > patch, we don't shutdown the slotsync worker even if logical decoding
> > > > > > gets disabled on the standby.
> > > > > >
> > > > >
> > > > > Are you talking about the case when wal_level on primary has reduced
> > > > > below logical and user will get the following message on standby:
> > > > > "logical decoding on standby requires \"wal_level\" >= \"logical\" on
> > > > > the primary"? If so, the slight difference in this case is that
> > > > > standby still has wal_level logical.
> > > > >
> > > >
> > > > I believe what Sawada-san meant is that even when effective_wal_level
> > > > = replica on a standby, we should still allow the slot-sync worker to
> > > > start if 'sync_replication_slots' is enabled. This is because we
> > > > currently do not stop the worker when effective_wal_level on the
> > > > standby changes from logical to replica, so allowing it to start in
> > > > this case maintains consistent behavior. That said, my preference is
> > > > to not start the slot-sync worker if effective_wal_level is less than
> > > > logical. As I understand, this is already the behavior implemented in
> > > > the current patch.
> > >
> > > Exactly. Thank you for clarifying my comment.
> > >
> > > >
> > > > Regarding the scenario where effective_wal_level changes from logical
> > > > to replica on a standby, my vote is to explicitly shut down the
> > > > slot-sync worker in such cases. I don't see any benefit in keeping it
> > > > running. But this can be handled in a separate patch as it is not
> > > > directly concerned with this patch.
> > >
> > > If the last logical slot on the primary is a failover slot,
> > > STATUS_CHANGE with logical_decoding=false could reach the standby
> > > before the slotsync worker drops the corresponding slot. In this case,
> > > if we shutdown the slotsync upon replaying that WAL record, the synced
> > > (and invalidated) slot could remain. It might be one potential benefit
> > > that we keep the slotsync worker running even when wal_level='replica'
> > > (at least until one more synchronization cycle is done).
> > >
> >
> > Okay, I will think more on this.
> >
> > > >
> > > > Next is, when effective_wal_level changes from replica to logical,
> > > > should we wake up the postmaster to immediately start the slot-sync
> > > > worker? My vote is yes, but if implementing this introduces too much
> > > > complexity, especially considering it's a rare scenario, we could
> > > > leave it as is. In that case, the slot-sync worker would still start,
> > > > but possibly with a delay of up to 1-2 minutes when the postmaster is
> > > > sleeping.
> > >
> > > After checking other codes, I found that we simply send SIGUSR1 to the
> > > postmaster in pg_promote(). I think we can use it.
> > >
> >
> > Okay.
>
> I've attached the updated patches.
>
> Regards,
>

Thank you for the patch. Slotsync worker behaviour seems to be fixed now.

I want to discuss the create-publication case, which currently gives
this warning:

postgres=# create publication pub1 for all tables;
WARNING: logical decoding should be allowed to publish logical changes
HINT: Before creating subscriptions, set "wal_level" >= "logical" or create a logical replication slot when "wal_level" = "replica".
CREATE PUBLICATION

But is this warning really necessary during publication creation?
There are two scenarios:

a) When a subscription is created with create_slot=true:
In this case, logical decoding is enabled automatically on the
publisher, allowing the subscription to run smoothly. Without the
patch, CREATE SUBSCRIPTION used to fail with:

postgres=# CREATE subscription sub1 connection '...3' publication pub1;
ERROR: could not create replication slot "sub1": ERROR: logical decoding requires "wal_level" >= "logical"

b) When a subscription is created with create_slot=false:
Here, the logical replication worker fails to start.

With patch:
LOG: logical replication apply worker for subscription "sub1" has started
ERROR: could not start WAL streaming: ERROR: replication slot "sub1" does not exist

Without patch:
LOG: logical replication apply worker for subscription "sub1" has started
ERROR: could not start WAL streaming: ERROR: logical decoding requires "wal_level" >= "logical"

The difference in the error is due to the change in
CheckLogicalDecodingRequirements().

So if we modify case b to display the same error as before the patch,
then the WARNING during create-pub is justified. Otherwise, if we keep
the error of case b as it is (which also seems fine to me, assuming
slot creation will automatically enable logical decoding), then IMO
the WARNING during create-pub should also be removed. Thoughts?

thanks
Shveta
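
For anyone skimming the thread, the argument about the WARNING can be summarized as a small lookup table. This is only an illustrative sketch in Python, not PostgreSQL code: the names `Outcome`, `OUTCOMES`, and `warning_still_informative` are made up for this note, and the messages are copied from the outputs reported above. It assumes the publisher runs with wal_level = 'replica' in every case.

```python
from typing import Dict, NamedTuple, Tuple


class Outcome(NamedTuple):
    succeeds: bool
    message: str


# Keys are (patched, create_slot). Hypothetical summary table; the messages
# are taken from the outputs quoted in this email, nothing else is measured.
OUTCOMES: Dict[Tuple[bool, bool], Outcome] = {
    (False, True): Outcome(
        False,
        'could not create replication slot "sub1": ERROR: '
        'logical decoding requires "wal_level" >= "logical"',
    ),
    (True, True): Outcome(True, "logical decoding is enabled automatically"),
    (False, False): Outcome(
        False,
        'could not start WAL streaming: ERROR: '
        'logical decoding requires "wal_level" >= "logical"',
    ),
    (True, False): Outcome(
        False,
        'could not start WAL streaming: ERROR: '
        'replication slot "sub1" does not exist',
    ),
}


def warning_still_informative(outcomes: Dict[Tuple[bool, bool], Outcome]) -> bool:
    """True if, with the patch, some scenario still fails because of wal_level."""
    return any(
        patched and not outcome.succeeds and "wal_level" in outcome.message
        for (patched, _create_slot), outcome in outcomes.items()
    )


if __name__ == "__main__":
    print(warning_still_informative(OUTCOMES))
```

With the table as reported, the function returns False: no patched scenario fails with a wal_level error, which is the case for dropping the WARNING. Replacing the `(True, False)` entry with the pre-patch error makes it return True, which corresponds to the option of restoring the old error and keeping the WARNING.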