Re: POC: enable logical decoding when wal_level = 'replica' without a server restart - Mailing list pgsql-hackers

From shveta malik
Subject Re: POC: enable logical decoding when wal_level = 'replica' without a server restart
Msg-id CAJpy0uDYUrRVAD0FZcFdv5BRGRGRaLLOhsT3wsWVn=EsQ9YfKQ@mail.gmail.com
In response to Re: POC: enable logical decoding when wal_level = 'replica' without a server restart  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: POC: enable logical decoding when wal_level = 'replica' without a server restart
List pgsql-hackers
On Fri, Oct 17, 2025 at 11:09 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Fri, Oct 17, 2025 at 5:07 AM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Fri, Oct 17, 2025 at 12:47 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Thu, Oct 16, 2025 at 9:07 PM shveta malik <shveta.malik@gmail.com> wrote:
> > > >
> > > > On Fri, Oct 17, 2025 at 8:55 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >
> > > > > On Thu, Oct 16, 2025 at 11:10 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > > >
> > > > > > On Thu, Oct 16, 2025 at 1:41 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > > > >
> > > > > >
> > > > > > Using PMSIGNAL_BACKGROUND_WORKER_CHANGE sounds like a misuse, since the
> > > > > > slotsync worker is not a background worker and logical decoding
> > > > > > activation is not related to bgworkers.
> > > > > >
> > > > > > An alternative idea is to launch the slotsync worker if wal_level
> > > > > > value on the standby is >=replica, that is, always launch it on the
> > > > > > standby if sync_replication_slots is on. Even with and without the
> > > > > > patch, we don't shutdown the slotsync worker even if logical decoding
> > > > > > gets disabled on the standby.
> > > > > >
> > > > >
> > > > > Are you talking about the case when wal_level on primary has reduced
> > > > > below logical and user will get the following message on standby:
> > > > > "logical decoding on standby requires \"wal_level\" >= \"logical\" on
> > > > > the primary"? If so, the slight difference in this case is that
> > > > > standby still has wal_level logical.
> > > > >
> > > >
> > > > I believe what Sawada-san meant is that even when effective_wal_level
> > > > = replica on a standby, we should still allow the slot-sync worker to
> > > > start if 'sync_replication_slots' is enabled. This is because we
> > > > currently do not stop the worker when effective_wal_level on the
> > > > standby changes from logical to replica, so allowing it to start in
> > > > this case maintains consistent behavior. That said, my preference is
> > > > to not start the slot-sync worker if effective_wal_level is less than
> > > > logical. As I understand, this is already the behavior implemented in
> > > > the current patch.
> > >
> > > Exactly. Thank you for clarifying my comment.
> > >
> > > >
> > > > Regarding the scenario where effective_wal_level changes from logical
> > > > to replica on a standby, my vote is to explicitly shut down the
> > > > slot-sync worker in such cases. I don't see any benefit in keeping it
> > > > running. But this can be handled in a separate patch as it is not
> > > > directly concerned with this patch.
> > >
> > > If the last logical slot on the primary is a failover slot,
> > > STATUS_CHANGE with logical_decoding=false could reach the standby
> > > before the slotsync worker drops the corresponding slot. In this case,
> > > if we shutdown the slotsync upon replaying that WAL record, the synced
> > > (and invalidated) slot could remain. It might be one potential benefit
> > > that we keep the slotsync worker running even when wal_level='replica'
> > > (at least until one more synchronization cycle is done).
> > >
> >
> > Okay, I will think more on this.
> >
> > > >
> > > > Next is, when effective_wal_level changes from replica to logical,
> > > > should we wake up the postmaster to immediately start the slot-sync
> > > > worker? My vote is yes, but if implementing this introduces too much
> > > > complexity, especially considering it's a rare scenario, we could
> > > > leave it as is. In that case, the slot-sync worker would still start,
> > > > but possibly with a delay of up to 1-2 minutes when the postmaster is
> > > > sleeping.
> > >
> > > After checking other codes, I found that we simply send SIGUSR1 to the
> > > postmaster in pg_promote(). I think we can use it.
> > >
> >
> > Okay.
>
> I've attached the updated patches.
>
> Regards,
>

Thank you for the patch. The slotsync worker behaviour seems to be fixed now.

I want to discuss the create-publication case, which currently gives
this warning:

postgres=# create publication pub1 for all tables;
WARNING:  logical decoding should be allowed to publish logical changes
HINT:  Before creating subscriptions, set "wal_level" >= "logical" or
create a logical replication slot when "wal_level" = "replica".
CREATE PUBLICATION

But is this warning really necessary during publication creation?

There are two scenarios:

a) When a subscription is created with create_slot=true: In this case,
logical decoding is enabled automatically on the publisher, allowing
the subscription to run smoothly. Without the patch, CREATE
SUBSCRIPTION used to fail with:
postgres=# CREATE subscription sub1 connection '...3' publication pub1;
ERROR:  could not create replication slot "sub1": ERROR:  logical
decoding requires "wal_level" >= "logical"

b) When a subscription is created with create_slot=false: Here, the
logical replication worker fails to start.

With patch:
LOG:  logical replication apply worker for subscription "sub1" has started
ERROR:  could not start WAL streaming: ERROR:  replication slot "sub1"
does not exist

Without patch:
LOG:  logical replication apply worker for subscription "sub1" has started
ERROR:  could not start WAL streaming: ERROR:  logical decoding
requires "wal_level" >= "logical"

The difference in the error is due to the change in
CheckLogicalDecodingRequirements().
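
To make the ordering concrete, here is a rough toy model of case b. It is
only a sketch and not the patch's code: every name in it (ToyPublisher,
toy_start_streaming, the "patched" flag) is hypothetical, and it assumes
that with the patch the requirements check no longer rejects
wal_level = 'replica', so the missing slot becomes the first error hit.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names exist in PostgreSQL. */
typedef enum { TOY_WAL_REPLICA, TOY_WAL_LOGICAL } ToyWalLevel;

typedef struct
{
    ToyWalLevel wal_level;   /* configured wal_level on the publisher */
    bool        patched;     /* true = assumed behaviour with the patch */
    bool        slot_exists; /* does the subscription's slot exist? */
} ToyPublisher;

/*
 * Returns the error the apply worker would see when it tries to start
 * streaming, or NULL if streaming could start.
 */
const char *
toy_start_streaming(const ToyPublisher *p)
{
    /* Without the patch, the wal_level requirement is rejected first. */
    if (!p->patched && p->wal_level < TOY_WAL_LOGICAL)
        return "logical decoding requires \"wal_level\" >= \"logical\"";

    /* Assumption: with the patch, the slot lookup is the first failure. */
    if (!p->slot_exists)
        return "replication slot does not exist";

    return NULL;
}
```

With wal_level = 'replica' and no slot, the unpatched model returns the
wal_level error and the patched model returns the missing-slot error,
which matches the two logs above.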

So if we modify case b to report the same error as before the patch,
then the WARNING during CREATE PUBLICATION is justified. Otherwise, if we
keep the new error for case b (which also seems fine to me, assuming
slot creation automatically enables logical decoding), then IMO the
WARNING during CREATE PUBLICATION should be removed as well. Thoughts?

thanks
Shveta


