Re: POC: enable logical decoding when wal_level = 'replica' without a server restart - Mailing list pgsql-hackers

From shveta malik
Subject Re: POC: enable logical decoding when wal_level = 'replica' without a server restart
Date
Msg-id CAJpy0uAr=RfOPwAGavwhzy+v_-dCs=7B3dOXXJsBFKDy-_rLFw@mail.gmail.com
Whole thread Raw
In response to Re: POC: enable logical decoding when wal_level = 'replica' without a server restart  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: POC: enable logical decoding when wal_level = 'replica' without a server restart
List pgsql-hackers
On Fri, Dec 5, 2025 at 12:23 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Dec 4, 2025 at 1:30 AM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Thu, Dec 4, 2025 at 1:54 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > >
> > > I've attached the updated patch that incorporated the review comments
> > > and is rebased to the current HEAD.
> > >
> >
> > Thanks, a few things:
> >
> >
> > 1)
> > +        The system automatically increases the
> > +        effective WAL level to <literal>logical</literal> when
> > creating the first
> > +        logical replication slot, and decreases it back to
> > <literal>replica</literal>
> > +        when dropping the last logical replication slot. The current
> > effective WAL
> > +        level can be monitored through <xref
> > linkend="guc-effective-wal-level"/>
> > +        parameter.
> >
> > Shouldn't we mention about invalidation of slot along with dropping of
> > slot? I have suggested this earlier but I think it got missed to be
> > addressed.
>
> Fixed. Sorry I missed it.
>
> >
> > 2)
> > + else if (sync_replication_slots)
> > + {
> > + /*
> > + * Signal the postmaster to launch the slotsync worker.
> > + *
> > + * XXX: For simplicity, we keep the slotsync worker running
> > + * even after logical decoding is disabled. A future
> > + * improvement can consider starting and stopping the worker
> > + * based on logical decoding status change.
> > + */
> > + kill(PostmasterPid, SIGUSR1);
> > + }
> > + }
> > +
> > + /* Update the status on shared memory */
> > + UpdateLogicalDecodingStatus(logical_decoding, true);
> >
> > I see that we have moved 'Update' post slotsync's start attempt. This
> > leaves a possibility that that slot-sync is started sooner than last
> > 'Update' call and thus slotsync may exit with:
> > replication slot synchronization requires "effective_wal_level" >=
> > "logical" on the primary
> >
> > I see that update was prior to the slotsync step in earlier patches.
> > Why have we moved it to a later stage?
>
> You're right. I've reconsidered the operation order and reverted the
> previous change. On further thought, I think it should not happen to
> decode the STATUS_CHANGE record because when applying the
> STATUS_CHANGE record, we disable logical decoding first and then
> invalidate the slots. There is no window where a logical slot can
> creep in on the standby, and on the primary it cannot restart after
> changing wal_level < replica if there is a pre-existing slot. If my
> understanding is right, we should put a sanity check in xlog_decode().
>
> >
> > 3)
> > + /* Return if someone already started to enable logical decoding */
> >
> > Shall we update:
> > Return if someone already started to enable logical decoding, or if it
> > is already enabled.
>
> Fixed.
>
> I've attached the updated patches. The 0002 patch is a patch to
> address the above review comments. If it looks good, I'll merge these
> changes.
>

Thanks, I will review.

While testing one of the scenarios on v33 patch, I observed a strange
behaviour. I could not reproduce it again. I am not sure if it is due
to the patch. Please have a look once.
The scenario is:

pub: create a slot to enable logical decoding, create table, insert few records
standby: create slot and try to get-changes, get-changes hit some error:

postgres=# SELECT pg_create_logical_replication_slot('slot13',
'pgoutput', false, false, false);
 pg_create_logical_replication_slot
------------------------------------
 (slot13,0/0306D448)
(1 row)

postgres=# select pg_logical_slot_get_binary_changes('slot13', NULL,
NULL, 'proto_version', '4', 'publication_names', 'pub');
ERROR:  cannot query non-catalog table "pg_class" during logical decoding

thanks
Shveta



pgsql-hackers by date:

Previous
From: vignesh C
Date:
Subject: Re: Proposal: Conflict log history table for Logical Replication
Next
From: Shinya Kato
Date:
Subject: Re: Add mode column to pg_stat_progress_vacuum