Re: POC: enable logical decoding when wal_level = 'replica' without a server restart - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: POC: enable logical decoding when wal_level = 'replica' without a server restart
Date
Msg-id CAD21AoD5aONyxZHGG5-gQhQnAMuF9dByLn0+treF8cRT06bqkA@mail.gmail.com
In response to Re: POC: enable logical decoding when wal_level = 'replica' without a server restart  (shveta malik <shveta.malik@gmail.com>)
List pgsql-hackers
On Tue, Jul 15, 2025 at 10:55 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Tue, Jul 15, 2025 at 10:37 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > I've attached updated patches that implement the idea we've discussed.
> > The patches still need to be polished but the implemented ideas seem
> > good. Feedback is very welcome.
> >
>
> Thank you for the patches. I have only tried patch 001 so far; a few concerns:
>
> 1)
> + else if (xlrec.wal_level == WAL_LEVEL_REPLICA &&
> +          pg_atomic_read_u32(&ReplicationSlotCtl->n_inuse_logical_slots) == 0)
> + {
> +     /*
> +      * Disable the logical decoding if there is no in-use logical slot
> +      * on the standby.
> +      */
> +     UpdateLogicalDecodingStatus(false);
> + }
>
> Due to the above logic, changing wal_level to replica on the primary may
> end up disabling logical decoding on the standby, even if logical decoding
> is still enabled on the primary due to the existence of a slot.
>
> Steps:
> a) Create a slot on primary, but no slots on standby.
> b) Switch wal_level to logical on primary by doing a restart.
> c) Now switch wal_level back to replica on primary. This will end up
> disabling logical decoding on standby and slot creation will fail on
> standby as well.
>
> 2)
> In the same code, why don't we invalidate slots as we do when we
> receive XLOG_LOGICAL_DECODING_STATUS_CHANGE?

Good catch. I think decreasing wal_level to 'replica' should not
directly affect the logical decoding status.
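
To make the scenario in steps a) to c) concrete, here is a small
standalone C model. It is only a sketch: every toy_/Toy name is
hypothetical and none of this is code from the patch. The first redo
function mirrors the quoted else-if branch; the second is one possible
reading of "should not directly affect", where the standby changes its
status only on an explicit status-change record.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the real wal_level values. */
typedef enum { TOY_WAL_LEVEL_REPLICA, TOY_WAL_LEVEL_LOGICAL } ToyWalLevel;

typedef struct
{
    bool     decoding_enabled;       /* standby's logical decoding status */
    unsigned n_inuse_logical_slots;  /* in-use logical slots on the standby */
} ToyStandby;

/* Models only the quoted branch: a parameter change to 'replica' disables
 * decoding when the standby itself has no in-use logical slot. */
void toy_redo_param_change_quoted(ToyStandby *s, ToyWalLevel primary_wal_level)
{
    if (primary_wal_level == TOY_WAL_LEVEL_REPLICA &&
        s->n_inuse_logical_slots == 0)
        s->decoding_enabled = false;
}

/* Assumed alternative: a wal_level decrease leaves the status untouched. */
void toy_redo_param_change_decoupled(ToyStandby *s, ToyWalLevel primary_wal_level)
{
    (void) s;
    (void) primary_wal_level;
}

/* Assumed alternative: only an explicit status record changes the status. */
void toy_redo_status_change(ToyStandby *s, bool enabled)
{
    s->decoding_enabled = enabled;
}

/* Replays steps a) to c) and returns the standby's status afterwards. */
bool toy_standby_after_step_c(bool use_quoted_logic)
{
    ToyStandby s = { .decoding_enabled = false, .n_inuse_logical_slots = 0 };

    /* a) + b): the primary has a slot and decoding is enabled. */
    toy_redo_status_change(&s, true);

    /* c): the primary lowers wal_level to 'replica'; its slot still exists,
     * so in this model no status-change record follows. */
    if (use_quoted_logic)
        toy_redo_param_change_quoted(&s, TOY_WAL_LEVEL_REPLICA);
    else
        toy_redo_param_change_decoupled(&s, TOY_WAL_LEVEL_REPLICA);

    return s.decoding_enabled;
}
```

In this model the quoted logic leaves the standby with decoding
disabled after step c), while the decoupled variant keeps it enabled
for as long as the primary does not report a status change.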

Related to this issue, I've considered getting rid of 'logical' from
wal_level altogether. Given that the effective WAL level is increased
and decreased automatically upon slot creation and deletion, I think
we could remove 'logical' from wal_level. One scenario where users
would need to take additional action is when they offload logical
replication to the standby server. In this case, the user would have
to enable logical decoding on the primary server before creating a
logical slot on the standby. If such additional work is acceptable, we
can remove 'logical', and I think that would be reasonable.
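
As a rough illustration of that idea, the following standalone C
sketch models an effective level that follows the number of logical
slots on the primary. All names are hypothetical, and it assumes that
creating a logical slot on the primary is what enables logical
decoding there; the exact mechanism is not settled in this thread.

```c
#include <stdbool.h>

/* Hypothetical model of the primary; not a structure from the patch. */
typedef struct
{
    unsigned n_logical_slots;   /* logical slots existing on the primary */
    bool     decoding_enabled;  /* effective logical decoding status */
} ToyPrimary;

/* Creating the first logical slot raises the effective level. */
void toy_create_logical_slot(ToyPrimary *p)
{
    p->n_logical_slots++;
    p->decoding_enabled = true;
}

/* Dropping the last logical slot lowers the effective level again. */
void toy_drop_logical_slot(ToyPrimary *p)
{
    if (p->n_logical_slots > 0)
        p->n_logical_slots--;
    if (p->n_logical_slots == 0)
        p->decoding_enabled = false;
}

/* The extra step for users who offload to the standby: a logical slot can
 * be created there only once decoding is enabled on the primary. */
bool toy_standby_can_create_logical_slot(const ToyPrimary *p)
{
    return p->decoding_enabled;
}
```

With no slot on the primary, toy_standby_can_create_logical_slot()
returns false, which corresponds to the additional user action
described above.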


> 3)
> + EnsureLogicalDecodingEnabled();
>
> I do not understand the use of the above in synchronize_one_slot().
> Since 'EnsureLogicalDecodingEnabled' is a no-op on a standby, it will
> do nothing here.

Right, will remove it.

>
> 4)
> - if (wal_level < WAL_LEVEL_LOGICAL)
> -     ereport(ERROR,
> -             errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> -             errmsg("replication slot synchronization requires \"wal_level\" >= \"logical\""));
> + if (!IsLogicalDecodingEnabled())
> +     ereport(elevel,
> +             errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> +
>
> Is the change from 'ERROR' to  'elevel' intentional? With this change,
> slotsync worker will keep running even if logical decoding is not
> enabled on standby (or primary) yet.

Yes, it is intentional, because this function is called by the
postmaster. But I can see your point, so I will deal with it in the
next version of the patch. I think the slotsync worker needs to exit
when logical decoding gets disabled.
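
A minimal standalone C sketch of that split follows. It is an
illustration under my own assumptions, not the next patch version: the
toy_/Toy names and the result codes are hypothetical. The check takes
the level from its caller, so a postmaster-like caller can pass a
non-fatal level, while the worker loop stops at the first cycle in
which logical decoding is found disabled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical severity levels; only their ordering matters here. */
typedef enum { TOY_LOG, TOY_ERROR } ToyElevel;

typedef enum
{
    TOY_CHECK_OK,        /* prerequisite satisfied */
    TOY_CHECK_REPORTED,  /* problem logged, caller keeps running */
    TOY_CHECK_RAISED     /* problem raised as an error */
} ToyCheckResult;

/* The caller chooses elevel: a postmaster-like caller would pass TOY_LOG
 * so that the check never raises an error in that process. */
ToyCheckResult toy_check_slotsync_prereq(bool decoding_enabled, ToyElevel elevel)
{
    if (decoding_enabled)
        return TOY_CHECK_OK;
    return (elevel >= TOY_ERROR) ? TOY_CHECK_RAISED : TOY_CHECK_REPORTED;
}

/* Worker loop: re-check the status on every cycle and exit as soon as
 * logical decoding is disabled. Returns the number of completed cycles. */
size_t toy_slotsync_worker(const bool *decoding_status, size_t ncycles)
{
    size_t done = 0;

    for (size_t i = 0; i < ncycles; i++)
    {
        if (toy_check_slotsync_prereq(decoding_status[i], TOY_ERROR) != TOY_CHECK_OK)
            break;          /* exit instead of continuing to run */
        done++;             /* one synchronization cycle completed */
    }
    return done;
}
```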

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com


