Re: POC: enable logical decoding when wal_level = 'replica' without a server restart - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: POC: enable logical decoding when wal_level = 'replica' without a server restart
Date
Msg-id CAA4eK1LUmrNWa=JGdedzKD=gM3wq0ZsfKREQu1G8NqqWCpZPgA@mail.gmail.com
Whole thread Raw
In response to Re: POC: enable logical decoding when wal_level = 'replica' without a server restart  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: POC: enable logical decoding when wal_level = 'replica' without a server restart
List pgsql-hackers
On Tue, Oct 28, 2025 at 5:55 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> I have noticed a few minor points:
>

Some more comments:
1.
-  if (RecoveryInProgress())
-  {
-    /*
-     * This check may have race conditions, but whenever
-     * XLOG_PARAMETER_CHANGE indicates that wal_level has changed, we
-     * verify that there are no existing logical replication slots. And to
-     * avoid races around creating a new slot,
-     * CheckLogicalDecodingRequirements() is called once before creating
-     * the slot, and once when logical decoding is initially starting up.
-     */
-    if (GetActiveWalLevelOnStandby() < WAL_LEVEL_LOGICAL)
-      ereport(ERROR,
-          (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
-           errmsg("logical decoding on standby requires \"wal_level\"
>= \"logical\" on the primary")));
-  }
+  /* CheckSlotRequirements() has already checked if wal_level >= 'replica' */
+
+  /*
+   * Check if logical decoding is available on standbys. Note that
+   * LogicalDecodingStatusChangeAllowed() returning false implies the
+   * recovery is in-progress. See the comments atop logicalctl.c for
+   * details.
+   */
+  if (!IsLogicalDecodingEnabled() && !LogicalDecodingStatusChangeAllowed())


In this change, code and comments without the patch are easy to
follow. IIUC, this is primarily to allow slot creation during the
promotion. How about something like:
"Check if logical decoding is available on standbys. Typically, when
running a standby, RecoveryInProgress() returning true implies that
LogicalDecodingStatusChangeAllowed() is false. However, during
promotion, there's a brief transitional phase where
RecoveryInProgress() remains true even though
LogicalDecodingStatusChangeAllowed() has already turned true.

In this window, logical decoding enable/disable operations are
permitted on standby, anticipating its transition to primary. The
actual wait for recovery completion is handled within
start_logical_decoding_status_change()."

2
+ * Standby servers inherit the logical decoding and logical WAL writing status
+ * from the primary server.

It sounds a bit odd to say that standby inherits logical WAL writing
status from primary, as there is no WAL writing happening on standby.

3.
other processes might attempt to
+ * enable or disable logical decoding before recovery fully completes

Instead of enable or disable, it would be better to say toggle logical …

4.
+/*
+ * Check the shared memory state and return true if logical information WAL
+ * logging is enabled.
+ */

In this "...logical information WAL logging is enabled." sounds a bit
odd. How about something like: "Returns true if logical WAL logging is
enabled based on the shared memory state."?

5.
EnsureLogicalDecodingEnabled() and DisableLogicalDecodingIfNecessary()
+ * should be used instead if there could be concurrent processes doing writes
+ * or logical decoding.
+ */
+void
+UpdateLogicalDecodingStatus(bool new_status, bool need_lock)
…

Can we ensure what is written in the above comment by an Assert?

6.
+DisableLogicalDecodingIfNecessary(void)
+{
+ bool pending_disable;
+
+ if (wal_level != WAL_LEVEL_REPLICA)
+ return;
+
+ /*
+ * Sanity check as we cannot disable logical decoding while holding a
+ * logical slot.
+ */
+ Assert(!MyReplicationSlot);

Is this assertion relevant now, considering we disable
logical_decoding only via checkpointer?

7.
- if (SlotIsLogical(s))
- nlogicalslots++;
+ if (SlotIsLogical(s) && s->data.invalidated == RS_INVAL_NONE)
+ n_valid_logicalslots++;

Is it okay to check the invalidated flag without the spinlock? If so,
it is better to write a comment as to why it is safe.

8.
@@ -136,6 +137,12 @@ create_logical_replication_slot(char *name, char *plugin,
    temporary ? RS_TEMPORARY : RS_EPHEMERAL, two_phase,
    failover, false);

+ /*
+ * Ensure the logical decoding is enabled before initializing the logical
+ * decoding context.
+ */
+ EnsureLogicalDecodingEnabled();
…

EnsureLogicalDecodingEnabled(void)
{
Assert(MyReplicationSlot);

if (wal_level != WAL_LEVEL_REPLICA)
return;

/* Prepare and start the activation process if it's disabled */
if (!start_logical_decoding_status_change(true))
return;

I see that EnsureLogicalDecodingEnabled() sometimes returns without
enabling decoding (like when wal_level is not a replica). It is
possible that the same is not possible in this code flow, but won't it
be better to get the return value and assert if this function returns
false?

--
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: Dagfinn Ilmari Mannsåker
Date:
Subject: Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions
Next
From: Xuneng Zhou
Date:
Subject: Re: Optimize SnapBuildPurgeOlderTxn: use in-place compaction instead of temporary array