Re: POC: enable logical decoding when wal_level = 'replica' without a server restart - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: POC: enable logical decoding when wal_level = 'replica' without a server restart |
Date | |
Msg-id | CAD21AoBb-HPSnAdyr2w1vr9hHX+jckMxSYB2JV6As=10ysingg@mail.gmail.com Whole thread Raw |
In response to | Re: POC: enable logical decoding when wal_level = 'replica' without a server restart (Peter Smith <smithpb2250@gmail.com>) |
List | pgsql-hackers |
On Mon, Oct 20, 2025 at 10:28 PM Peter Smith <smithpb2250@gmail.com> wrote: > > Hi Sawada-San. > > Here are some mostly cosmetic review comments for patch v19 (excluding > test code). Thank you for reviewing the patch! > > ====== > doc/src/sgml/logicaldecoding.sgml > > 1. > + <para> > + If either condition is met, the operational WAL level becomes equivalent > + to <literal>logical</literal>, which can be monitored through > + <xref linkend="guc-effective-wal-level"/> parametr. > + </para> > > Typo? /through/through the/ > > Typo: /parametr/parameter/ > > ~~~ > > 2. > + <para> > + When <varname>wal_level</varname> is set to <literal>replica</literal>, > + logical decoding is automatically activated upon creation of the first > + logical replication slot. This activation process involves several steps > + and requires synchronization among processes, ensuring system-wide > + onsistency. Conversely, when the last logical replication slot is dropped > > Typo: /onsistency/consistency/ > > ~~~ > > 3. > + <caution> > + <para> > + When <varname>wal_level</varname> is set to <literal>replica</literal>, > + dropping the last logical slot may disable logical decoding on primary, > + resulting in slots on standbys being invalidated. > + </para> > + </caution> > > Typo? /on primary/on the primary/ > > ====== > doc/src/sgml/system-views.sgml > > 4. > <para> > <literal>wal_level_insufficient</literal> means that the > - primary doesn't have a <xref linkend="guc-wal-level"/> sufficient to > - perform logical decoding. It is set only for logical slots. > + logical decoding is disabled on primary (See > + <xref linkend="logicaldecoding-explanation-log-dec"/>). It is set > + only for logical slots. > </para> > > Typo? /on primary/on the primary/ > > ====== > src/backend/access/transam/xlog.c > > 5. > + * XXX: We keep the slotsync worker running even after logical > + * decoding is disabled for simplicity. A future improvement > + * can consider starting and stopping the worker based on > + * logical decoding status change. > > First sentence might be better written as: > For simplicity, we keep the slotsync worker running even after logical > decoding is disabled. > > ====== > src/backend/commands/publicationcmds.c > > 6. > + errmsg("logical decoding should be allowed to publish logical changes"), > + errhint("Before creating subscriptions, set \"wal_level\" >= > \"logical\" or create a logical replication slot when \"wal_level\" = > \"replica\"."))); > > 6a. > I still felt this this errmsg wording is inconsistent -- e.g. You have > not used the term "allowed" anywhere else in the patch; mostly logical > decoding is said to be enabled/disabled. > > Maybe: > "to publish logical changes the logical decoding must be enabled" > > Or simply: > "local decoding is not enabled" > > ~~ > > 6b. > I'm not sure that you really needed to say "or create a logical > replication slot when wal_level = replica". Because, the CREATE > SUBSCRIPTION it refers to would be creating that logical slot anyway, > won't it? > > So maybe the errhint only need to say: > Set \"wal_level\" >= \"replica\". > > ====== > src/backend/replication/logical/logicalctl.c > > 7. > + * This module enables dynamic control of logical decoding availability. > + * Logical decoding becomes active under two conditions: when the wal_level > + * parameter is set to 'logical', or when at least one logical replication > + * slot exists with wal_level set to 'replica'. The system disables logical > + * decoding when neither condition is met. Therefore, the dynamic control > + * of logical decoding availability is required only when wal_level is set > + * to 'replica'. With 'logical' WAL level, logical decoding is always enabled > + * whereas with 'minimal' WAL level, it's always disabled. > > I thought it is more easy/consistent to refer always wal_level instead > of sometimes referring to WAL level. > > SUGGESTION (for last sentence) > Logical decoding is always enabled when wal_level='logical' and always > disabled when wal_level='minimal'. > > ~~~ > > 8. > + *------------------------------------------------------------------------- > + */ > +#include "postgres.h" > > Missing blank line before the #includes. > > ~~~ > > StartupLogicalDecodingStatus: > > 9. > + if (last_status) > + { > + LogicalDecodingCtl->xlog_logical_info = true; > + LogicalDecodingCtl->logical_decoding_enabled = true; > + } > > Is the 'if' needed? It could have just said: > LogicalDecodingCtl->xlog_logical_info = last_status; > LogicalDecodingCtl->logical_decoding_enabled = last_status; > > Or, why not just delegate to the other function: > UpdateLogicalDecodingStatus(last_status, false); > > ~~~ > > start_logical_decoding_status_change: > > 10. > + /* > + * Wait for the recovery to complete. Note that even the checkpointer > + * can wait for the recovery to complete here without concerning > + * deadlocks unless the startup process doesn't perform any action > + * that waits for it after calling > + * UpdateLogicalDecodingStatusEndOfRecovery(). > + */ > > Is the wording correct? The "unless ... doesn't" seems backwards. > > ~~~ > > LogicalDecodingStatusChangeAllowed: > > 11. > + /* > + * We check the current status to see if we need to change it. If a status > + * change is in-progress, we need to wait for completion. > + */ > + if (LogicalDecodingCtl->status_change_inprogress) > > Only the 2nd sentence of the comment seems relevant to this code. > AFAICT the 1st sentence belongs to logic that comes later in this > function. > > ~~~ > > RequestDisableLogicalDecoding: > > 12. > +/* > + * Initiate a request for disabling logical decoding. > + * > + * This function expects to be after dropping the possibly-last logical > + * replication slot as it doesn't check the logical slot presence. > + */ > > Typo? /expects to be after/expects to be called after/ > > ~~~ > > 13. > + /* > + * Check if the status change is allowed before initiating a deactivation > + * request, to avoid unnecessary work. > + */ > + if (!LogicalDecodingStatusChangeAllowed()) > + return; > > IMO call this the "disable request" (same as the function comment; > same as the function name) instead of introducing new terminolgy like > "deactivation request". > > ~~~ > > DisableLogicalDecodingIfNecessary: > > 14. > + /* > + * When disabling logical decoding, we need to disable logical decoding > + * first and disable logical information WAL logging in order to ensure > + * that no logical decoding processes WAL records with insufficient > + * information. > + */ > > wording? > > ~~~ > > 15. > + /* > + * Order all running processes to reflect the xlog_logical_info update. > + * Unlike when enabling logical decoding, we don't need to wait for all > + * processes to complete it in this case. We already disabled logical > + * decoding and it's always safe to write logical information to WAL > + * records, even when not strictly required. Therefore, we don't need to > + * wait for all running transactions to finish either. > + */ > > Order makes me think about sorting. Maybe "Tell" instead of "Order" is clearer. > > ~~~ > > UpdateLogicalDecodingStatusEndOfRecovery: > > 16. > + /* > + * We can use logical decoding if we're using 'logical' WAL level or there > + * is at least one logical replication slot. > + */ > + if (wal_level == WAL_LEVEL_LOGICAL || CheckLogicalSlotExists()) > + new_status = true; > + > + if (LogicalDecodingCtl->logical_decoding_enabled != new_status) > + need_wal = true; > > 16a. > The comment would be clearer to say similar to lots of other places: > > SUGGESTION > We can use logical decoding if wal_level=logical, or wal_level=replica > and there is at least one logical replication slot. > > ~ > > 16b. > The conditions are not really needed. Maybe it is easier to write as below: > > new_status = (wal_level == WAL_LEVEL_LOGICAL || CheckLogicalSlotExists()); > need_wal = (LogicalDecodingCtl->logical_decoding_enabled != new_status); All comments make sense to me. I intentionally left some 'if' statement for readability (suggested in 16b) but addressed other comments in the next patch. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
pgsql-hackers by date: