Re: POC: enable logical decoding when wal_level = 'replica' without a server restart - Mailing list pgsql-hackers
From | Peter Smith |
---|---|
Subject | Re: POC: enable logical decoding when wal_level = 'replica' without a server restart |
Date | |
Msg-id | CAHut+PtB4zZrMKm_vUd-JO29Uq2479unx1KR=hGu=sQxfSVXJg@mail.gmail.com Whole thread Raw |
In response to | Re: POC: enable logical decoding when wal_level = 'replica' without a server restart (Masahiko Sawada <sawada.mshk@gmail.com>) |
List | pgsql-hackers |
Hi Sawada-San. Here are some mostly cosmetic review comments for patch v19 (excluding test code). ====== doc/src/sgml/logicaldecoding.sgml 1. + <para> + If either condition is met, the operational WAL level becomes equivalent + to <literal>logical</literal>, which can be monitored through + <xref linkend="guc-effective-wal-level"/> parametr. + </para> Typo? /through/through the/ Typo: /parametr/parameter/ ~~~ 2. + <para> + When <varname>wal_level</varname> is set to <literal>replica</literal>, + logical decoding is automatically activated upon creation of the first + logical replication slot. This activation process involves several steps + and requires synchronization among processes, ensuring system-wide + onsistency. Conversely, when the last logical replication slot is dropped Typo: /onsistency/consistency/ ~~~ 3. + <caution> + <para> + When <varname>wal_level</varname> is set to <literal>replica</literal>, + dropping the last logical slot may disable logical decoding on primary, + resulting in slots on standbys being invalidated. + </para> + </caution> Typo? /on primary/on the primary/ ====== doc/src/sgml/system-views.sgml 4. <para> <literal>wal_level_insufficient</literal> means that the - primary doesn't have a <xref linkend="guc-wal-level"/> sufficient to - perform logical decoding. It is set only for logical slots. + logical decoding is disabled on primary (See + <xref linkend="logicaldecoding-explanation-log-dec"/>). It is set + only for logical slots. </para> Typo? /on primary/on the primary/ ====== src/backend/access/transam/xlog.c 5. + * XXX: We keep the slotsync worker running even after logical + * decoding is disabled for simplicity. A future improvement + * can consider starting and stopping the worker based on + * logical decoding status change. First sentence might be better written as: For simplicity, we keep the slotsync worker running even after logical decoding is disabled. ====== src/backend/commands/publicationcmds.c 6. + errmsg("logical decoding should be allowed to publish logical changes"), + errhint("Before creating subscriptions, set \"wal_level\" >= \"logical\" or create a logical replication slot when \"wal_level\" = \"replica\"."))); 6a. I still felt this this errmsg wording is inconsistent -- e.g. You have not used the term "allowed" anywhere else in the patch; mostly logical decoding is said to be enabled/disabled. Maybe: "to publish logical changes the logical decoding must be enabled" Or simply: "local decoding is not enabled" ~~ 6b. I'm not sure that you really needed to say "or create a logical replication slot when wal_level = replica". Because, the CREATE SUBSCRIPTION it refers to would be creating that logical slot anyway, won't it? So maybe the errhint only need to say: Set \"wal_level\" >= \"replica\". ====== src/backend/replication/logical/logicalctl.c 7. + * This module enables dynamic control of logical decoding availability. + * Logical decoding becomes active under two conditions: when the wal_level + * parameter is set to 'logical', or when at least one logical replication + * slot exists with wal_level set to 'replica'. The system disables logical + * decoding when neither condition is met. Therefore, the dynamic control + * of logical decoding availability is required only when wal_level is set + * to 'replica'. With 'logical' WAL level, logical decoding is always enabled + * whereas with 'minimal' WAL level, it's always disabled. I thought it is more easy/consistent to refer always wal_level instead of sometimes referring to WAL level. SUGGESTION (for last sentence) Logical decoding is always enabled when wal_level='logical' and always disabled when wal_level='minimal'. ~~~ 8. + *------------------------------------------------------------------------- + */ +#include "postgres.h" Missing blank line before the #includes. ~~~ StartupLogicalDecodingStatus: 9. + if (last_status) + { + LogicalDecodingCtl->xlog_logical_info = true; + LogicalDecodingCtl->logical_decoding_enabled = true; + } Is the 'if' needed? It could have just said: LogicalDecodingCtl->xlog_logical_info = last_status; LogicalDecodingCtl->logical_decoding_enabled = last_status; Or, why not just delegate to the other function: UpdateLogicalDecodingStatus(last_status, false); ~~~ start_logical_decoding_status_change: 10. + /* + * Wait for the recovery to complete. Note that even the checkpointer + * can wait for the recovery to complete here without concerning + * deadlocks unless the startup process doesn't perform any action + * that waits for it after calling + * UpdateLogicalDecodingStatusEndOfRecovery(). + */ Is the wording correct? The "unless ... doesn't" seems backwards. ~~~ LogicalDecodingStatusChangeAllowed: 11. + /* + * We check the current status to see if we need to change it. If a status + * change is in-progress, we need to wait for completion. + */ + if (LogicalDecodingCtl->status_change_inprogress) Only the 2nd sentence of the comment seems relevant to this code. AFAICT the 1st sentence belongs to logic that comes later in this function. ~~~ RequestDisableLogicalDecoding: 12. +/* + * Initiate a request for disabling logical decoding. + * + * This function expects to be after dropping the possibly-last logical + * replication slot as it doesn't check the logical slot presence. + */ Typo? /expects to be after/expects to be called after/ ~~~ 13. + /* + * Check if the status change is allowed before initiating a deactivation + * request, to avoid unnecessary work. + */ + if (!LogicalDecodingStatusChangeAllowed()) + return; IMO call this the "disable request" (same as the function comment; same as the function name) instead of introducing new terminolgy like "deactivation request". ~~~ DisableLogicalDecodingIfNecessary: 14. + /* + * When disabling logical decoding, we need to disable logical decoding + * first and disable logical information WAL logging in order to ensure + * that no logical decoding processes WAL records with insufficient + * information. + */ wording? ~~~ 15. + /* + * Order all running processes to reflect the xlog_logical_info update. + * Unlike when enabling logical decoding, we don't need to wait for all + * processes to complete it in this case. We already disabled logical + * decoding and it's always safe to write logical information to WAL + * records, even when not strictly required. Therefore, we don't need to + * wait for all running transactions to finish either. + */ Order makes me think about sorting. Maybe "Tell" instead of "Order" is clearer. ~~~ UpdateLogicalDecodingStatusEndOfRecovery: 16. + /* + * We can use logical decoding if we're using 'logical' WAL level or there + * is at least one logical replication slot. + */ + if (wal_level == WAL_LEVEL_LOGICAL || CheckLogicalSlotExists()) + new_status = true; + + if (LogicalDecodingCtl->logical_decoding_enabled != new_status) + need_wal = true; 16a. The comment would be clearer to say similar to lots of other places: SUGGESTION We can use logical decoding if wal_level=logical, or wal_level=replica and there is at least one logical replication slot. ~ 16b. The conditions are not really needed. Maybe it is easier to write as below: new_status = (wal_level == WAL_LEVEL_LOGICAL || CheckLogicalSlotExists()); need_wal = (LogicalDecodingCtl->logical_decoding_enabled != new_status); ====== Kind Regards, Peter Smith. Fujitsu Australia
pgsql-hackers by date: