Re: POC: enable logical decoding when wal_level = 'replica' without a server restart - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: POC: enable logical decoding when wal_level = 'replica' without a server restart |
Date | |
Msg-id | CAD21AoAQJ0LuRYuj0g8-uB9Qtns88HK_TVdoa5jmX3ZPBK9gvw@mail.gmail.com Whole thread Raw |
In response to | Re: POC: enable logical decoding when wal_level = 'replica' without a server restart (Peter Smith <smithpb2250@gmail.com>) |
Responses |
Re: POC: enable logical decoding when wal_level = 'replica' without a server restart
|
List | pgsql-hackers |
On Wed, Oct 15, 2025 at 12:57 AM Peter Smith <smithpb2250@gmail.com> wrote: > > Hi Sawada-San, > > Here are some more review comments for v18. > > (WIP) > > ====== > src/backend/access/transam/xact.c > > MarkCurrentTransactionIdLoggedIfAny: > > 1. > - * This returns true if wal_level >= logical and we are inside a valid > - * subtransaction, for which the assignment was not yet written to any WAL > - * record. > + * This returns true if effective WAL level is logical and we are inside > + * a valid subtransaction, for which the assignment was not yet written to > + * any WAL record. > > Maybe it is better to always substitute 'wal_level' with > 'effective_wal_level' for consistency? > > ~~~ > > 2. > - /* wal_level has to be logical */ > + /* effective WAL level has to be logical */ > > ditto: maybe just say: "effective_wal_level has to be logical". > > ~~~ > > 3. > + * When effective WAL level is logical, guarantee that a subtransaction's > + * xid can only be seen in the WAL stream if its toplevel xid has been > + * logged before. If necessary we log an xact_assignment record with fewer > > ditto: maybe say "When effective_wal_level is logical..." Fixed. > > ====== > src/backend/access/transam/xlog.c > > 4. > +const char * > +show_effective_wal_level(void) > +{ > + char *str; > + > + if (wal_level == WAL_LEVEL_MINIMAL) > + return "minimal"; > + > + /* > + * During the recovery, we need to check the shared status instead of the > + * local XLogLogicalInfo cache as we don't synchronously update it during > + * the recovery. > + */ > + if (RecoveryInProgress()) > + return IsXLogLogicalInfoEnabled() ? "logical" : "replica"; > + > + if (wal_level == WAL_LEVEL_REPLICA) > + { > + /* > + * With wal_level='replica', XLogLogicalInfo indicates the actual WAL > + * level. > + */ > + if (IsXLogLogicalInfoEnabled()) > + str = "logical"; > + else > + str = "replica"; > + } > + else > + str = "logical"; > + > + return str; > +} > > It would be simpler and have more consistent ternary usage to have > direct returns in all places instead of just half or them. Then remove > the 'str' var: > > SUGGESTION > ... > if (wal_level == WAL_LEVEL_MINIMAL) > return "minimal"; > ... > if (RecoveryInProgress()) > return IsXLogLogicalInfoEnabled() ? "logical" : "replica"; > ... > if (wal_level == WAL_LEVEL_REPLICA) > return IsXLogLogicalInfoEnabled() ? "logical" : "replica"; > ... > return "logical"; Thank you for the suggestion! I reconsidered and improved the code flow of that function. > > ====== > src/backend/commands/publicationcmds.c > > 5. > - if (wal_level != WAL_LEVEL_LOGICAL) > + if (!IsLogicalDecodingEnabled()) > ereport(WARNING, > (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), > - errmsg("\"wal_level\" is insufficient to publish logical changes"), > - errhint("Set \"wal_level\" to \"logical\" before creating subscriptions."))); > + errmsg("logical decoding should be allowed to publish logical changes"), > + errhint("Before creating subscriptions, set \"wal_level\" >= > \"logical\" or create a logical replication slot when \"wal_level\" = > \"replica\"."))); > > Why not just say errmsg "\"effective_wal_level\" is insufficient to > publish logical changes" I think that saying "logical decoding should be allowed" sounds more understandable for users than saying "effective_wal_level is insufficient". > > ====== > src/backend/commands/tablecmds.c > > 6. > - /* should only get here if wal_level >= logical */ > + /* should only get here if effective WAL level is 'logical' */ > Assert(XLogLogicalInfoActive()); > ditto from earlier: say "should only get here if effective_wal_level is logical" Fixed. > > ====== > src/backend/postmaster/checkpointer.c > > 7. > + /* > + * Disable logical decoding if someone requested to disable logical > + * decoding. See comments atop logicalctl.c. > + */ > + DisableLogicalDecodingIfNecessary(); > + > > The comment seems a bit repetitive. > > SUGGESTION > Disable logical decoding if someone requested it. See comments atop > logicalctl.c. Fixed. > > ====== > src/backend/replication/logical/decode.c > > 8. > - errmsg("logical decoding on standby requires \"wal_level\" >= > \"logical\" on the primary"))); > + errmsg("logical decoding on standby requires \"effective_wal_level\" > >= \"logical\" on the primary"))); > > Is that >= really necessary? > > Why not say: "... requires \"effective_wal_level\" to be \"logical\" on the .." > > ====== > src/backend/replication/logical/logical.c > > 9. > + errmsg("logical decoding on standby requires \"effective_wal_level\" > >= \"logical\" on the primary"), > + errhint("Set \"wal_level\" >= \"logical\" or create at least one > logical slot when \"wal_level\" = \"replica\"."))); > > ditto above. Those >= seem to make it more complex than necessary. > > errmsg can be: "...requires \"effective_wal_level\" to be \"logical\" > on the ..." > errhint can be: "Set \"wal_level\" = \"logical\" or..." > > ====== > src/backend/replication/logical/slotsync.c > > 10. > ereport(elevel, > errcode(ERRCODE_INVALID_PARAMETER_VALUE), > - errmsg("replication slot synchronization requires \"wal_level\" >= > \"logical\"")); > + errmsg("replication slot synchronization requires > \"effective_wal_level\" >= \"logical\" on the primary"), > + errhint("To enable logical decoding on primary, set \"wal_level\" >= > \"logical\" or create at least one logical slot when \"wal_level\" = > \"replica\".")); > > ditto above. Those >= seem to make it more complex than necessary. > > errmsg can be: "...requires \"effective_wal_level\" to be \"logical\" > on the ..." > errhint can be: "...set \"wal_level\" = \"logical\" or..." It seems we're already using >= in many places even before the patch: src/backend/access/transam/xlogfuncs.c: errmsg("pg_log_standby_snapshot() can only be used if \"wal_level\" >= \"replica\""))); src/backend/postmaster/postmaster.c: (errmsg("replication slot synchronization (\"sync_replication_slots\" = on) requires \"wal_level\" >= \"logical\""))); src/backend/replication/logical/decode.c: errmsg("logical decoding on standby requires \"wal_level\" >= \"logical\" on the primary"))); src/backend/replication/logical/logical.c: errmsg("logical decoding requires \"wal_level\" >= \"logical\""))); src/backend/replication/logical/logical.c: errmsg("logical decoding on standby requires \"wal_level\" >= \"logical\" on the primary"))); src/backend/replication/logical/slotsync.c: errmsg("replication slot synchronization requires \"wal_level\" >= \"logical\"")); I agree that it makes the message more complex than necessary. I imagine that we used that style since we can avoid editing these messages when we introduce a higher level than 'logical', but I think it's likely to adjust the messages in that case anyway. So how about discussing this topic in another thread to simply improve the user-faced error messages? If we can agree on such changes, this patch can follow the new style. > > ====== > src/backend/replication/slot.c > > ReplicationSlotCleanup: > > 11. > + if (SlotIsLogical(s)) > + nlogicalslots++; > + > > How about assign SlotIsLogical(s) to a var like other places, instead > of calling it 2x? SlotIsLogical() is a macro so we don't necessarily need to avoid calling multiple times in terms of performance. Even for readability, the current style looks reasonable to me. > > ~~ > > ReportSlotInvalidation: > > 12. > case RS_INVAL_WAL_LEVEL: > - appendStringInfoString(&err_detail, _("Logical decoding on standby > requires \"wal_level\" >= \"logical\" on the primary server.")); > + appendStringInfoString(&err_detail, _("Logical decoding on standby > requires the primary server to have either \"wal_level\" >= > \"logical\" or at least one logical slot created.")); > > > 12a. > ditto previous comments about the >= Please refer to my comment to your similar comments above. > > ~ > > 12b. > This error message does not seem to be as good as previous ones. e.g. > I think it should be mentioning about" replica", because creating > slots with "minimal" is no good, right? Fixed. > > ~~~ > > InvalidateObsoleteReplicationSlots: > > 13. > - * - RS_INVAL_WAL_LEVEL: is logical and wal_level is insufficient > + * - RS_INVAL_WAL_LEVEL: is logical and logical decoding is disabled due to > + * insufficient WAL level or no logical slot presence. > > Maybe it is just easier to say: "...is logical and effective_wal_level > is not logical" > Agreed, fixed. > ====== > src/backend/storage/ipc/standby.c > > 14. > - if (wal_level < WAL_LEVEL_LOGICAL) > + if (!IsLogicalDecodingEnabled()) > LWLockRelease(ProcArrayLock); > > recptr = LogCurrentRunningXacts(running); > > /* Release lock if we kept it longer ... */ > - if (wal_level >= WAL_LEVEL_LOGICAL) > + if (IsLogicalDecodingEnabled()) > LWLockRelease(ProcArrayLock); > > ~ > > Would it be better to assign IsLogicalDecodingEnabled() to a var here > instead of 2x calls? Yeah, I think it should do that since it seems possible that logical decoding status can change between two calls, resulting in not releasing the ProcArrayLock. > ====== > src/backend/utils/cache/inval.c > > 15. > + * When effective WAL level is 'logical', write invalidations into WAL at > + * each command end to support the decoding of the in-progress transactions. > + * See CommandEndInvalidationMessages. > > ditto some previous review comment: Say "when effective_wal_level is > logical, write..." Fixed. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
pgsql-hackers by date: