Re: POC: enable logical decoding when wal_level = 'replica' without a server restart - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: POC: enable logical decoding when wal_level = 'replica' without a server restart |
Date | |
Msg-id | CAD21AoAA=zuiajwXgXSCYQWo=6oY-=CGLaEqvpfNUTVsLen+CA@mail.gmail.com Whole thread Raw |
In response to | Re: POC: enable logical decoding when wal_level = 'replica' without a server restart (Masahiko Sawada <sawada.mshk@gmail.com>) |
List | pgsql-hackers |
On Thu, Feb 20, 2025 at 10:05 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Wed, Feb 19, 2025 at 1:56 AM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > Hi, > > Thank you for looking at the patches. > > > > > On Mon, Feb 17, 2025 at 12:07:56PM -0800, Masahiko Sawada wrote: > > > On Fri, Feb 14, 2025 at 2:35 AM Bertrand Drouvot > > > <bertranddrouvot.pg@gmail.com> wrote: > > > > Yeap, that was exactly my point when I mentioned the custodian thread (taking > > > > into account Tom's comment quoted above). > > > > > > > > > > I've written PoC patches to have the online wal_level change work use > > > a more generic infrastructure. These patches are still in PoC state > > > but seem like a good direction to me. Here is a brief explanation for > > > each patch. > > > > Thanks for the patches! > > > > > * The 0001 patch introduces "reserved background worker slots". We > > > allocate max_process_workers + BGWORKER_CLASS_RESERVED at startup, and > > > if the number of running bgworker exceeds max_worker_processes, only > > > workers using the reserved slots can be launched. We can request to > > > use the reserved slots by adding BGWORKER_CLASS_RESERVED flag at > > > bgworker registration. > > > > I had a quick look at 0001 and I think the way that's implemented is reasonnable. > > I thought this could be defined through a GUC so that extensions can benefit > > from it. But OTOH the core code should ensure the value is > as the number of > > reserved slots needed by the core so not using a GUC looks ok to me. > > Interesting idea. I kept the reserved slots only for internal use but > it would be worth considering to use GUC instead. > > > > * The 0002 patch introduces "bgtask worker". The bgtask infrastructure > > > is designed to execute internal tasks in background in > > > one-worker-per-one-task style. Internally, bgtask workers use the > > > reserved bgworker so it's guaranteed that they can launch. > > > > Yeah. > > > > > The > > > internal tasks that we can request are predefined and this patch has a > > > dummy task as a placeholder. This patch implements only the minimal > > > functionality for the online wal_level change work. I've not tested if > > > this bgtask infrastructure can be used for tasks that we wanted to > > > offload to the custodian worker. > > > > Again, I had a quick look and looks simple enough of our need here. It "just" > > executes "(void) InternalBgTasks[type].func()" and then exists. That's, I think, > > a good starting point to add more tasks in the future (if we want to). > > Yeah, we might want to extend it further, for example to pass an > argument to the background task or to ask multiple tasks for the > single bgtask worker. As far as I can read the custodian patch set, > the work of removing temp files seems not to require any argument > though. > > > > > > * The 0003 patch makes wal_level a SIGHUP parameter. We do the online > > > wal_level change work using the bgtask infrastructure. There are no > > > major changes from the previous version other than that. > > > > It replaces the dummy task introduced in 0002 by the one that suits our needs > > here (through the new BgTaskWalLevelChange() function). > > > > The design looks reasonable to me. Waiting to see if others disagree before > > looking more closely at the code. > > Thanks. I would like to discuss behavioral and user interface considerations. Upon further analysis of this patch regarding the conversion of wal_level to a SIGHUP parameter, I find that supporting all combinations of wal_level value changes might make less sense. Specifically, changing to or from 'minimal' would necessitate a checkpoint, and reducing wal_level to 'minimal' would require terminating physical replication, WAL archiving, and online backups. While these operations demand careful consideration, there seems to be no compelling use case for decreasing to 'minimal'. Furthermore, increasing wal_level from 'minimal' is typically a one-time operation during a database's lifetime. Therefore, we should weigh the benefits against the implementation complexity. One solution is to manage the effective WAL level using two distinct GUC parameters: max_wal_level and wal_level. max_wal_level would be a POSTMASTER parameter controlling the system's maximum allowable WAL level, with values 'minimal', 'replica', and 'logical'. wal_level would function as a SIGHUP parameter managing the runtime WAL level, accepting values 'replica', 'logical', and 'auto'. The selected value must be either 'auto' or not exceed max_wal_level. When set to 'auto', wal_level automatically synchronizes with max_wal_level's value. This approach would enable online WAL level transitions between 'replica' and 'logical'. Regarding logical decoding on standbys, currently both primary and standby servers must have wal_level set to 'logical'. We need to determine the appropriate behavior when users decrease the WAL level from 'logical' to 'replica' through configuration file reload. One approach would be to invalidate all logical replication slots on the standby when transitioning to 'replica' WAL level. Although incoming WAL records from the primary would still be written at 'logical' level, making logical decoding technically feasible, this behavior seems logical as it reflects the user's intent to discontinue logical decoding on the standby. For consistency, we might need to invalidate logical slots during server startup if the WAL level is insufficient. Alternatively, we could permit logical decoding on the standby even with wal_level set to 'replica'. However, this would necessitate invalidating all logical replication slots during promotion, potentially extending downtime during failover. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
pgsql-hackers by date: