Re: recovery modules - Mailing list pgsql-hackers

From Andres Freund
Subject Re: recovery modules
Date
Msg-id 20221227224530.uh4t62hryidwxeva@awork3.anarazel.de
Whole thread Raw
In response to Re: recovery modules  (Nathan Bossart <nathandbossart@gmail.com>)
Responses Re: recovery modules  (Nathan Bossart <nathandbossart@gmail.com>)
List pgsql-hackers
Hi,

On 2022-12-27 14:37:11 -0800, Nathan Bossart wrote:
> On Tue, Dec 27, 2022 at 02:11:11PM -0800, Andres Freund wrote:
> > On 2022-12-27 11:24:49 -0800, Nathan Bossart wrote:
> >> I've attached a patch set that adds the restore_library,
> >> archive_cleanup_library, and recovery_end_library parameters to allow
> >> archive recovery via loadable modules.  This is a follow-up to the
> >> archive_library parameter added in v15 [0] [1].
> > 
> > Why do we need N parameters for this? To me it seems more sensible to have one
> > parameter that then allows a library to implement all these (potentially
> > optionally).
> 
> The main reason is flexibility.  Separate parameters allow using a library
> for one thing and a command for another, or different libraries for
> different things.  If that isn't a use-case we wish to support, I don't
> mind combining all three into a single recovery_library parameter.

I think the configuration complexity is a sufficient concern to not go that
direction...


> >> * Unlike archive modules, recovery libraries cannot be changed at runtime.
> >> There isn't a safe way to unload a library, and archive libraries work
> >> around this restriction by restarting the archiver process.  Since recovery
> >> libraries are loaded via the startup and checkpointer processes (which
> >> cannot be trivially restarted like the archiver), the same workaround is
> >> not feasible.
> > 
> > I don't think that's a convincing reason to not support configuration
> > changes. Sure, libraries cannot be unloaded, but an unnecessarily loaded
> > library is cheap. All that's needed is to redirect the relevant function
> > calls.
> 
> This might leave some stuff around (e.g., GUCs, background workers), but if
> that isn't a concern, I can adjust it to work as you describe.

You can still have a shutdown hook re background workers. I don't think the
GUCs matter, given that it's the startup/checkpointer processes.


> >> * pg_rewind uses restore_command, but there isn't a straightforward path to
> >> support restore_library.  I haven't addressed this in the attached patches,
> >> but perhaps this is a reason to allow specifying both restore_command and
> >> restore_library at the same time.  pg_rewind would use restore_command, and
> >> the server would use restore_library.
> > 
> > That seems problematic, leading to situations where one might not be able to
> > use restore_command anymore, because it's not feasible to do
> > segment-by-segment restoration.
> 
> I'm not following why this would make segment-by-segment restoration
> infeasible.  Would you mind elaborating?

Latency effects for example can make it infeasible to do segment-by-segment
restoration infeasible performance wise. On the most extreme end, imagine WAL
archived to tape or such.

Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Nathan Bossart
Date:
Subject: Re: recovery modules
Next
From: Nathan Bossart
Date:
Subject: Re: recovery modules