Re: recovery modules - Mailing list pgsql-hackers

From Nathan Bossart
Subject Re: recovery modules
Date
Msg-id 20230128005910.GA2245287@nathanxps13
Whole thread Raw
In response to Re: recovery modules  (Andres Freund <andres@anarazel.de>)
Responses Re: recovery modules
List pgsql-hackers
On Fri, Jan 27, 2023 at 04:23:19PM -0800, Andres Freund wrote:
>> +typedef bool (*RecoveryRestoreCB) (const char *file, const char *path,
>> +                                   const char *lastRestartPointFileName);
>> +typedef void (*RecoveryArchiveCleanupCB) (const char *lastRestartPointFileName);
>> +typedef void (*RecoveryEndCB) (const char *lastRestartPointFileName);
>> +typedef void (*RecoveryShutdownCB) (void);
> 
> I think the signature of these forces bad coding practices, because there's no
> way to have state within a recovery module (as there's no parameter for it).
> 
> It's possible we would eventually support multiple modules, e.g. restoring
> from shorter term file based archiving and from longer term archiving in some
> blob store. Then we'll regret not having a varible for this.

Are you suggesting that we add a "void *arg" to each one of these?  Or put
the arguments into a struct?  Or something else?

>> +extern RecoveryModuleCallbacks RecoveryContext;
> 
> I think that'll typically be interpreteted as a MemoryContext by readers.

How about RecoveryCallbacks?

> Also, why is this a global var? Exported too?

It's needed in xlog.c, xlogarchive.c, and xlogrecovery.c.  Would you rather
it be static to xlogarchive.c and provide accessors for the others?

>> +/*
>> + * Type of the shared library symbol _PG_recovery_module_init that is looked up
>> + * when loading a recovery library.
>> + */
>> +typedef void (*RecoveryModuleInit) (RecoveryModuleCallbacks *cb);
> 
> I think this is a bad way to return callbacks. This way the
> RecoveryModuleCallbacks needs to be modifiable, which makes the job for the
> compiler harder (and isn't the greatest for security).
> 
> I strongly encourage you to follow the model used e.g. by tableam. The init
> function should return a pointer to a *constant* struct. Which is compile-time
> initialized with the function pointers.
> 
> See the bottom of heapam_handler.c for how that ends up looking.

Hm.  I used the existing strategy for archive modules and logical decoding
output plugins here.  I think it would be weird for the archive module and
recovery module interfaces to look so different, but if that's okay, I can
change it.

>> +void
>> +LoadRecoveryCallbacks(void)
>> +{
>> +    RecoveryModuleInit init;
>> +
>> +    /*
>> +     * If the shell command is enabled, use our special initialization
>> +     * function.  Otherwise, load the library and call its
>> +     * _PG_recovery_module_init().
>> +     */
>> +    if (restoreLibrary[0] == '\0')
>> +        init = shell_restore_init;
>> +    else
>> +        init = (RecoveryModuleInit)
>> +            load_external_function(restoreLibrary, "_PG_recovery_module_init",
>> +                                   false, NULL);
> 
> Why a special rule for shell, instead of just defaulting the GUC to it?

I'm not following this one.  The default value of the restore_library GUC
is an empty string, which means that the shell commands should be used.

>> +    /*
>> +     * If using shell commands, remove callbacks for any commands that are not
>> +     * set.
>> +     */
>> +    if (restoreLibrary[0] == '\0')
>> +    {
>> +        if (recoveryRestoreCommand[0] == '\0')
>> +            RecoveryContext.restore_cb = NULL;
>> +        if (archiveCleanupCommand[0] == '\0')
>> +            RecoveryContext.archive_cleanup_cb = NULL;
>> +        if (recoveryEndCommand[0] == '\0')
>> +            RecoveryContext.recovery_end_cb = NULL;
> 
> I'd just mandate that these are implemented and that the module has to handle
> if it doesn't need to do anything.

Wouldn't this just force module authors to write empty functions for the
parts they don't need?

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: Optimizing PostgreSQL with LLVM's PGO+LTO
Next
From: Andres Freund
Date:
Subject: Re: Small omission in type_sanity.sql