Re: Weird failure with latches in curculio on v15 - Mailing list pgsql-hackers

From Nathan Bossart
Subject Re: Weird failure with latches in curculio on v15
Date
Msg-id 20230201175806.GA3199959@nathanxps13
Whole thread Raw
In response to Re: Weird failure with latches in curculio on v15  (Andres Freund <andres@anarazel.de>)
Responses Re: Weird failure with latches in curculio on v15
List pgsql-hackers
On Wed, Feb 01, 2023 at 08:58:01AM -0800, Andres Freund wrote:
> On 2023-02-01 10:12:26 -0500, Tom Lane wrote:
>> The fundamental issue is that we have no good way to break out
>> of system(), and I think the original idea was that
>> in_restore_command would be set *only* for the duration of the
>> system() call.  That's clearly been lost sight of completely,
>> but maybe as a stopgap we could try to get back to that.
> 
> We could push the functions setting in_restore_command down into
> ExecuteRecoveryCommand(). But I don't think that'd end up necessarily
> being right either - we'd now use the mechanism in places we previously
> didn't (cleanup/end commands).

Right, we'd only want to set it for restore_command.  I think that's
doable.

> And there's just plenty other stuff in the 14bdb3f13de 9a740f81eb0 that
> doesn't look right:
> - We now have two places open-coding what BuildRestoreCommand did

This was done because BuildRestoreCommand() had become a thin wrapper
around replace_percent_placeholders().  I can add it back if you don't
think this was the right decision.

> - I'm doubtful that the new shell_* functions are the base for a good
>   API to abstract restoring files

Why?

> - the error message for a failed restore command seems to have gotten
>   worse:
>   could not restore file \"%s\" from archive: %s"
>   ->
>   "%s \"%s\": %s", commandName, command

Okay, I'll work on improving this message.

> - shell_* imo is not a good namespace for something called from xlog.c,
>   xlogarchive.c. I realize the intention is that shell_archive.c is
>   going to be its own "restore module", but for now it imo looks odd

What do you propose instead?  FWIW this should go away with recovery
modules.  This is just an intermediate state to simplify those patches.

> - The comment moved out of RestoreArchivedFile() doesn't seems less
>   useful at its new location

Where do you think it should go?

> - explanation of why we use GetOldestRestartPoint() is halfway lost

Okay, I'll work on adding more context here.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: Show various offset arrays for heap WAL records
Next
From: Andres Freund
Date:
Subject: Re: Weird failure with latches in curculio on v15