Re: Duplicate history file? - Mailing list pgsql-hackers

From Stephen Frost
Subject Re: Duplicate history file?
Date
Msg-id CAOuzzgpKc7-OuLOfOGFSZrAPgxAy5xkVjY63VYS8O86HPVWKmg@mail.gmail.com
In response to Re: Duplicate history file?  (Julien Rouhaud <rjuju123@gmail.com>)
List pgsql-hackers
Greetings,

On Tue, Jun 15, 2021 at 21:11 Julien Rouhaud <rjuju123@gmail.com> wrote:
> On Tue, Jun 15, 2021 at 02:28:04PM -0400, Stephen Frost wrote:
> >
> > * Julien Rouhaud (rjuju123@gmail.com) wrote:
> > > On Tue, Jun 15, 2021 at 11:33:10AM -0400, Stephen Frost wrote:
> > >
> > > The fact that this is such a complex problem is the very reason why we should
> > > spend a lot of energy documenting the various requirements.  Otherwise, how
> > > could anyone implement a valid program for that and how could anyone validate
> > > that a solution claiming to do its job actually does its job?
> >
> > Reading the code.
>
> Oh, if it's as simple as that then surely documenting the various requirements
> won't be an issue.

As I suggested previously, this is similar to the hooks that we provide. We don’t extensively document them because if you’re writing an extension which uses a hook, you’re going to be (or should be) reading the code too.

Consider that, really, an archive command should refuse to archive WAL on a timeline which doesn’t have a corresponding history file in the archive for that timeline (excluding timeline 1). Also, after start backup returns, a backup tool should compare the result of pg_start_backup to what’s in the control file, using a fresh read, to make sure that the storage is sane and not, say, caching pages independently (such as might happen with a separate NFS mount). Oh, and if a replica is involved, a check should be done to see if the replica has changed timelines, and if that happens an appropriate message should be thrown complaining that the backup was aborted due to the promotion of the replica...
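To make the first of those checks concrete, here is a minimal sketch in Python. It is not taken from pgbackrest or any other tool. It assumes the archive is a plain directory, and it relies on the usual naming scheme, in which a WAL segment name is 24 hex digits with the timeline ID in the first 8, and the history file for timeline N is named as that 8-digit hex ID plus `.history`. The function names (`timeline_of`, `history_file_name`, `may_archive`) are hypothetical, and the sketch deliberately handles only plain segment names, not `.partial`, `.backup`, or `.history` files.

```python
import os
import re

# A WAL segment name is 24 upper-case hex digits; the first 8 are the timeline ID.
WAL_SEGMENT_RE = re.compile(r"^[0-9A-F]{24}$")


def timeline_of(wal_name: str) -> int:
    """Return the timeline ID encoded in a WAL segment file name."""
    if not WAL_SEGMENT_RE.match(wal_name):
        raise ValueError(f"not a WAL segment name: {wal_name!r}")
    return int(wal_name[:8], 16)


def history_file_name(timeline: int) -> str:
    """Return the history file name for a timeline, e.g. 00000002.history."""
    return f"{timeline:08X}.history"


def may_archive(wal_name: str, archive_dir: str) -> bool:
    """Decide whether a WAL segment should be accepted into the archive.

    Timeline 1 has no history file, so it is always accepted. Any other
    timeline is accepted only if its history file is already present in
    archive_dir (assumed here to be a local directory).
    """
    timeline = timeline_of(wal_name)
    if timeline == 1:
        return True
    history_path = os.path.join(archive_dir, history_file_name(timeline))
    return os.path.isfile(history_path)
```

A wrapper used as an archive command would call `may_archive()` with the segment's file name and exit non-zero when it returns False, so that the server keeps the segment and retries later.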

To be clear, these aren’t checks that pgbackrest has today, and I’m not trying to make it out as if pgbackrest is the only solution and the only tool that “does everything and is correct”, because we aren’t there yet and I’m not sure we ever will be “all correct” or “done”.

These, however, are ones we have planned to add because of things we’ve seen and thought of, most of them in just the past few months.

Thanks,

Stephen
