Re: backup manifests - Mailing list pgsql-hackers

From Robert Haas
Subject Re: backup manifests
Msg-id CA+TgmobjkosrwPaEA6nxhC-JoZ4JGXSiQ+27Zkm3iMY0VnMwvQ@mail.gmail.com
In response to Re: backup manifests  (Stephen Frost <sfrost@snowman.net>)
Responses Re: backup manifests  (Stephen Frost <sfrost@snowman.net>)
List pgsql-hackers
On Fri, Jan 3, 2020 at 12:01 PM Stephen Frost <sfrost@snowman.net> wrote:
> You're certainly intending to do *something* with the manifest, and
> while I appreciate that you feel you've come up with a complete use-case
> that this simple manifest will be sufficient for, I frankly doubt
> that'll actually be the case.  Not long ago it wasn't completely clear
> that a manifest at *all* was even going to be necessary for the specific
> use-case you had in mind (I'll admit I wasn't 100% sure myself at the
> time either), but now that we're down the road of having one, I can't
> agree with the blanket assumption that we're never going to want to
> extend it, or even that it won't be necessary to add to it before this
> particular use-case is fully addressed.
>
> And the same goes for the other things that were discussed up-thread
> regarding memory context and error handling and such.

Well, I don't know how to make you happy here. It looks to me like
insisting on a JSON-format manifest will likely mean that this doesn't
get into PG13 or PG14 or probably PG15, because a port of all that
machinery to work in frontend code will be neither simple nor quick.
If you want this to happen for this release, you've got to be willing
to settle for something that can be implemented in the time we have.

I'm not sure whether what you and David are arguing boils down to
thinking that I'm wrong when I say that such a port is hard, or whether
you know it's hard but you just don't care because you'd rather see
the feature go nowhere than use a format other than JSON. I don't see
much difference between the latter position and a desire to block the
feature permanently. And if it's the former then you have yet to make
any suggestions for how to get it done with reasonable effort.

> I'm happy to outline the other things that one *might* want to include
> in a manifest, if that would be helpful, but I'll also say that I'm not
> planning to hack on adding that to pg_basebackup in the next month or
> two.  Once we've actually got a manifest, if it's in an extendable
> format, I could certainly see people wanting to do more with it though.

Well, as I say, it's got a version number, so somebody can always come
along with something better. I really think this is a red herring,
though. If somebody wants to track additional data about a backup,
there's no rule that they have to include it in the backup manifest. A
backup management solution might want to track things like who
initiated the backup, or for what purpose it was taken, or the IP
address of the machine where it was taken, or the backup system's own
identifier, but any of that stuff could (and probably should) be
stored in a file managed by that tool rather than in the server's own
manifest.  As to the per-file information, I believe that David and I
discussed that and the list of fields that I had seemed relatively OK,
and I believe I added at least one (mtime) per his suggestion. Of
course, it's a tab-separated file; more fields could easily be added
at the end, separated by tabs. Or, you could modify the file so that
after each "File" line you had another line with supplementary
information about that file, beginning with some other word. Or, you
could convert the whole file to JSON for v2 of the manifest, if,
contrary to my belief, that's a fairly simple thing to do. There are
probably other approaches as well. This file format has already had
considerably more thought about forward-compatibility than
pg_hba.conf, which has been retrofitted multiple times without
breaking the world.
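
To illustrate the extensibility argument, here is a minimal sketch of a
reader that tolerates both kinds of extension mentioned above: extra
tab-separated fields at the end of a "File" line, and additional line
types beginning with some other word. The keywords "Version" and
"FileExtra", the field order, and the function name are assumptions
made for illustration only; they are not the format used by the actual
patch.

```python
# Hypothetical field order for a "File" line; only mtime is named in the
# discussion, the rest are illustrative assumptions.
KNOWN_FIELDS = ("path", "size", "mtime", "checksum")


def parse_manifest(text):
    """Parse a tab-separated manifest, ignoring anything it does not know."""
    version = None
    files = []
    for line in text.splitlines():
        if not line:
            continue
        fields = line.split("\t")
        keyword, values = fields[0], fields[1:]
        if keyword == "Version":  # assumed keyword for the version line
            version = int(values[0])
        elif keyword == "File":
            if len(values) < len(KNOWN_FIELDS):
                raise ValueError("too few fields in File line: %r" % line)
            entry = dict(zip(KNOWN_FIELDS, values))
            # Fields appended by a newer writer are kept but not interpreted.
            entry["extra"] = values[len(KNOWN_FIELDS):]
            files.append(entry)
        # Any other leading word (e.g. a supplementary per-file line) is
        # skipped, so an older reader keeps working on a newer manifest.
    return version, files


sample = (
    "Version\t1\n"
    "File\tbase/1/1234\t8192\t2020-01-03 12:00:00\tabc123\tnewfield\n"
    "FileExtra\tbase/1/1234\tsomething-new\n"
)
version, files = parse_manifest(sample)
```

Under these assumptions, a reader written against the original field
list still parses a manifest that has gained a trailing field or a new
line type, which is the forward-compatibility property being claimed.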

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


