Re: backup manifests - Mailing list pgsql-hackers

From Robert Haas
Subject Re: backup manifests
Msg-id CA+TgmoZRTBiPyvQEwV79PU1ePTtSEo2UeVncrkJMbn1sU1gnRA@mail.gmail.com
In response to Re: backup manifests  (Suraj Kharage <suraj.kharage@enterprisedb.com>)
Responses Re: backup manifests  (tushar <tushar.ahuja@enterprisedb.com>)
List pgsql-hackers
On Fri, Jan 3, 2020 at 6:11 PM Suraj Kharage
<suraj.kharage@enterprisedb.com> wrote:
> Thank you for review comments.

Here's a new patch set for this feature.

0001 adds checksum helper functions, similar to what Suraj had
incorporated into my original patch but split out into a separate
patch and with some different aesthetic decisions. I also decided to
support all of the SHA variants that PG knows about as options and
added a function to parse a checksum algorithm name, along the lines I
suggested previously.
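
To illustrate the idea of parsing a checksum algorithm name, here is a minimal sketch in Python rather than the C used by the patch. It is not the code from 0001: the function names and the accepted spellings ("sha224", "sha256", "sha384", "sha512") are assumptions chosen for illustration, and the names the patch accepts may differ.

```python
import hashlib

# Hypothetical name table: the SHA-2 variants, keyed by lower-case name.
# The spellings accepted by the actual patch may differ.
_ALGORITHMS = {
    "sha224": hashlib.sha224,
    "sha256": hashlib.sha256,
    "sha384": hashlib.sha384,
    "sha512": hashlib.sha512,
}


def parse_checksum_algorithm(name):
    """Return a hash constructor for name, or None if it is not recognized."""
    return _ALGORITHMS.get(name.strip().lower())


def checksum_bytes(name, data):
    """Return the hex digest of data under the named algorithm.

    Raises ValueError if the algorithm name is not recognized.
    """
    ctor = parse_checksum_algorithm(name)
    if ctor is None:
        raise ValueError(f"unrecognized checksum algorithm: {name!r}")
    return ctor(data).hexdigest()
```

Returning None for an unknown name, instead of raising inside the parser, leaves it to the caller to decide how to report the error.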

0002 teaches the server to generate a backup manifest using the format
I originally proposed. This is similar to the patch I posted
previously, but it spools the manifest to disk as it's being
generated, so that we don't run the server out of memory or fail when
hitting the 1GB allocation limit.
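
The spooling approach can be sketched roughly as follows, in Python for brevity. This is an illustration of the general technique, not the server code in 0002; the class name ManifestSpool and its methods are hypothetical.

```python
import tempfile


class ManifestSpool:
    """Append manifest lines to a temporary file instead of an in-memory buffer.

    Hypothetical helper: memory use stays bounded by the chunk size no
    matter how large the manifest grows.
    """

    def __init__(self):
        self._file = tempfile.TemporaryFile(mode="w+b")
        self.size = 0  # total bytes spooled so far

    def append(self, line):
        """Write one manifest line to the spool file."""
        data = line.encode("utf-8")
        self._file.write(data)
        self.size += len(data)

    def read_chunks(self, chunk_size=8192):
        """Yield the spooled manifest back in chunks of at most chunk_size bytes."""
        self._file.seek(0)
        while True:
            chunk = self._file.read(chunk_size)
            if not chunk:
                break
            yield chunk

    def close(self):
        """Close the spool; the temporary file is removed automatically."""
        self._file.close()
```

The producer calls append() once per entry while the backup runs, and the finished manifest is then streamed out with read_chunks(), so no single allocation ever has to hold the whole thing.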

0003 adds a new utility, pg_validatebackup, to validate a backup
against a manifest. Suraj tried to incorporate this into
pg_basebackup, which I initially thought might be OK but eventually
decided wasn't good, partly because this really wants to take some
command-line options entirely unrelated to the options accepted by
pg_basebackup. I tried to improve the error checking and the order in
which various things are done, too. This is basically a complete
rewrite as compared with Suraj's version.
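
As a rough illustration of what validating a backup against a manifest can involve, here is a small Python sketch. It does not reflect pg_validatebackup's actual checks, options, or manifest format: the manifest layout (a dict mapping a relative path to a (size, SHA-256 hex digest) pair) and the specific checks performed are assumptions made for the example.

```python
import hashlib
import os


def validate_backup(backup_dir, manifest):
    """Compare the files under backup_dir with manifest; return problem strings.

    manifest maps a relative path (with "/" separators) to a
    (size, sha256_hex) tuple. This layout is hypothetical.
    """
    problems = []
    seen = set()
    for root, _dirs, files in os.walk(backup_dir):
        for fname in files:
            full = os.path.join(root, fname)
            rel = os.path.relpath(full, backup_dir).replace(os.sep, "/")
            seen.add(rel)
            if rel not in manifest:
                problems.append(f"{rel}: present on disk but not in manifest")
                continue
            size, digest = manifest[rel]
            # Check the cheap property first; skip hashing if the size is wrong.
            if os.path.getsize(full) != size:
                problems.append(f"{rel}: size mismatch")
                continue
            h = hashlib.sha256()
            with open(full, "rb") as f:
                # Hash in fixed-size chunks so large files are never fully in memory.
                for chunk in iter(lambda: f.read(65536), b""):
                    h.update(chunk)
            if h.hexdigest() != digest:
                problems.append(f"{rel}: checksum mismatch")
    # Anything listed in the manifest but never encountered on disk is missing.
    for rel in sorted(set(manifest) - seen):
        problems.append(f"{rel}: in manifest but missing from disk")
    return problems
```

Collecting every problem into a list, rather than stopping at the first one, is a design choice of this sketch; a real tool might reasonably offer either behavior.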

0004 modifies the server to generate a backup manifest in JSON format
rather than my originally proposed format. This allows for some
comparison of the code doing it one way vs. the other. Assuming we
stick with JSON, I will squash this with 0002 at some point.

0005 is very much a work-in-progress, proof-of-concept patch that modifies
the backup validator to understand the JSON format. It doesn't
validate the manifest checksum at this point; it just prints it out.
The error handling needs work. It has other problems and bugs, too.
Although I'm still not very happy about the idea of using JSON here,
I'm pretty happy with the basic approach this patch takes. It
demonstrates that the JSON parser can be used for non-trivial things
in frontend code, and I'd say the code even looks reasonably clean -
with the exception of small details like being buggy and
under-commented.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
