Re: backup manifests - Mailing list pgsql-hackers

From Robert Haas
Subject Re: backup manifests
Date
Msg-id CA+TgmoY+U30wCcxD6gLM9kFduMTGGwB1Ei06znLDqTqzcb+38A@mail.gmail.com
In response to Re: backup manifests  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: backup manifests  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: backup manifests  (David Fetter <david@fetter.org>)
List pgsql-hackers
On Wed, Jan 1, 2020 at 7:46 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> David Fetter <david@fetter.org> writes:
> > On Wed, Jan 01, 2020 at 01:43:40PM -0500, Robert Haas wrote:
> >> So, if someone can suggest to me how I could read JSON from a tool in
> >> src/bin without writing a lot of code, I'm all ears.
>
> > Maybe I'm missing something obvious, but wouldn't combining
> > pg_read_file() with a cast to JSONB fix this, as below?
>
> Only if you're prepared to restrict the use of the tool to superusers
> (or at least people with whatever privilege that function requires).
>
> Admittedly, you can probably feed the data to the backend without
> use of an intermediate file; but it still requires a working backend
> connection, which might be a bit of a leap for backup-related tools.
> I'm sure Robert was envisioning doing this processing inside the tool.

Yeah, exactly. I don't think verifying a backup should require a
running server, let alone a running server on the same machine where
the backup is stored and for which you have superuser privileges.
AFAICS, the only options to make that work with JSON are (1) introduce
a new hand-coded JSON parser designed for frontend operation, (2) add
a dependency on an external JSON parser that we can use from frontend
code, or (3) adapt the existing JSON parser used in the backend so
that it can also be used in the frontend.

I'd be willing to do (1) -- it wouldn't be the first time I've written
a JSON parser for PostgreSQL -- but I think it will take an order of
magnitude more code than using a file with tab-separated columns as
I've proposed, and I assume that there will be complaints about having
two JSON parsers in core. I'd also be willing to do (2) if that's the
consensus, but I'd vote against such an approach if somebody else
proposed it because (a) I'm not aware of a widely-available library
upon which we could depend and (b) introducing such a dependency for a
minor feature like this seems fairly unpalatable to me, and it'd
probably still be more code than just using a tab-separated file.  I'd
be willing to do (3) if somebody could explain to me how to solve the
problems with porting that code to work on the frontend side, but the
only suggestion so far as to how to do that is to port memory
contexts, elog/ereport, and presumably encoding handling to work on the
frontend side. That seems to me to be an unreasonably large lift,
especially given that we have lots of other files that use ad-hoc
formats already, and if somebody ever gets around to converting all of
those to JSON, they can certainly convert this one at the same time.
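
To give a rough sense of how little code a tab-separated format needs,
here is a minimal sketch in Python rather than frontend C. The column
layout (relative path, size in bytes, hex checksum) and the names
ManifestEntry, parse_manifest_line and parse_manifest are assumptions
made up for illustration; they are not the format or code from the
proposed patch.

```python
from typing import NamedTuple


class ManifestEntry(NamedTuple):
    # Hypothetical three-column layout, assumed for this sketch only.
    path: str
    size: int
    checksum: str


def parse_manifest_line(line: str) -> ManifestEntry:
    """Split one tab-separated line into its three fields."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 3:
        raise ValueError(f"expected 3 tab-separated fields, got {len(fields)}")
    path, size_text, checksum = fields
    # Accept only plain ASCII digits for the size column.
    if not (size_text.isascii() and size_text.isdigit()):
        raise ValueError(f"size is not a non-negative integer: {size_text!r}")
    return ManifestEntry(path, int(size_text), checksum)


def parse_manifest(text: str) -> list:
    """Parse every non-empty line of a manifest into ManifestEntry tuples."""
    return [parse_manifest_line(line) for line in text.splitlines() if line]


# Made-up sample data in the assumed layout.
sample = "base/1/1259\t8192\tdeadbeef\nglobal/pg_control\t8192\tcafef00d\n"
entries = parse_manifest(sample)
```

The sketch does no escaping, so a file name containing a tab or a
newline would break it; a real format would have to deal with that
somehow.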

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


