Re: where should I stick that backup? - Mailing list pgsql-hackers

From Stephen Frost
Subject Re: where should I stick that backup?
Date 2020-04-10 13:51:16
Msg-id 20200410135116.GW13712@tamriel.snowman.net
In response to Re: where should I stick that backup?  (Bruce Momjian <bruce@momjian.us>)
List pgsql-hackers
Greetings,

* Bruce Momjian (bruce@momjian.us) wrote:
> On Thu, Apr  9, 2020 at 04:15:07PM -0400, Stephen Frost wrote:
> > * Bruce Momjian (bruce@momjian.us) wrote:
> > > I think we need to step back and look at the larger issue.  The real
> > > argument goes back to the Unix command-line API vs the VMS/Windows API.
> > > The former has discrete parts that can be stitched together, while the
> > > VMS/Windows API presents a more duplicative but more holistic API for
> > > every piece.  We have discussed using shell commands for
> > > archive_command, and even more recently, for the server pass phrase.
> >
> > When it comes to something like the server pass phrase, it seems much
> > more reasonable to consider using a shell script (though still perhaps
> > not ideal) because it's not involved directly in ensuring that the data
> > is reliably stored and it's pretty clear that if it doesn't work the
> > worst thing that happens is that the database doesn't start up, but it
> > won't corrupt any data or destroy it or do other bad things.
>
> Well, the pass phrase relates to security, so it is important too.  I
> don't think the _importance_ of the action is the most determining
> issue.  Rather, I think it is how well the action fits the shell script
> API.

There isn't a single 'shell script API' though; you can craft a 'shell
script API' to fit nearly any use-case, but that doesn't make it a good
solution.  How much we depend on external code for the correct operation
of the system is relevant, and important to consider.

> > > To get more specific, I think we have to understand how the
> > > _requirements_ of the job match the shell script API, with stdin,
> > > stdout, stderr, return code, and command-line arguments.  Looking at
> > > archive_command, the command-line arguments allow specification of file
> > > names, but quoting can be complex.  The error return code and stderr
> > > output seem to work fine.  There is no clean API for fsync and testing
> > > if the file exists, so that all that has to be hand done in one
> > > command-line.  This is why many users use pre-written archive_command
> > > shell scripts.
> >
> > We aren't considering all of the use-cases really though, in specific,
> > things like pushing to s3 or gcs require, at least, good retry logic,
> > and that's without starting to think about things like high-rate systems
> > (spawning lots of new processes isn't free, particularly if they're
> > written in shell script but any interpreted language is expensive) and
> > wanting to parallelize.
>
> Good point, but if there are multiple APIs, it makes shell script
> flexibility even more useful.

This doesn't seem to answer the concerns that I brought up.

Trying to understand it did make me think of another relevant question
that was brought up in this discussion: can we really expect users to
actually implement a C library for this, if we provided a way for them
to?  For that, I'd point to FDWs, where we certainly have no shortage of
external solutions written in C.  Another example would be logical
decoding.

> > > This brings up a few questions:
> > >
> > > *  Should we have split apart archive_command into file-exists, copy,
> > > fsync-file?  Should we add that now?
> >
> > No..  The right approach to improving on archive command is to add a way
> > for an extension to take over that job, maybe with a complete background
> > worker of its own, or perhaps a shared library that can be loaded by the
> > archiver process, at least if we're talking about how to allow people to
> > extend it.
>
> That seems quite vague, which is the issue we had years ago when
> considering doing archive_command as a link to a C library.

That prior discussion isn't really relevant though, as it was before we
had extensions, and before we had background workers that can run as part
of an extension.
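
To make that a bit more concrete, here's a rough sketch of what I mean,
using the existing bgworker registration machinery.  The module name
"my_archiver" and the entry point my_archiver_main() are invented for the
example, and the actual archiving loop is the part that would still have
to be designed:

    /* my_archiver.c - illustrative sketch, loaded via shared_preload_libraries */
    #include "postgres.h"
    #include "fmgr.h"
    #include "postmaster/bgworker.h"
    #include "storage/ipc.h"

    PG_MODULE_MAGIC;

    void        _PG_init(void);
    PGDLLEXPORT void my_archiver_main(Datum main_arg);

    void
    _PG_init(void)
    {
        BackgroundWorker worker;

        memset(&worker, 0, sizeof(worker));
        worker.bgw_flags = BGWORKER_SHMEM_ACCESS;
        worker.bgw_start_time = BgWorkerStart_RecoveryFinished;
        worker.bgw_restart_time = 10;   /* seconds before restart on crash */
        snprintf(worker.bgw_library_name, BGW_MAXLEN, "my_archiver");
        snprintf(worker.bgw_function_name, BGW_MAXLEN, "my_archiver_main");
        snprintf(worker.bgw_name, BGW_MAXLEN, "custom WAL archiver");
        snprintf(worker.bgw_type, BGW_MAXLEN, "custom WAL archiver");

        RegisterBackgroundWorker(&worker);
    }

    void
    my_archiver_main(Datum main_arg)
    {
        /*
         * The interesting part goes here: watch pg_wal/archive_status for
         * .ready files, push the corresponding segments wherever they need
         * to go (local path, s3, gcs, ...), retry on failure, and mark the
         * segments .done on success - exactly the logic we don't want left
         * to ad-hoc shell scripts.
         */
        proc_exit(0);
    }

That's the sort of thing that simply wasn't possible back when
archive_command was designed.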

> > Potentially a better answer is to just build this stuff into PG- things
> > like "archive WAL to s3/GCS with these credentials" are what an awful
> > lot of users want.  There's then some who want "archive first to this
> > other server, and then archive to s3/GCS", or more complex options.
>
> Yes, we certainly know how to do a file system copy, but what about
> copying files to other things like S3?  I don't know how we would do
> that and allow users to change things like file paths or URLs.

There are a few different ways we could go about this.  The simple answer
would be to use GUCs, which would simplify things like dealing with the
restore side too.  Another option would be to have a concept of
'repository' objects in the system, not unlike tablespaces, but they'd
have more options.  To deal with that during recovery though, we'd need
a way to get the relevant information from the catalogs (maybe we write
the catalog out to a flat file on update, not unlike what we used to do
with pg_shadow), perhaps even in a format that users could modify if
they needed to.  The nice thing about having actual objects in the
system is that it'd be a bit cleaner to be able to define multiple ones
and then have SQL-level functions/commands that work with them.

A good deal of this does involve the question of how to deal with
recovery though, since you might want, or need, to use different options
there.  Back to the use-case that I was mentioning, you could certainly
want something like "try to get the WAL from the local archive, and if
that doesn't work, try to get it from the s3 repo".  What that implies
is that you'd really like a way to configure multiple repos, which is
where we start to see the fragility of our GUC system.  Pushing that out
to something external doesn't strike me as the right answer though;
rather, we should think about how to resolve these issues within the GUC
system, or come up with something better.  This isn't the only area
where the GUC system isn't really helping us- synchronous_standby_names
is getting to be a pretty complicated GUC, for example.

Of course, we could start out with supporting just a single repo with a
few new GUCs to configure it; that wouldn't be hard, and there are good
examples out there of what's needed to configure an s3 repo.
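
Just to illustrate the shape of it (the "s3_archive.*" names here are
invented, and an in-core version would add these to the built-in GUC
tables rather than defining custom variables), the settings might look
something like:

    /* illustrative GUCs for a single s3 repo */
    #include "postgres.h"
    #include "fmgr.h"
    #include "utils/guc.h"

    PG_MODULE_MAGIC;

    static char *s3_archive_bucket;
    static char *s3_archive_region;
    static char *s3_archive_endpoint;

    void        _PG_init(void);

    void
    _PG_init(void)
    {
        DefineCustomStringVariable("s3_archive.bucket",
                                   "Bucket that WAL segments are archived to.",
                                   NULL, &s3_archive_bucket, "",
                                   PGC_SIGHUP, 0, NULL, NULL, NULL);
        DefineCustomStringVariable("s3_archive.region",
                                   "Region the archive bucket lives in.",
                                   NULL, &s3_archive_region, "",
                                   PGC_SIGHUP, 0, NULL, NULL, NULL);
        DefineCustomStringVariable("s3_archive.endpoint",
                                   "Optional custom endpoint, for non-AWS implementations.",
                                   NULL, &s3_archive_endpoint, "",
                                   PGC_SIGHUP, 0, NULL, NULL, NULL);
    }

Credentials would presumably be handled separately, and multiple repos is
exactly where this flat shape starts to creak- you end up inventing
prefixed or numbered variants of every setting, which is part of why the
repository-object idea above is appealing.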

Thanks,

Stephen
