Re: where should I stick that backup? - Mailing list pgsql-hackers

From Andres Freund
Subject Re: where should I stick that backup?
Msg-id 20200413002750.sd3k2s3cwcgbvqam@alap3.anarazel.de
In response to Re: where should I stick that backup?  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: where should I stick that backup?  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Hi,

On 2020-04-12 20:02:50 -0400, Robert Haas wrote:
> On Sun, Apr 12, 2020 at 3:17 PM Andres Freund <andres@anarazel.de> wrote:
> > A huge advantage of a scheme like this would be that it wouldn't have to
> > be specific to pg_basebackup. It could just as well work directly on the
> > server, avoiding an unnecessary loop through the network. Which
> > e.g. could integrate with filesystem snapshots etc.  Without needing to
> > build the 'archive target' once with server libraries, and once with
> > client libraries.
> 
> That's quite appealing. One downside - IMHO significant - is that you
> have to have a separate process to do *anything*. If you want to add a
> filter that just logs everything it's asked to do, for example, you've
> gotta have a whole process for that, which likely adds a lot of
> overhead even if you can somehow avoid passing all the data through an
> extra set of pipes. The interface I proposed would allow you to inject
> very lightweight filters at very low cost. This design really doesn't.

Well, in what you described it'd still be all done inside pg_basebackup,
or did I misunderstand? Once you fetched it from the server, I can't
imagine the overhead of filtering it a bit differently would matter.

But even if it did, the "target" could just reply with "skip" or similar,
instead of providing an fd.
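
[Illustration, not part of any proposed patch: a minimal sketch of the
"reply with skip instead of an fd" idea, assuming the target is asked once
per file and answers either with a writable file descriptor or with a skip
marker. Every name here (SKIP, logging_target, file_target, send_file) is
hypothetical and does not correspond to an existing pg_basebackup or server
API. In-process callables stand in for the separate target process.]

```python
import os

# Hypothetical marker a target returns when it does not want the file data.
SKIP = "skip"


def logging_target(filename, log):
    """Hypothetical filter: record the request, ask for no data at all."""
    log.append(filename)
    return SKIP


def file_target(directory):
    """Hypothetical target: hand back an fd for a newly created file."""
    def target(filename):
        path = os.path.join(directory, filename)
        # O_EXCL mirrors the "create-exclusive" behaviour in the shell example.
        return os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    return target


def send_file(target, filename, data):
    """Ask the target what to do with one file; return bytes written."""
    reply = target(filename)
    if reply == SKIP:
        # The target declined, so no data is pushed through any pipe.
        return 0
    try:
        view = memoryview(data)
        written = 0
        while written < len(data):
            written += os.write(reply, view[written:])
        return written
    finally:
        os.close(reply)
```

With this shape, a filter that only logs file names costs one request and one
reply per file, and the file contents never pass through it.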

What kind of filtering are you thinking of where this is a problem?
Besides just logging the filenames?  I just can't imagine how that's a
relevant overhead compared to having to do things like
'shell ssh rhaas@depository pgfile create-exclusive - %f.lz4'


I really think we want the option to eventually do this server-side. And
I don't see it as viable to go for an API that allows specifying
shell fragments that are going to be executed server-side.


> Note that you could build this on top of what I proposed, but not the
> other way around.

Why should it not be possible the other way round?

Greetings,

Andres Freund


