Re: where should I stick that backup? - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: where should I stick that backup? |
Date | |
Msg-id | CA+TgmobU1gqsEuoPeB-TKKiwM1sEqkFJttKYzfxt1rcM765RGw@mail.gmail.com |
In response to | Re: where should I stick that backup? (Stephen Frost <sfrost@snowman.net>) |
List | pgsql-hackers |
On Sun, Apr 12, 2020 at 9:18 PM Stephen Frost <sfrost@snowman.net> wrote:
> There's two different questions we're talking about here and I feel like
> they're being conflated. To try and clarify:
>
> - Could you implement FDWs with shell scripts, and custom programs? I'm
>   pretty confident that the answer is yes, but the thrust of that
>   argument is primarily to show that you *can* implement just about
>   anything using a shell script "API", so just saying it's possible to
>   do doesn't make it necessarily a good solution. The FDW system is
>   complicated, and also good, because we made it so and because it's
>   possible to do more sophisticated things with a C API, but it could
>   have started out with shell scripts that just returned data in much
>   the same way that COPY PROGRAM works today. What matters is that
>   forward thinking to consider what you're going to want to do tomorrow,
>   not just thinking about how you can solve for the simple cases today
>   with a shell out to an existing command.
>
> - Does providing a C-library interface deter people from implementing
>   solutions that use that interface? Perhaps it does, but it doesn't
>   have nearly the dampening effect that is being portrayed here, and we
>   can see that pretty clearly from the FDW situation. Sure, not all of
>   those are good solutions, but lots and lots of archive command shell
>   scripts are also pretty terrible, and there *are* a few good solutions
>   out there, including the ones that we ourselves ship. At least when
>   it comes to FDWs, there's an option there for us to ship a *good*
>   answer ourselves for certain (and, in particular, the very very
>   common) use-cases.
>
> > - We're only talking about writing a handful of tar files, and that's
> > in the context of a full-database backup, which is a much
> > heavier-weight operation than a query.
>
> This is true for -Ft, but not -Fp, and I don't think there's enough
> thought being put into this when it comes to parallelism and that you
> don't want to be limited to one process per tablespace.
>
> > - There is not really any state that needs to be maintained across calls.
>
> As mentioned elsewhere, this isn't really true.

These are fair points, and my thinking has been somewhat refined by
this discussion, so let me try to clarify my (current) position a bit.

I believe that there are two subtly different questions here.

Question #1 is "Would it be useful to people to be able to pipe the
tar files that they get from pg_basebackup into some other command
rather than writing them to the filesystem, and should we give them
the option to do so?"

Question #2 is "Is piping the tar files that pg_basebackup would
produce into some other program the best possible way of providing
more flexibility about where backups get written?"

I'm prepared to concede that the answer to question #2 is no. I had
earlier assumed that establishing connections was pretty fast and
that, even if not, there were solutions to that problem, like setting
up an SSH tunnel in advance. Several people have said, well, no,
establishing connections is a problem. As I acknowledged from the
beginning, plain format backups are a problem. So I think a convincing
argument has been made that a shell command won't meet everyone's
needs, and a more complex API is required for some cases.

But I still think the answer to question #1 is yes. I disagree
entirely with any argument to the effect that because some users might
do unsafe things with the option, we ought not to provide it.
Practically speaking, it would work fine for many people even with no
other changes, and if we add something like pgfile, which I'm willing
to do, it would work for more people in more situations. It is a
useful thing to have, full stop.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company