Re: where should I stick that backup? - Mailing list pgsql-hackers
From: Robert Haas
Subject: Re: where should I stick that backup?
Date:
Msg-id: CA+TgmoZFsXfwdbL39tO7x4GUeYFsMWw-5aMVysP6DFrybb9Ruw@mail.gmail.com
In response to: Re: where should I stick that backup? (Stephen Frost <sfrost@snowman.net>)
Responses: Re: where should I stick that backup?; Re: where should I stick that backup?
List: pgsql-hackers
On Fri, Apr 10, 2020 at 10:54 AM Stephen Frost <sfrost@snowman.net> wrote:
> So, this goes to what I was just mentioning to Bruce independently- you
> could have made the same argument about FDWs, but it just doesn't
> actually hold any water. Sure, some of the FDWs aren't great, but
> there's certainly no shortage of them, and the ones that are
> particularly important (like postgres_fdw) are well written and in core.

That's a fairly different use case. In the case of the FDW interface:

- The number of interface method calls is very high, at least one per
  tuple and a bunch of extra ones for each query.
- There is a significant amount of complex state that needs to be
  maintained across API calls.
- The return values are often tuples, which are themselves an in-memory
  data structure.

But here:

- We're only talking about writing a handful of tar files, and that's in
  the context of a full-database backup, which is a much heavier-weight
  operation than a query.
- There is not really any state that needs to be maintained across calls.
- The expected result is that a file gets written someplace, which is not
  an in-memory data structure but something that gets written to a place
  outside of PostgreSQL.

> The concerns about there being too many possibilities and new ones
> coming up all the time could be applied equally to FDWs, but rather than
> ending up with a dearth of options and external solutions there, what
> we've actually seen is an explosion of options and externally written
> libraries for a large variety of options.

Sure, but a lot of those FDWs are relatively low-quality, and it's often
hard to find one that does what you want. And even if you do, you don't
really know how good it is. Unfortunately, in that case there's no real
alternative, because implementing something based on shell commands
couldn't ever have reasonable performance or a halfway decent feature
set. That's not the case here.

> How does this solution give them a good way to do the right thing
> though? In a way that will work with large databases and complex
> requirements? The answer seems to be "well, everyone will have to write
> their own tool to do that" and that basically means that, at best, we're
> only providing half of a solution and expecting all of our users to
> provide the other half, and to always do it correctly and in a well
> written way. Acknowledging that most users aren't going to actually do
> that and instead they'll implement half measures that aren't reliable
> shouldn't be seen as an endorsement of this approach.

I don't acknowledge that. I think it's possible to use tools like the
proposed option in a perfectly reliable way, and I've already given a
bunch of examples of how it could be done. Writing a file is not such a
complex operation that every bit of code that writes one reliably has to
be written by someone associated with the PostgreSQL project. I strongly
suspect that people who use a cloud provider's tools to upload their
backup files will be quite happy with the results, and if they aren't, I
hope they will blame the cloud provider's tool for eating the data
rather than this option for making it easy to give the data to the thing
that ate it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
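As an illustration of the kind of reliable use described above, here is a
minimal sketch that pipes a base backup straight to a cloud provider's
upload tool. It assumes bash, a cluster with no additional tablespaces
(required for pg_basebackup's tar-to-stdout mode), and the AWS CLI on the
PATH; the bucket path is hypothetical.

    #!/usr/bin/env bash
    # Abort with a nonzero status if either pg_basebackup or the upload
    # tool fails.
    set -o errexit -o pipefail

    # Stream the base backup as a single tar to stdout and hand it
    # directly to the cloud provider's tool, which reads from stdin.
    pg_basebackup --format=tar --wal-method=fetch --checkpoint=fast -D - \
        | aws s3 cp - s3://example-bucket/pg/base.tar

    echo "backup uploaded"

With pipefail set, a failure anywhere in the pipeline surfaces as a
nonzero exit status, so whatever scheduler runs the script can notice the
problem instead of treating a truncated upload as a successful backup.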