Re: use pg_get_functiondef() in pg_dump - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: use pg_get_functiondef() in pg_dump |
Date | |
Msg-id | 1260763.1597770183@sss.pgh.pa.us |
In response to | Re: use pg_get_functiondef() in pg_dump (Stephen Frost <sfrost@snowman.net>) |
List | pgsql-hackers |
Stephen Frost <sfrost@snowman.net> writes:
> So.. the code that's in pg_dump today works to go from "whatever the
> connected server's version is" to "whatever the version is of the
> pg_dump command itself". If we had the code in libpgcommon, and
> functions in the backend to get at it along with psql having that code,
> you could then, using the code we have today, go from a bunch of
> 'source' versions to 'target' version of either the version of the psql
> command, or that of the server.

At this point, I think I need a high-power telescope even to see the
goalposts :-(

If we actually want to do something like this, we need a plan, not just
some handwaving. Let's start by enumerating the concerns that would have
to be solved. I can think of:

* Execution context. Stephen seems to be envisioning code that could be
compiled into the backend not just the frontend, but is that really worth
the trouble? Could we share such code across FE/BE at all (it'd certainly
be a far more ambitious exercise in common code than we've done to date)?
What's the backend version actually doing, issuing queries over SPI?
(I suppose if you were rigid about that, it could offer a guarantee that
the results match your snapshot, which is pretty attractive.)

* Global vs. per-object activity. pg_dump likes to query the entire state
of the database to start with, and then follow up by grabbing additional
details about objects it's going to dump. That's not an operating mode
that most other clients would want, but if for no other reason than
performance, I don't think we can walk away from it for pg_dump ---
indeed, I think pg_dump probably needs to be fixed to do less per-object
querying, not more. Meanwhile, applications such as psql \d would only
want to investigate one object at a time. What design can we create that
will handle that? If there is persistent state involved, what in the
world does that mean for the case of a backend-side library?

* Context in which the output is valid. Target server version was already
mentioned, but a quick examination of pg_dump output scripts will remind
you that there's a bunch more assumptions:

SET statement_timeout = 0;
SET lock_timeout = 0;
SET idle_in_transaction_session_timeout = 0;
SET client_encoding = 'en_US.utf8';
SET standard_conforming_strings = on;
SELECT pg_catalog.set_config('search_path', '', false);
SET check_function_bodies = false;
SET xmloption = content;
SET client_min_messages = warning;
SET row_security = off;

not to mention special hackery for object ownership and tablespaces.
Some of these things probably don't matter for other use-cases, but
others definitely do. In particular, I really doubt that psql and other
clients would find it acceptable to force search_path to a particular
thing. Which brings us to

* Security. How robust do the output commands need to be, and what will
we have to do that pg_dump doesn't need to?

* No doubt there are some other topics I didn't think of.

This certainly would be attractive if we had it, but the task seems
dauntingly large. It's not going to happen without some fairly serious
investment of time.

			regards, tom lane
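To make the "Execution context" item a bit more concrete, here is a minimal, hypothetical sketch (not from the thread) of what a backend-side helper that gathers its information by issuing a query over SPI might look like. The function name describe_function_spi, its SQL declaration, and the choice to delegate to pg_get_functiondef() are assumptions for illustration only; note that pg_get_functiondef() itself still reads the catalogs through the syscache, so a version that really guaranteed snapshot-consistent output would have to query pg_proc and friends directly rather than call the ruleutils functions.

```c
/*
 * Hypothetical sketch, not part of PostgreSQL: a backend-side helper that
 * fetches a function definition by running a query over SPI, roughly the
 * shape of "backend version" asked about under "Execution context".
 */
#include "postgres.h"

#include "executor/spi.h"
#include "fmgr.h"
#include "utils/builtins.h"

PG_MODULE_MAGIC;

PG_FUNCTION_INFO_V1(describe_function_spi);

Datum
describe_function_spi(PG_FUNCTION_ARGS)
{
    Oid         funcoid = PG_GETARG_OID(0);
    MemoryContext callercxt = CurrentMemoryContext;
    char       *result = NULL;
    char        query[128];
    int         ret;

    snprintf(query, sizeof(query),
             "SELECT pg_catalog.pg_get_functiondef(%u)", funcoid);

    if (SPI_connect() != SPI_OK_CONNECT)
        elog(ERROR, "SPI_connect failed");

    /* read_only = true: run the query under the transaction's active snapshot */
    ret = SPI_execute(query, true, 1);
    if (ret != SPI_OK_SELECT)
        elog(ERROR, "SPI_execute failed: error code %d", ret);

    if (SPI_processed > 0)
    {
        char       *val = SPI_getvalue(SPI_tuptable->vals[0],
                                       SPI_tuptable->tupdesc, 1);

        /* copy out of the SPI memory context before SPI_finish() frees it */
        if (val != NULL)
            result = MemoryContextStrdup(callercxt, val);
    }

    SPI_finish();

    if (result == NULL)
        PG_RETURN_NULL();
    PG_RETURN_TEXT_P(cstring_to_text(result));
}

/*
 * SQL declaration for the sketch above (module name is an assumption):
 *
 * CREATE FUNCTION describe_function_spi(oid) RETURNS text
 *     AS 'MODULE_PATHNAME', 'describe_function_spi'
 *     LANGUAGE C STRICT;
 */
```

The copy via MemoryContextStrdup() matters because anything palloc'd between SPI_connect() and SPI_finish() lives in the SPI procedure context, which SPI_finish() destroys.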