Re: pg_dump as a bunch of PostgreSQL functions - Mailing list pgsql-hackers
From | Mark Gibson |
---|---|
Subject | Re: pg_dump as a bunch of PostgreSQL functions |
Date | |
Msg-id | a5c4cb4bff23489c818762a380dbda3341499af9@cromwell.co.uk Whole thread Raw |
In response to | Re: pg_dump as a bunch of PostgreSQL functions (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-hackers |
Mark Gibson <gibsonm@cromwell.co.uk> writes: >>I have an idea, to break pg_dump into functions within PostgreSQL. Philip Warner wrote: > However, there are some complications because pg_dump is also the > upgrade tool; the backed can only know how to describe itself for the > current dialect of SQL accepted by PG. As we upgrade and improve the > SQL, and add features, pg_dump needs to talk to old backends and dump > prior versions in a format compatible with current (new) versions. This > means that for some purposes it will not be able to use backend > functions, or at least will have to have it's own mutant version of them. Tom Lane wrote: > This strikes me as largely useless, because the problems that are > actually hard for pg_dump are not at the level of individual objects; > they have to do with problems like determining a safe dump order and > altering definitions to break circularities in the dump requirements. > I don't think that extending the pg_get_xxx family of functions would > make pg_dump's life easier by any measurable amount. I was thinking really of very low-level dumping functions, that just dump simple single statements. Its mainly for development purposes, we generally develop db apps in three or four stages: 1, In a developer only db, each developer has a private schema to experiment in (prototypes, proof-of-concepts, alpha stage, etc). 2, Then its moved into its own schema on the same developer server for further stabilization (alpha/beta stage). 3, Once it reaches beta, it moves onto a mirror of the production server, for further testing by users (beta/pre-release stage). 4, For release, it's moved onto a production server for general consumption. So there is lots of dumping and reloading of schemas, the only way to do it is by using pg_dump and then hand editing the dump (or with sed/awk scripts etc.), which is a bit of a pain. I was thinking that if all the dumping functionality was available as PostgreSQL functions, then a simple PL/pgSQL function could be written to dump exactly what is required, and using dblink, fed directly from one db to another. Nice :) Dependency functions could be developed to determine the ideal dump order for higher-level dumping, and would also be useful for displaying dependency graphs in admin interfaces (phpPgAdmin, pgAdmin III, etc.). > There is also a fundamental problem with the current pg_get_xxx > functions, which is that they rely on backend-internal catalog > operations that generally use SnapshotNow semantics. This is not what > we want to guarantee that pg_dump dumps a consistent snapshot --- we > need to look at catalog rows that existed as of pg_dump's serializable > snapshot, instead. We have gotten away with it so far because pg_dump > starts by taking read locks on every table in sight, and that is > sufficient to block schema changes on the tables. But extending the > pg_get_xxx approach to non-table-related objects would be seriously > dangerous. (I think pg_get_viewdef is already broken, actually, > since you can't lock a view.) So, are pg_catalog tables subject to MVCC in the same way as user tables? ie. If I BEGIN a transaction, will the pg_catalog data remain consistent thoughout the transaction regardless of any DDL commands in a parallel session? Philip Warner wrote:> Perhaps it would be nice if, in each new version we created a library> that could be built againstold versions to provide the functions needed> by pg_dump to upgrade, and a similar library would form part of thenew> version as well. Kind of a 'pg_dump translation plugin'. This may be way> too expensive an option, when a few 'if'statements inside pg_dump will> achieve almost the same result. It would remove/reduce bloat in pg_dump> and make thefunctions available more generally, at the expense of> duplicating lots of code for each supported version. I was thinking of sharing a library (or just code base) between pg_dump and built-in functions, so that pg_dump doesn't have to rely on old built-in functions of a db you are trying to upgrade. -- Mark Gibson <gibsonm |AT| cromwell |DOT| co |DOT| uk> Web Developer & Database Admin Cromwell Tools Ltd. Leicester, England.
pgsql-hackers by date: