Thread: Schema-qualified statements in pg_dump output
There's a behavior in pg_dump that annoyed me a little bit the last few times I had to deal with it:

Consider you have to dump a specific namespace only; you are going to use

    pg_dump -n <your_schema> [-t <tables>]

I found it a common use case to restore this dump into a different schema by simply changing the search_path. With included ownerships this doesn't work, since pg_dump always outputs the necessary DDL as follows:

    ALTER TABLE bernd.foo OWNER TO bernd;

Okay, it isn't too hard to use sed to replace the necessary statements to use the correct schema, but I think it would be much nicer if pg_dump would omit the schema-qualified table name here. I'd like to create a patch for this, if we agree on changing this behavior?

--
Thanks

Bernd

Bernd Helmle <mailings@oopsware.de> writes:
> I found it a common use case to restore this dump into a different schema
> by simply changing the search_path. With included ownerships this doesn't
> work, since pg_dump always outputs the necessary DDL as follows:
> ALTER TABLE bernd.foo OWNER TO bernd;
> Okay, it isn't too hard to use sed to replace the necessary statements to
> use the correct schema, but I think it would be much nicer if pg_dump would
> omit the schema-qualified table name here. I'd like to create a patch for
> this, if we agree on changing this behavior?

It seems like quite a useless change, since in general there will be other qualified references in the dump that can't safely be removed. IOW what you intend to do doesn't work anyway.

regards, tom lane

--On Monday, July 07, 2008 10:33:35 -0400 Tom Lane <tgl@sss.pgh.pa.us> wrote:
> It seems like quite a useless change, since in general there will be
> other qualified references in the dump that can't safely be removed.
> IOW what you intend to do doesn't work anyway.

Hmm, if I want to restore just a bunch of tables into a different schema, all I need to do is to change the dump's search_path, and then I don't have to bother with DDL statements having hardwired schema qualifications. So this seems a straightforward simplification to me.

--
Thanks

Bernd

Bernd Helmle <mailings@oopsware.de> writes:
> --On Monday, July 07, 2008 10:33:35 -0400 Tom Lane <tgl@sss.pgh.pa.us>
> wrote:
>> It seems like quite a useless change, since in general there will be
>> other qualified references in the dump that can't safely be removed.
>> IOW what you intend to do doesn't work anyway.
> Hmm, if I want to restore just a bunch of tables into a different schema,
> all I need to do is to change the dump's search_path then

You apparently aren't getting my point: no, that won't work. You have to be prepared to search-and-replace other references to the schema. The fact that the example you're currently looking at only has such occurrences in ALTER OWNER commands doesn't mean that that's the only place it can happen.

regards, tom lane

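To make the point about other qualified references concrete, here is a minimal Python sketch that lists the lines of a dump where the schema name still appears as a qualifier. Everything in it is an assumption made for illustration: the dump excerpt, the function foo_count, and the helper qualified_references are invented, and real pg_dump output will look different. It relies only on the fact that a function body is stored as text, so a hard-coded bernd.foo inside it survives in the dump no matter what the search_path is.

```python
import re

# Hypothetical excerpt of a plain-text dump of schema "bernd".
# Object names and statements are invented for illustration only.
SAMPLE_DUMP = """\
SET search_path = bernd, pg_catalog;
CREATE TABLE foo (id integer NOT NULL);
CREATE FUNCTION foo_count() RETURNS bigint
    AS 'SELECT count(*) FROM bernd.foo'
    LANGUAGE sql;
ALTER TABLE bernd.foo OWNER TO bernd;
"""


def qualified_references(dump_text, schema):
    """Return (line_number, line) pairs where `schema.` qualifies a name."""
    pattern = re.compile(r"\b" + re.escape(schema) + r"\.\w+")
    return [
        (number, line)
        for number, line in enumerate(dump_text.splitlines(), 1)
        if pattern.search(line)
    ]


hits = qualified_references(SAMPLE_DUMP, "bernd")
for number, line in hits:
    print(number, line.strip())
```

In this made-up excerpt the ALTER TABLE ... OWNER line is only one of the hits; the function body is another, and dropping the qualification from the owner statement alone would leave it pointing at the old schema.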
Bernd Helmle wrote:
> --On Monday, July 07, 2008 10:33:35 -0400 Tom Lane <tgl@sss.pgh.pa.us>
> wrote:
>
>> It seems like quite a useless change, since in general there will be
>> other qualified references in the dump that can't safely be removed.
>> IOW what you intend to do doesn't work anyway.
>
> Hmm, if I want to restore just a bunch of tables into a different
> schema, all I need to do is to change the dump's search_path then and
> I don't have to bother with DDL statements having hardwired schema
> qualifications. So this seems a straightforward simplification to me.
>

Why not restore into the original schema name and then rename it? If the schema already exists you could rename it temporarily and then rename it back after the restore. Or, as you originally noted, a simple sed filter on the text dump might work equally well.

I don't think in general we need to provide pg_dump with every possible permutation of uses that can be achieved with the construction of simple tool chains.

cheers

andrew

--On Monday, July 07, 2008 11:09:56 -0400 Andrew Dunstan <andrew@dunslane.net> wrote:
> I don't think in general we need to provide pg_dump with every possible
> permutation of uses that can be achieved with the construction of simple
> tool chains.

I always feel the same. However, I thought it would be better to ask whether this might be useful or not before forgetting about it altogether.

--
Thanks

Bernd

On Mon, 2008-07-07 at 15:46 +0200, Bernd Helmle wrote:
> There's a behavior in pg_dump that annoyed me a little bit, the last few
> times I had to deal with it:
>
> Consider you have to dump a specific namespace only, you are going to use
>
> pg_dump -n <your_schema> [-t <tables>].
>
> I found it a common use case to restore this dump into a different schema
> by simply changing the search_path. With included ownerships this doesn't
> work, since pg_dump always outputs the necessary DDL as follows:
>
> ALTER TABLE bernd.foo OWNER TO bernd;
>
> Okay, it isn't too hard to use sed to replace the necessary statements to
> use the correct schema, but I think it would be much nicer if pg_dump would
> omit the schema-qualified table name here. I'd like to create a patch for
> this, if we agree on changing this behavior?

The use case you mention is something that would be of value to many people, and I support your efforts to add a new option for this.

No useful workarounds exist without flaws: i) editing with sed might well end up editing character data in the table(s) at the same time and you may never even notice. ii) reloading to the same schema (renaming etc) is not acceptable if the target has a production schema of that name already. iii) manually editing a large file is problematic.

Tom has posted comments saying that you need to look at all of the places the schema name is used to see what we will and will not need to change. It's more than just altering the owner, but that doesn't mean we don't want it or that it's impossible.

Please pursue this further.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

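Workaround i) can be illustrated with a small Python sketch of a blind textual substitution, which is roughly what a sed one-liner over the dump does. The dump excerpt, the target schema name archive, and the helper naive_rename are assumptions made up for this example, not anything pg_dump or sed provides.

```python
# Hypothetical dump excerpt: one DDL statement plus a COPY data row whose
# text happens to contain "bernd." as ordinary character data.
SAMPLE_DUMP = "\n".join([
    "ALTER TABLE bernd.foo OWNER TO bernd;",
    "COPY foo (id, note) FROM stdin;",
    "1\tmail sent to bernd.helmle about the release",
    "\\.",
])


def naive_rename(dump_text, old_schema, new_schema):
    """Replace every 'old_schema.' with 'new_schema.', with no SQL awareness."""
    return dump_text.replace(old_schema + ".", new_schema + ".")


renamed = naive_rename(SAMPLE_DUMP, "bernd", "archive")
print(renamed)
```

The DDL line is rewritten as intended, but the data row is changed as well, and nothing in the restore would flag it.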
At 8:34 AM +0100 7/11/08, Simon Riggs wrote:
>On Mon, 2008-07-07 at 15:46 +0200, Bernd Helmle wrote:
>> There's a behavior in pg_dump that annoyed me a little bit, the last few
>> times I had to deal with it:
>>
>> Consider you have to dump a specific namespace only, you are going to use
>>
>> pg_dump -n <your_schema> [-t <tables>].
>>
>> I found it a common use case to restore this dump into a different schema
>> by simply changing the search_path. With included ownerships this doesn't
>> work, since pg_dump always outputs the necessary DDL as follows:
>>
>> ALTER TABLE bernd.foo OWNER TO bernd;
>>
>> Okay, it isn't too hard to use sed to replace the necessary statements to
>> use the correct schema, but I think it would be much nicer if pg_dump would
>> omit the schema-qualified table name here. I'd like to create a patch for
>> this, if we agree on changing this behavior?
>
>The use case you mention is something that would be of value to many
>people, and I support your efforts to add a new option for this.
>
>No useful workarounds exist without flaws: i) editing with sed might
>well end up editing character data in the table(s) at the same time and
>you may never even notice. ii) reloading to the same schema (renaming
>etc) is not acceptable if the target has a production schema of that
>name already. iii) manually editing a large file is problematic.
>
>Tom's posted comments that you need to look at all of the places the
>schemaname is used to see what we will need/not need to change. It's
>more than just altering the owner, but that doesn't mean we don't want
>it or it's impossible.
>
>Please pursue this further.

I've been looking into this matter, although I'm a noob apropos PostgreSQL hacking. What I thought was a better way was to alter pg_dump to accept a flag -m <masquerade_name>. It would require the -n <schema_name> option or fail. It would generate a schema dump where all the references to <schema_name> were replaced by <masquerade_name>.
This would allow you to easily make a copy of a schema into a new schema. My needs are that my production database is the "public" schema, and each year I want to archive "fy2007", "fy2008", etc. schemas which have the final information for those years. So at the end of this year, I want to duplicate the "public" schema into the "fy2008" schema, and continue with "public."

I could do the pg_dump of "public", rename "public" to "fy2008" and then restore "public," but this requires being without "public" for a short interval. It would be better for me to simply run:

    pg_dump -c -n public -m fy2008 database | psql database

And that would give you a completely mechanical way to duplicate a schema, which means I could put it in a script that users could call.

From what I've seen, it would mean finding where the schema is currently accessed in the code, then substituting on the -m flag.

Having already done this by manually editing the files, I can say it really cries out for a better procedure.

Perhaps my solution is excessive compared to the other offered solution, but it would have the benefit that the user would know precisely what he was doing by the flag setting.

-Owen

On Mon, 2008-07-21 at 23:53 -0400, Owen Hartnett wrote:
> It would generate a schema dump where all the references to
> <schema_name> were replaced by <masquerade_name>.

Good idea, can I tweak that a bit?

No need to specify the name at pg_dump time.

For text files, just use an option to specify whether we change the actual schema name and replace it with the text :PGDUMPSCHEMA.

    pg_dump --relocateable-schema (or alternate option name)

Then when we reload, we just run

    psql -f pgdump.file -v PGDUMPSCHEMA=newlocation

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

On Tue, 2008-07-22 at 16:58 +0100, Simon Riggs wrote:
> On Mon, 2008-07-21 at 23:53 -0400, Owen Hartnett wrote:
> No need to specify the name at pg_dump time.
>
> For text files, just use an option to specify whether we change the
> actual schema name and replace it with the text :PGDUMPSCHEMA.
>
> pg_dump --relocateable-schema (or alternate option name)
>
> Then when we reload, we just run
>
> psql -f pgdump.file -v PGDUMPSCHEMA=newlocation

I like the idea but would prefer no shell variable (I assume that is what you are using above). Why not just -X target-schema=newlocation or something like that?

Joshua D. Drake

--
The PostgreSQL Company since 1997: http://www.commandprompt.com/
PostgreSQL Community Conference: http://www.postgresqlconference.org/
United States PostgreSQL Association: http://www.postgresql.us/
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate

Simon Riggs <simon@2ndquadrant.com> writes:
> No need to specify the name at pg_dump time.
> For text files, just use an option to specify whether we change the
> actual schema name and replace it with the text :PGDUMPSCHEMA.

pg_restore is in an even worse position than pg_dump to make this happen; it would not be able to do anything that's smarter than a sed-like substitution. I doubt that the original idea can be made to work, but this "improvement" will entirely guarantee failure. (Note: the problem is not so much with the names of the objects you're directly creating, as with object cross-references that are embedded in the DDL.)

regards, tom lane

On Tue, 2008-07-22 at 13:35 -0400, Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > No need to specify the name at pg_dump time.
> > For text files, just use an option to specify whether we change the
> > actual schema name and replace it with the text :PGDUMPSCHEMA.
>
> pg_restore is in even worse position than pg_dump to make this happen;
> it would not be able to do anything that's smarter than a sed-like
> substitution.

Somebody just needs to check carefully to see what will work.

I accept there is no easy option that is materially better than sed. I've screwed up a dump with sed before and was lucky to notice. I'm not playing Russian Roulette again.

On reflection, the chance of the schema name being stored somewhere in the database seems high.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support