Thread: Schema-qualified statements in pg_dump output

Schema-qualified statements in pg_dump output

From
Bernd Helmle
Date:
There's a behavior in pg_dump that annoyed me a little bit the last few
times I had to deal with it:

Consider that you have to dump a specific namespace only; you are going to use

pg_dump -n <your_schema> [-t <tables>].

I found it a common use case to restore this dump into a different schema 
by simply changing the search_path. With included ownerships this doesn't 
work, since pg_dump always outputs the necessary DDL as follows:

ALTER TABLE bernd.foo OWNER TO bernd;

Okay, it isn't too hard to use sed to replace the necessary statements to
use the correct schema, but I think it would be much nicer if pg_dump would
omit the schema-qualified table name here. I'd like to create a patch for
this, if we agree on changing this behavior?

--  Thanks
                   Bernd
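
For illustration, the sed workaround mentioned above might look like the sketch below. The dump fragment is hand-written rather than real pg_dump output, and the target schema name "archive" is an assumption made for this example.

```shell
#!/bin/sh
# Hypothetical, hand-written dump fragment (not real pg_dump output).
dump='SET search_path = bernd, pg_catalog;
CREATE TABLE foo (id integer);
ALTER TABLE bernd.foo OWNER TO bernd;'

# Point search_path at the assumed target schema "archive" and drop the
# schema qualification from ALTER TABLE statements only. The role name
# after OWNER TO is left untouched.
rewritten=$(printf '%s\n' "$dump" |
  sed -e 's/^SET search_path = bernd,/SET search_path = archive,/' \
      -e 's/^\(ALTER TABLE \)bernd\./\1/')

printf '%s\n' "$rewritten"
```

This covers only the single case from the example; any other place the schema name occurs would need its own rule.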


Re: Schema-qualified statements in pg_dump output

From
Tom Lane
Date:
Bernd Helmle <mailings@oopsware.de> writes:
> I found it a common use case to restore this dump into a different schema 
> by simply changing the search_path. With included ownerships this doesn't 
> work, since pg_dump always outputs the necessary DDL as follows:

> ALTER TABLE bernd.foo OWNER TO bernd;

> Okay, it isn't too hard to use sed to replace the necessary statements to
> use the correct schema, but I think it would be much nicer if pg_dump would
> omit the schema-qualified table name here. I'd like to create a patch for
> this, if we agree on changing this behavior?

It seems like quite a useless change, since in general there will be
other qualified references in the dump that can't safely be removed.
IOW what you intend to do doesn't work anyway.
        regards, tom lane


Re: Schema-qualified statements in pg_dump output

From
Bernd Helmle
Date:
--On Monday, July 07, 2008 10:33:35 -0400 Tom Lane <tgl@sss.pgh.pa.us>
wrote:

> It seems like quite a useless change, since in general there will be
> other qualified references in the dump that can't safely be removed.
> IOW what you intend to do doesn't work anyway.

Hmm, if I want to restore just a bunch of tables into a different schema,
all I need to do is to change the dump's search_path then and I don't have
to bother with DDL statements having hardwired schema qualifications. So
this seems a straightforward simplification to me.

--  Thanks
                   Bernd


Re: Schema-qualified statements in pg_dump output

From
Tom Lane
Date:
Bernd Helmle <mailings@oopsware.de> writes:
> --On Monday, July 07, 2008 10:33:35 -0400 Tom Lane <tgl@sss.pgh.pa.us>
> wrote:
>> It seems like quite a useless change, since in general there will be
>> other qualified references in the dump that can't safely be removed.
>> IOW what you intend to do doesn't work anyway.

> Hmm, if I want to restore just a bunch of tables into a different schema,
> all I need to do is to change the dump's search_path then

You apparently aren't getting my point: no, that won't work.  You have
to be prepared to search-and-replace other references to the schema.
The fact that the example you're currently looking at only has such
occurrences in ALTER OWNER commands doesn't mean that that's the only
place it can happen.
        regards, tom lane
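
To make the point concrete, here is a small sketch. The dump fragment is hand-written for illustration and is not real pg_dump output; which qualified references a real dump contains depends on the objects involved. The sketch sets the ALTER ... OWNER statements aside and counts the lines that still mention the schema name.

```shell
#!/bin/sh
# Hypothetical, hand-written dump fragment (not real pg_dump output).
dump=$(cat <<'EOF'
CREATE TABLE foo (id integer, bar_id integer);
ALTER TABLE bernd.foo OWNER TO bernd;
ALTER TABLE ONLY foo
    ADD CONSTRAINT foo_bar_fk FOREIGN KEY (bar_id) REFERENCES bernd.bar(id);
CREATE FUNCTION count_foo() RETURNS bigint
    AS 'SELECT count(*) FROM bernd.foo' LANGUAGE sql;
EOF
)

# Ignore the ownership statements, then list what still names the schema.
leftover=$(printf '%s\n' "$dump" | grep -v ' OWNER TO ' | grep 'bernd\.')
count=$(printf '%s\n' "$leftover" | grep -c 'bernd\.')

printf '%s\n' "$leftover"
echo "qualified references outside ALTER ... OWNER: $count"
```

In this fragment the foreign key and the function body both keep the old schema name, so stripping the qualification from the ownership statements alone would not make the dump relocatable.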


Re: Schema-qualified statements in pg_dump output

From
Andrew Dunstan
Date:

Bernd Helmle wrote:
> --On Monday, July 07, 2008 10:33:35 -0400 Tom Lane <tgl@sss.pgh.pa.us>
> wrote:
>
>> It seems like quite a useless change, since in general there will be
>> other qualified references in the dump that can't safely be removed.
>> IOW what you intend to do doesn't work anyway.
>
> Hmm, if I want to restore just a bunch of tables into a different
> schema, all I need to do is to change the dump's search_path then and
> I don't have to bother with DDL statements having hardwired schema
> qualifications. So this seems a straightforward simplification to me.
>

Why not restore into the original schema name and then rename it? If the 
schema already exists you could rename it temporarily and then rename it 
back after the restore.

Or, as you originally noted, a simple sed filter on the text dump might
work equally well.

I don't think in general we need to provide pg_dump with every possible
permutation of uses that can be achieved with the construction of simple
tool chains.

cheers

andrew


Re: Schema-qualified statements in pg_dump output

From
Bernd Helmle
Date:
--On Monday, July 07, 2008 11:09:56 -0400 Andrew Dunstan
<andrew@dunslane.net> wrote:

> I don't think in general we need to provide pg_dump with every possible
> permutation of uses that can be achieved with the construction of simple
> tool chains.

I generally feel the same. However, I thought it would be better to ask whether
this might be useful or not before forgetting about it entirely.

--  Thanks
                   Bernd


Re: Schema-qualified statements in pg_dump output

From
Simon Riggs
Date:
On Mon, 2008-07-07 at 15:46 +0200, Bernd Helmle wrote:
> There's a behavior in pg_dump that annoyed me a little bit the last few
> times I had to deal with it:
>
> Consider that you have to dump a specific namespace only; you are going to use
> 
> pg_dump -n <your_schema> [-t <tables>].
> 
> I found it a common use case to restore this dump into a different schema 
> by simply changing the search_path. With included ownerships this doesn't 
> work, since pg_dump always outputs the necessary DDL as follows:
> 
> ALTER TABLE bernd.foo OWNER TO bernd;
> 
> Okay, it isn't too hard to use sed to replace the necessary statements to
> use the correct schema, but I think it would be much nicer if pg_dump would
> omit the schema-qualified table name here. I'd like to create a patch for
> this, if we agree on changing this behavior?

The use case you mention is something that would be of value to many
people, and I support your efforts to add a new option for this.

No useful workarounds exist without flaws: i) editing with sed might
well end up editing character data in the table(s) at the same time and
you may never even notice. ii) reloading to the same schema (renaming
etc) is not acceptable if the target has a production schema of that
name already. iii) manually editing a large file is problematic.
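
Point i) can be reproduced with a small sketch. The dump fragment below is hand-written (not real pg_dump output) and the target schema name "archive" is assumed for the example; the naive global substitution rewrites the row data along with the DDL.

```shell
#!/bin/sh
# Hypothetical, hand-written dump fragment (not real pg_dump output).
# The line between COPY and "\." is table data, not DDL.
dump=$(cat <<'EOF'
ALTER TABLE bernd.foo OWNER TO bernd;
COPY foo (note) FROM stdin;
mail from bernd.helmle about foo
\.
EOF
)

# Naive global replacement of the schema prefix.
rewritten=$(printf '%s\n' "$dump" | sed 's/bernd\./archive./g')

printf '%s\n' "$rewritten"

# Count data lines that were silently altered by the substitution.
data_hits=$(printf '%s\n' "$rewritten" | grep -c 'archive\.helmle')
echo "data lines changed: $data_hits"
```

The DDL line is rewritten as intended, but the stored text "bernd.helmle" becomes "archive.helmle" as well, and nothing in the restore would flag it.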

Tom has posted comments saying that you need to look at all of the places
the schema name is used to see what we will or will not need to change.
It's more than just altering the owner, but that doesn't mean we don't
want it or that it's impossible.

Please pursue this further.

-- Simon Riggs           www.2ndQuadrant.com
   PostgreSQL Training, Services and Support



Re: Schema-qualified statements in pg_dump output

From
Owen Hartnett
Date:
At 8:34 AM +0100 7/11/08, Simon Riggs wrote:
>On Mon, 2008-07-07 at 15:46 +0200, Bernd Helmle wrote:
>>  There's a behavior in pg_dump that annoyed me a little bit the last few
>>  times I had to deal with it:
>>
>>  Consider that you have to dump a specific namespace only; you are going to use
>>
>>  pg_dump -n <your_schema> [-t <tables>].
>>
>>  I found it a common use case to restore this dump into a different schema
>>  by simply changing the search_path. With included ownerships this doesn't
>>  work, since pg_dump always outputs the necessary DDL as follows:
>>
>>  ALTER TABLE bernd.foo OWNER TO bernd;
>>
>>  Okay, it isn't too hard to use sed to replace the necessary statements to
>>  use the correct schema, but I think it would be much nicer if pg_dump would
>>  omit the schema-qualified table name here. I'd like to create a patch for
>>  this, if we agree on changing this behavior?
>
>The use case you mention is something that would be of value to many
>people, and I support your efforts to add a new option for this.
>
>No useful workarounds exist without flaws: i) editing with sed might
>well end up editing character data in the table(s) at the same time and
>you may never even notice. ii) reloading to the same schema (renaming
>etc) is not acceptable if the target has a production schema of that
>name already. iii) manually editing a large file is problematic.
>
>Tom has posted comments saying that you need to look at all of the places
>the schema name is used to see what we will or will not need to change.
>It's more than just altering the owner, but that doesn't mean we don't
>want it or that it's impossible.
>
>Please pursue this further.

I've been looking into this matter, although I'm a noob apropos 
PostgreSQL hacking.  What I thought was a better way was to alter 
pg_dump to accept a flag -m <masquerade_name>.  It would require the 
-n <schema_name> option or fail.

It would generate a schema dump where all the references to 
<schema_name> were replaced by <masquerade_name>.

This would allow you to easily make a copy of a schema into a new schema.

My needs are that my production database is the "public" schema, and 
each year I want to archive "fy2007", "fy2008", etc. schemas which 
have the final information for those years.  So at the end of this 
year, I want to duplicate the "public" schema into the "fy2008" 
schema, and continue with "public."

I could do the pg_dump "public", rename "public" to "fy2008" and then 
restore "public," but this requires being without "public" for a 
short interval.  It would be better for me to simply:

pg_dump -c -n public -m fy2008 database | psql database

And that would give you a completely mechanical way to duplicate a 
schema, which means I could put it in a script that users could call.
From what I've seen, it would mean finding where the schema is 
currently accessed in the code, then substituting on the -m flag.

Having already done this by manually editing the files, it really
cries out for a better procedure.

Perhaps my solution is excessive compared to the other offered 
solution, but it would have the benefit that the user would know 
precisely what he was doing by the flag setting.

-Owen


Re: Schema-qualified statements in pg_dump output

From
Simon Riggs
Date:
On Mon, 2008-07-21 at 23:53 -0400, Owen Hartnett wrote:

> It would generate a schema dump where all the references to 
> <schema_name> were replaced by <masquerade_name>.

Good idea, can I tweak that a bit?

No need to specify the name at pg_dump time.

For text files, just use an option to specify whether we change the
actual schema name and replace it with the text :PGDUMPSCHEMA.

pg_dump --relocateable-schema (or alternate option name)

Then when we reload, we just run
psql -f pgdump.file -v PGDUMPSCHEMA=newlocation
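
As a rough sketch of what such a relocatable text dump could look like: the option --relocateable-schema is only a proposal in this thread and does not exist in pg_dump, so the template below is hand-written. psql's -v option and :name interpolation are real, but psql does not interpolate variables inside quoted SQL literals, so the sketch counts the placeholders that would be left unexpanded.

```shell
#!/bin/sh
# Hypothetical template, written by hand; pg_dump has no option that
# produces this. :PGDUMPSCHEMA is the placeholder proposed above.
template=$(cat <<'EOF'
ALTER TABLE :PGDUMPSCHEMA.foo OWNER TO bernd;
ALTER TABLE :PGDUMPSCHEMA.foo ALTER COLUMN id
    SET DEFAULT nextval(':PGDUMPSCHEMA.foo_id_seq'::regclass);
EOF
)

# Lines containing the placeholder at all.
total=$(printf '%s\n' "$template" | grep -c ':PGDUMPSCHEMA')
# Lines where the placeholder sits inside a single-quoted literal,
# which psql would pass through without substitution.
quoted=$(printf '%s\n' "$template" | grep -c "'[^']*:PGDUMPSCHEMA[^']*'")

echo "lines with placeholder: $total, inside a quoted literal: $quoted"

# Reload step from the message above (needs a running server, not run here):
#   psql -f pgdump.file -v PGDUMPSCHEMA=newlocation
```

Under these assumptions, any schema reference that ends up inside a string literal would need separate handling at dump time.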

-- Simon Riggs           www.2ndQuadrant.com
   PostgreSQL Training, Services and Support



Re: Schema-qualified statements in pg_dump output

From
"Joshua D. Drake"
Date:
On Tue, 2008-07-22 at 16:58 +0100, Simon Riggs wrote:
> On Mon, 2008-07-21 at 23:53 -0400, Owen Hartnett wrote:

> No need to specify the name at pg_dump time.
> 
> For text files, just use an option to specify whether we change the
> actual schema name and replace it with the text :PGDUMPSCHEMA.
> 
>     pg_dump --relocateable-schema (or alternate option name)
> 
> Then when we reload, we just run
> 
>     psql -f pgdump.file -v PGDUMPSCHEMA=newlocation


I like the idea but would prefer no shell variable (I assume that is
what you are using above). Why not just -X target-schema=newlocation
or something like that?

Joshua D. Drake



Re: Schema-qualified statements in pg_dump output

From
Tom Lane
Date:
Simon Riggs <simon@2ndquadrant.com> writes:
> No need to specify the name at pg_dump time.
> For text files, just use an option to specify whether we change the
> actual schema name and replace it with the text :PGDUMPSCHEMA.

pg_restore is in an even worse position than pg_dump to make this happen;
it would not be able to do anything that's smarter than a sed-like
substitution.

I doubt that the original idea can be made to work, but this
"improvement" will entirely guarantee failure.

(Note: the problem is not so much with the names of the objects you're
directly creating, as with object cross-references that're embedded in
the DDL.)
        regards, tom lane


Re: Schema-qualified statements in pg_dump output

From
Simon Riggs
Date:
On Tue, 2008-07-22 at 13:35 -0400, Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > No need to specify the name at pg_dump time.
> > For text files, just use an option to specify whether we change the
> > actual schema name and replace it with the text :PGDUMPSCHEMA.
> 
> pg_restore is in an even worse position than pg_dump to make this happen;
> it would not be able to do anything that's smarter than a sed-like
> substitution.

Somebody just needs to check carefully to see what will work. I accept
there is no easy option that is materially better than sed.

I've screwed up a dump with sed before and luckily noticed. I'm not playing
Russian Roulette again. The chance of the schema name being stored
somewhere in the database seems high, on reflection.

-- Simon Riggs           www.2ndQuadrant.com
   PostgreSQL Training, Services and Support