pg_dump additional options for performance

From: Stephen Frost
Simon,

  I agree with adding these options in general, since I find myself
  frustrated by having to vi huge dumps to change simple schema things.
  A couple of comments on the patch though:

  - Conflicting option handling
    I think we are doing our users a disservice by putting it on them to
    figure out exactly what:
    multiple object groups cannot be used together
    means to them.  You and I may understand what an "object group" is,
    and why there can be only one, but it's a great deal less clear than
    the prior message of
    options -s/--schema-only and -a/--data-only cannot be used together
    My suggestion would be to either list out the specific options which
    can't be used together, as was done previously, or add a bit of (I
    realize, boring) code and actually tell the user which of the
    conflicting options were used (see the sketch below).

  - Documentation
    When writing the documentation I would stress that "pre-schema" and
    "post-schema" be defined in terms of PostgreSQL objects and why they
    are pre vs. post.

  - Technically, the patch needs to be updated slightly since another
    pg_dump-related patch was committed recently which also added
    options and thus causes a conflict.
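
  On the first point, a couple of explicit pairwise checks would be
  enough.  A minimal sketch (variable names hypothetical, not taken from
  the patch; write_msg() is what pg_dump already uses to report errors):

    if (schemaOnly && schemaPreLoad)
    {
        write_msg(NULL, "options -s/--schema-only and --schema-pre-load "
                        "cannot be used together\n");
        exit(1);
    }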

  Beyond those minor points, the patch looks good to me.

      Thanks,

        Stephen


Re: pg_dump additional options for performance

From: Simon Riggs
On Sat, 2008-07-19 at 23:07 -0400, Stephen Frost wrote:
> Simon,
>
>   I agree with adding these options in general, since I find myself
>   frustrated by having to vi huge dumps to change simple schema things.
>   A couple of comments on the patch though:
>
>   - Conflicting option handling
>     I think we are doing our users a disservice by putting it on them to
>     figure out exactly what:
>     multiple object groups cannot be used together
>     means to them.  You and I may understand what an "object group" is,
>     and why there can be only one, but it's a great deal less clear than
>     the prior message of
>     options -s/--schema-only and -a/--data-only cannot be used together
>     My suggestion would be to either list out the specific options which
>     can't be used together, as was done previously, or add a bit of (I
>     realize, boring) code and actually tell the user which of the
>     conflicting options were used.
>
>   - Documentation
>     When writing the documentation I would stress that "pre-schema" and
>     "post-schema" be defined in terms of PostgreSQL objects and why they
>     are pre vs. post.
>
>   - Technically, the patch needs to be updated slightly since another
>     pg_dump-related patch was committed recently which also added
>     options and thus causes a conflict.
>
>   Beyond those minor points, the patch looks good to me.

Thanks for the review. I'll make the changes you suggest.

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


Re: pg_dump additional options for performance

From: Simon Riggs
On Sun, 2008-07-20 at 05:47 +0100, Simon Riggs wrote:
> On Sat, 2008-07-19 at 23:07 -0400, Stephen Frost wrote:
> > Simon,
> >
> >   I agree with adding these options in general, since I find myself
> >   frustrated by having to vi huge dumps to change simple schema things.
> >   A couple of comments on the patch though:
> >
> >   - Conflicting option handling
> >     I think we are doing our users a disservice by putting it on them to
> >     figure out exactly what:
> >     multiple object groups cannot be used together
> >     means to them.  You and I may understand what an "object group" is,
> >     and why there can be only one, but it's a great deal less clear than
> >     the prior message of
> >     options -s/--schema-only and -a/--data-only cannot be used together
> >     My suggestion would be to either list out the specific options which
> >     can't be used together, as was done previously, or add a bit of (I
> >     realize, boring) code and actually tell the user which of the
> >     conflicting options were used.
> >
> >   - Documentation
> >     When writing the documentation I would stress that "pre-schema" and
> >     "post-schema" be defined in terms of PostgreSQL objects and why they
> >     are pre vs. post.
> >
> >   - Technically, the patch needs to be updated slightly since another
> >     pg_dump-related patch was committed recently which also added
> >     options and thus causes a conflict.
> >
> >   Beyond those minor points, the patch looks good to me.
>
> Thanks for the review. I'll make the changes you suggest.

Patch updated to head, plus changes/docs requested.

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


Re: pg_dump additional options for performance

From: Stephen Frost
Simon,

* Simon Riggs (simon@2ndquadrant.com) wrote:
> On Sun, 2008-07-20 at 05:47 +0100, Simon Riggs wrote:
> > On Sat, 2008-07-19 at 23:07 -0400, Stephen Frost wrote:
[...]
> > >   - Conflicting option handling

Thanks for putting in the extra code to explicitly indicate which
conflicting options were used.

> > >   - Documentation
> > >     When writing the documentation I would stress that "pre-schema" and
> > >     "post-schema" be defined in terms of PostgreSQL objects and why they
> > >     are pre vs. post.

Perhaps this is up for some debate, but I find the documentation added
for these options to be lacking the definitions I was looking for, and
the explanation of why they are what they are.  I'm also not sure I
agree with the "Pre-Schema" and "Post-Schema" nomenclature as it doesn't
really fit with the option names or what they do.  Would you consider:

<term><option>--schema-pre-load</option></term>
<listitem>
 <para>
   Pre-Data Load - Minimum amount of the schema required before data
   loading can begin.  This consists mainly of creating the tables
   using <command>CREATE TABLE</command> statements.
   This part can be requested using <option>--schema-pre-load</>.
 </para>
</listitem>

<term><option>--schema-post-load</option></term>
<listitem>
 <para>
   Post-Data Load - The rest of the schema definition, including keys,
   indexes, etc.  By putting keys and indexes after the data has been
   loaded the whole process of restoring data is much faster.  This is
   because it is faster to build indexes and check keys in bulk than
   piecemeal as the data is loaded.
   This part can be requested using <option>--schema-post-load</>.
 </para>
</listitem>

Even this doesn't cover everything though- it's too focused on tables
and data loading.  Where do functions go?  What about types?

A couple of additional points:

  - The command-line help hasn't been updated.  Clearly, that also needs
    to be done to consider the documentation aspect complete.

  - There appears to be a bit of mistakenly included additions.  The
    patch to pg_restore.sgml attempts to add in documentation for
    --superuser.  I'm guessing that was unintentional, and looks like
    just a mistaken extra copy&paste.

> > >   - Technically, the patch needs to be updated slightly since another
> > >     pg_dump-related patch was committed recently which also added
> > >     options and thus causes a conflict.

I think this might have just happened again, funny enough.  It's
something that a committer could perhaps fix, but if you're reworking
the patch anyway...

    Thanks,

        Stephen


Re: pg_dump additional options for performance

From: Simon Riggs
On Sun, 2008-07-20 at 17:43 -0400, Stephen Frost wrote:

> Perhaps this is up for some debate, but I find the documentation added
> for these options to be lacking the definitions I was looking for, and
> the explanation of why they are what they are.  I'm also not sure I
> agree with the "Pre-Schema" and "Post-Schema" nomenclature as it doesn't
> really fit with the option names or what they do.  Would you consider:

Will reword.

> Even this doesn't cover everything though- it's too focused on tables
> and data loading.  Where do functions go?  What about types?

Yes, it is focused on tables and data loading. What about
functions/types? No relevance here.

>   - The command-line help hasn't been updated.  Clearly, that also needs
>     to be done to consider the documentation aspect complete.
>
>   - There appears to be a bit of mistakenly included additions.  The
>     patch to pg_restore.sgml attempts to add in documentation for
>     --superuser.  I'm guessing that was unintentional, and looks like
>     just a mistaken extra copy&paste.

Thanks, will do.

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


Re: pg_dump additional options for performance

From: Stephen Frost
* Simon Riggs (simon@2ndquadrant.com) wrote:
> On Sun, 2008-07-20 at 17:43 -0400, Stephen Frost wrote:
> > Even this doesn't cover everything though- it's too focused on tables
> > and data loading.  Where do functions go?  What about types?
>
> Yes, it is focused on tables and data loading. What about
> functions/types? No relevance here.

I don't see how they're not relevant; it's not like they're being
excluded, and in fact they show up in the pre-load output.  Heck, even if
they *were* excluded, that should be made clear in the documentation
(either by an explicit include list, or by saying they're excluded).

Part of what's driving this is making sure we have a plan for future
objects and where they'll go.  Perhaps it would be enough to just say
"pre-load is everything in the schema, except things which are faster
done in bulk (eg: indexes, keys)".  I don't think it's right to say
pre-load is "only object definitions required to load data" when it
includes functions and ACLs though.

Hopefully my suggestion and these comments will get us to a happy
middle-ground.

    Thanks,

        Stephen


Re: pg_dump additional options for performance

From: daveg
On Sun, Jul 20, 2008 at 09:18:29PM -0400, Stephen Frost wrote:
> * Simon Riggs (simon@2ndquadrant.com) wrote:
> > On Sun, 2008-07-20 at 17:43 -0400, Stephen Frost wrote:
> > > Even this doesn't cover everything though- it's too focused on tables
> > > and data loading.  Where do functions go?  What about types?
> >
> > Yes, it is focused on tables and data loading. What about
> > functions/types? No relevance here.
>
> I don't see how they're not relevant, it's not like they're being
> excluded and in fact they show up in the pre-load output.  Heck, even if
> they *were* excluded, that should be made clear in the documentation
> (either by an explicit include list, or by saying they're excluded).
>
> Part of what's driving this is making sure we have a plan for future
> objects and where they'll go.  Perhaps it would be enough to just say
> "pre-load is everything in the schema, except things which are faster
> done in bulk (eg: indexes, keys)".  I don't think it's right to say
> pre-load is "only object definitions required to load data" when it
> includes functions and ACLs though.
>
> Hopefully my suggestion and these comments will get us to a happy
> middle-ground.

One observation: indexes should be built right after the table data
is loaded for each table; this way, the index build gets a hot cache
for the table data instead of having to re-read it later as we do now.

-dg



--
David Gould       daveg@sonic.net      510 536 1443    510 282 0869
If simplicity worked, the world would be overrun with insects.

Re: pg_dump additional options for performance

From: Stephen Frost
* daveg (daveg@sonic.net) wrote:
> One observation, indexes should be built right after the table data
> is loaded for each table, this way, the index build gets a hot cache
> for the table data instead of having to re-read it later as we do now.

That's not how pg_dump has traditionally worked, and the point of this
patch is to add options to easily segregate the main pieces of the
existing pg_dump output (main schema definition, data dump, key/index
building).  Your suggestion brings up an interesting point: should
pg_dump's traditional output structure change, the "--schema-post-load"
set of objects wouldn't be as clear to newcomers, since the load and the
indexes would be interleaved in the regular output.

I'd be curious about the performance impact this has on an actual load
too.  It would probably be more valuable on smaller loads where it would
have less of an impact anyway than on loads larger than the cache size.
Still, not an issue for this patch, imv.

    Thanks,

        Stephen


Re: pg_dump additional options for performance

From: Tom Lane
Stephen Frost <sfrost@snowman.net> writes:
> * daveg (daveg@sonic.net) wrote:
>> One observation, indexes should be built right after the table data
>> is loaded for each table, this way, the index build gets a hot cache
>> for the table data instead of having to re-read it later as we do now.

> That's not how pg_dump has traditionally worked, and the point of this
> patch is to add options to easily segregate the main pieces of the
> existing pg_dump output (main schema definition, data dump, key/index
> building).  Your suggestion brings up an interesting point: should
> pg_dump's traditional output structure change, the "--schema-post-load"
> set of objects wouldn't be as clear to newcomers, since the load and the
> indexes would be interleaved in the regular output.

Yeah.  Also, that is pushing into an entirely different line of
development, which is to enable multithreaded pg_restore.  The patch
at hand is necessarily incompatible with that type of operation, and
wouldn't be used together with it.

As far as the documentation/definition aspect goes, I think it should
just say the parts are
    * stuff needed before you can load the data
    * the data
    * stuff needed after loading the data
and not try to be any more specific than that.  There are corner cases
that will turn any simple breakdown into a lie, and I doubt that it's
worth trying to explain them all.  (Take a close look at the dependency
loop breaking logic in pg_dump if you doubt this.)

I hadn't realized that Simon was using "pre-schema" and "post-schema"
to name the first and third parts.  I'd agree that this is confusing
nomenclature: it looks like it's trying to say that the data is the
schema, and the schema is not!  How about "pre-data" and "post-data"?

            regards, tom lane

Re: pg_dump additional options for performance

From: Simon Riggs
On Sun, 2008-07-20 at 21:18 -0400, Stephen Frost wrote:
> * Simon Riggs (simon@2ndquadrant.com) wrote:
> > On Sun, 2008-07-20 at 17:43 -0400, Stephen Frost wrote:
> > > Even this doesn't cover everything though- it's too focused on tables
> > > and data loading.  Where do functions go?  What about types?
> >
> > Yes, it is focused on tables and data loading. What about
> > functions/types? No relevance here.
>
> I don't see how they're not relevant, it's not like they're being
> excluded and in fact they show up in the pre-load output.  Heck, even if
> they *were* excluded, that should be made clear in the documentation
> (either by an explicit include list, or by saying they're excluded).
>
> Part of what's driving this is making sure we have a plan for future
> objects and where they'll go.  Perhaps it would be enough to just say
> "pre-load is everything in the schema, except things which are faster
> done in bulk (eg: indexes, keys)".  I don't think it's right to say
> pre-load is "only object definitions required to load data" when it
> includes functions and ACLs though.
>
> Hopefully my suggestion and these comments will get us to a happy
> middle-ground.

I don't really understand what you're saying.

The options just split the dump into three parts: before the load, the
load, and after the load.

--schema-pre-load says
"Dumps exactly what <option>--schema-only</> would dump, but only those
statements before the data load."

What is it you are suggesting? I'm unclear.

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


Re: pg_dump additional options for performance

From: Simon Riggs
On Sun, 2008-07-20 at 23:34 -0400, Tom Lane wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > * daveg (daveg@sonic.net) wrote:
> >> One observation, indexes should be built right after the table data
> >> is loaded for each table, this way, the index build gets a hot cache
> >> for the table data instead of having to re-read it later as we do now.
>
> > That's not how pg_dump has traditionally worked, and the point of this
> > patch is to add options to easily segregate the main pieces of the
> > existing pg_dump output (main schema definition, data dump, key/index
> > building).  Your suggestion brings up an interesting point: should
> > pg_dump's traditional output structure change, the "--schema-post-load"
> > set of objects wouldn't be as clear to newcomers, since the load and the
> > indexes would be interleaved in the regular output.

Stephen: Agreed.

> Yeah.  Also, that is pushing into an entirely different line of
> development, which is to enable multithreaded pg_restore.  The patch
> at hand is necessarily incompatible with that type of operation, and
> wouldn't be used together with it.
>
> As far as the documentation/definition aspect goes, I think it should
> just say the parts are
>     * stuff needed before you can load the data
>     * the data
>     * stuff needed after loading the data
> and not try to be any more specific than that.  There are corner cases
> that will turn any simple breakdown into a lie, and I doubt that it's
> worth trying to explain them all.  (Take a close look at the dependency
> loop breaking logic in pg_dump if you doubt this.)

Tom: Agreed.

> I hadn't realized that Simon was using "pre-schema" and "post-schema"
> to name the first and third parts.  I'd agree that this is confusing
> nomenclature: it looks like it's trying to say that the data is the
> schema, and the schema is not!  How about "pre-data" and "post-data"?

OK by me. Any other takers?



I also suggested having three options
--want-pre-schema
--want-data
--want-post-schema
so we could ask for any or all parts in the one dump. --data-only and
--schema-only are negative options so don't allow this.
(I don't like those names either, just thinking about capabilities)

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


Re: pg_dump additional options for performance

From: Tom Lane
Simon Riggs <simon@2ndquadrant.com> writes:
> I also suggested having three options
> --want-pre-schema
> --want-data
> --want-post-schema
> so we could ask for any or all parts in the one dump. --data-only and
> --schema-only are negative options so don't allow this.
> (I don't like those names either, just thinking about capabilities)

Maybe invert the logic?

    --omit-pre-data
    --omit-data
    --omit-post-data

Not wedded to these either, just tossing out an idea...

            regards, tom lane

Re: pg_dump additional options for performance

From: Stephen Frost
* Simon Riggs (simon@2ndquadrant.com) wrote:
> The options just split the dump into three parts: before the load, the
> load, and after the load.
>
> --schema-pre-load says
> "Dumps exactly what <option>--schema-only</> would dump, but only those
> statements before the data load."
>
> What is it you are suggesting? I'm unclear.

That part is fine, the problem is that elsewhere in the documentation
(patch line starting ~774 before, ~797 after, to the pg_dump.sgml) you
change it to be "objects required before data loading", which isn't the
same.

    Thanks,

        Stephen


Re: pg_dump additional options for performance

From: Andrew Dunstan

Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
>
>> I also suggested having three options
>> --want-pre-schema
>> --want-data
>> --want-post-schema
>> so we could ask for any or all parts in the one dump. --data-only and
>> --schema-only are negative options so don't allow this.
>> (I don't like those names either, just thinking about capabilities)
>>
>
> Maybe invert the logic?
>
>     --omit-pre-data
>     --omit-data
>     --omit-post-data
>
> Not wedded to these either, just tossing out an idea...
>
>
>

Please, no. Negative logic seems likely to cause endless confusion.

I'd even be happier with --schema-part-1 and --schema-part-2 if we can't
find some more expressive way of designating them.

cheers

andrew

Re: pg_dump additional options for performance

From: Tom Lane
Andrew Dunstan <andrew@dunslane.net> writes:
> Tom Lane wrote:
>> Maybe invert the logic?
>> --omit-pre-data
>> --omit-data
>> --omit-post-data

> Please, no. Negative logic seems likely to cause endless confusion.

I think it might actually be less confusing, because with this approach,
each switch has an identifiable default (no) and setting it doesn't
cause side-effects on settings of other switches.  The interactions of
the switches as Simon presents 'em seem less than obvious.

            regards, tom lane

Re: pg_dump additional options for performance

From: Simon Riggs
On Mon, 2008-07-21 at 07:46 -0400, Stephen Frost wrote:
> * Simon Riggs (simon@2ndquadrant.com) wrote:
> > The options just split the dump into three parts: before the load, the
> > load, and after the load.
> >
> > --schema-pre-load says
> > "Dumps exactly what <option>--schema-only</> would dump, but only those
> > statements before the data load."
> >
> > What is it you are suggesting? I'm unclear.
>
> That part is fine, the problem is that elsewhere in the documentation
> (patch line starting ~774 before, ~797 after, to the pg_dump.sgml) you
> change it to be "objects required before data loading", which isn't the
> same.

OK, gotcha now - will change that. I thought you might mean something
about changing the output itself.

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


Re: pg_dump additional options for performance

From: Stephen Frost
Tom,

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> As far as the documentation/definition aspect goes, I think it should
> just say the parts are
>     * stuff needed before you can load the data
>     * the data
>     * stuff needed after loading the data
> and not try to be any more specific than that.  There are corner cases
> that will turn any simple breakdown into a lie, and I doubt that it's
> worth trying to explain them all.  (Take a close look at the dependency
> loop breaking logic in pg_dump if you doubt this.)

Even that is a lie though, which I guess is what my problem is.  It's
really "everything for the schema, except stuff that is better done in
bulk", I believe.  Also, I'm a bit concerned about people who would
argue that you need PKs and FKs before you can load the data.  Probably
couldn't be avoided, though.

> I hadn't realized that Simon was using "pre-schema" and "post-schema"
> to name the first and third parts.  I'd agree that this is confusing
> nomenclature: it looks like it's trying to say that the data is the
> schema, and the schema is not!  How about "pre-data" and "post-data"?

Argh.  The command-line options follow the 'data'/'load' line
(--schema-pre-load and --schema-post-load), and so I think those are
fine.  The problem was that in the documentation he switched to saying
they were "Pre-Schema" and "Post-Schema", which could lead to confusion.

    Thanks,

        Stephen


Re: pg_dump additional options for performance

From: Stephen Frost
Simon,

* Simon Riggs (simon@2ndquadrant.com) wrote:
> > I hadn't realized that Simon was using "pre-schema" and "post-schema"
> > to name the first and third parts.  I'd agree that this is confusing
> > nomenclature: it looks like it's trying to say that the data is the
> > schema, and the schema is not!  How about "pre-data" and "post-data"?
>
> OK by me. Any other takers?

Having the command-line options be "--schema-pre-data" and
"--schema-post-data" is fine with me.  Leaving them the way they are is
also fine by me.  It's the documentation (back to pg_dump.sgml,
~774/~797) that starts talking about Pre-Schema and Post-Schema.

    Thanks,

        Stephen


Re: pg_dump additional options for performance

From: Tom Lane
Stephen Frost <sfrost@snowman.net> writes:
> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
>> As far as the documentation/definition aspect goes, I think it should
>> just say the parts are
>> * stuff needed before you can load the data
>> * the data
>> * stuff needed after loading the data

> Even that is a lie though, which I guess is what my problem is.

True; the stuff done after is done that way at least in part for
performance reasons rather than because it has to be done that way.
(I think it's not only performance issues, though --- for circular
FKs you pretty much have to load the data first.)

>> I hadn't realized that Simon was using "pre-schema" and "post-schema"
>> to name the first and third parts.  I'd agree that this is confusing
>> nomenclature: it looks like it's trying to say that the data is the
>> schema, and the schema is not!  How about "pre-data" and "post-data"?

> Argh.  The command-line options follow the 'data'/'load' line
> (--schema-pre-load and --schema-post-load), and so I think those are
> fine.  The problem was that in the documentation he switched to saying
> they were "Pre-Schema" and "Post-Schema", which could lead to confusion.

Ah, I see.  No objection to those switch names, at least assuming we
want to stick to positive-logic switches.  What did you think of the
negative-logic suggestion (--omit-xxx)?

            regards, tom lane

Re: pg_dump additional options for performance

From: Stephen Frost
Tom, et al,

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Ah, I see.  No objection to those switch names, at least assuming we
> want to stick to positive-logic switches.  What did you think of the
> negative-logic suggestion (--omit-xxx)?

My preference is for positive-logic switches in general.  The place
where I would use this patch would lend itself to being more options if
--omit-xxxx were used.  I expect that would hold true for most people.
It would be:

  --omit-data --omit-post-load
  --omit-pre-load --omit-post-load
  --omit-pre-load --omit-data

vs.

  --schema-pre-load
  --data-only
  --schema-post-load

Point being that I'd be dumping these into separate files where I could
more easily manipulate the pre-load or post-load files.  I'd still want
pre/post load to be separate though, since this would be used in cases
where there's a lot of data (hence the reason for the split), and putting
pre and post together and running them before data would slow things
down quite a bit.

Are there use cases for just --omit-post-load or --omit-pre-load?
Probably, but I just don't see any situation where I'd use them like
that.

    Thanks,

        Stephen


Re: pg_dump additional options for performance

From: Tom Lane
Stephen Frost <sfrost@snowman.net> writes:
> Are there use cases for just --omit-post-load or --omit-pre-load?

Probably not many.  The thing that's bothering me is the
action-at-a-distance property of the positive-logic switches.
How are we going to explain this?

    "By default, --schema-pre-load, --data-only, --schema-post-load
    are all ON.  But if you turn one of them ON (never mind that
    it was already ON by default), that changes the defaults for
    the other two to OFF.  Then you have to turn them ON (never
    mind that the default for them is ON) if you want two out of
    the three categories."

You have to bend your mind into a pretzel to wrap it around this
behavior.  Yeah, it might be convenient once you understand it,
but how long will it take for the savings in typing to repay the
time to understand it and the mistakes along the way?
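
To spell out those semantics in code form (a hypothetical sketch, not
the patch's actual variables):

    /* every section defaults to ON ... */
    bool    before = true,
            data = true,
            after = true;

    /* ... but any explicit selection turns the unmentioned ones OFF */
    if (schemaPreLoad || dataOnly || schemaPostLoad)
    {
        before = schemaPreLoad;
        data = dataOnly;
        after = schemaPostLoad;
    }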

            regards, tom lane

Re: pg_dump additional options for performance

From: Simon Riggs
On Mon, 2008-07-21 at 19:19 -0400, Tom Lane wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > Are there use cases for just --omit-post-load or --omit-pre-load?
>
> Probably not many.  The thing that's bothering me is the
> action-at-a-distance property of the positive-logic switches.
> How are we going to explain this?
>
>     "By default, --schema-pre-load, --data-only, --schema-post-load
>     are all ON.  But if you turn one of them ON (never mind that
>     it was already ON by default), that changes the defaults for
>     the other two to OFF.  Then you have to turn them ON (never
>     mind that the default for them is ON) if you want two out of
>     the three categories."

While I accept your argument to a certain extent, --schema-only and
--data-only already behave in the manner you describe. Whether we pick
include or exclude or both, it will make more sense than these existing
options, regrettably.

With regard to the logic, INSERT and COPY also behave this way: if you
mention *any* columns then you only get the ones you mention. We manage
to describe that also. An INSERT statement would be very confusing if
you had to list the columns you don't want.

So the --omit options seem OK if you assume we'll never add further
options or include additional SQL in the dump. But that seems an
unreliable prop, so I am inclined towards the inclusive approach.

> You have to bend your mind into a pretzel to wrap it around this
> behavior.

Perhaps my mind was already toroidally challenged? :-}

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


Re: pg_dump additional options for performance

From: Simon Riggs
On Mon, 2008-07-21 at 07:56 -0400, Stephen Frost wrote:
> Simon,
>
> * Simon Riggs (simon@2ndquadrant.com) wrote:
> > > I hadn't realized that Simon was using "pre-schema" and "post-schema"
> > > to name the first and third parts.  I'd agree that this is confusing
> > > nomenclature: it looks like it's trying to say that the data is the
> > > schema, and the schema is not!  How about "pre-data" and "post-data"?
> >
> > OK by me. Any other takers?
>
> Having the command-line options be "--schema-pre-data" and
> "--schema-post-data" is fine with me.  Leaving them the way they are is
> also fine by me.  It's the documentation (back to pg_dump.sgml,
> ~774/~797) that starts talking about Pre-Schema and Post-Schema.

OK, Mr. Reviewer, sir:

* patch redone using --schema-before-data and --schema-after-data

* docs rewritten using short clear descriptions using only the words
"before" and "after", like in the option names

* all variable names changed

* retested

So prefixes "pre" and "post" no longer appear anywhere. No Latin-derived
phrases, just good ol' Anglo-Saxon words.

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


Re: pg_dump additional options for performance

From: Simon Riggs
On Wed, 2008-07-23 at 17:40 +0100, Simon Riggs wrote:
> On Mon, 2008-07-21 at 07:56 -0400, Stephen Frost wrote:
> > Simon,
> >
> > * Simon Riggs (simon@2ndquadrant.com) wrote:
> > > > I hadn't realized that Simon was using "pre-schema" and "post-schema"
> > > > to name the first and third parts.  I'd agree that this is confusing
> > > > nomenclature: it looks like it's trying to say that the data is the
> > > > schema, and the schema is not!  How about "pre-data" and "post-data"?
> > >
> > > OK by me. Any other takers?
> >
> > Having the command-line options be "--schema-pre-data" and
> > "--schema-post-data" is fine with me.  Leaving them the way they are is
> > also fine by me.  It's the documentation (back to pg_dump.sgml,
> > ~774/~797) that starts talking about Pre-Schema and Post-Schema.
>
> OK, Mr. Reviewer, sir:
>
> * patch redone using --schema-before-data and --schema-after-data
>
> * docs rewritten using short clear descriptions using only the words
> "before" and "after", like in the option names
>
> * all variable names changed
>
> * retested
>
> So prefixes "pre" and "post" no longer appear anywhere. No Latin-derived
> phrases, just good ol' Anglo-Saxon words.

...and with command line help also.

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


Re: pg_dump additional options for performance

From: Stephen Frost
Simon,

* Simon Riggs (simon@2ndquadrant.com) wrote:
> ...and with command line help also.

The documentation and whatnot looks good to me now.  There are a couple
of other issues I found while looking through and testing the patch
though:

Index: src/bin/pg_dump/pg_dump.c
===================================================================
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/bin/pg_dump/pg_dump.c,v
retrieving revision 1.497
diff -c -r1.497 pg_dump.c
*** src/bin/pg_dump/pg_dump.c    20 Jul 2008 18:43:30 -0000    1.497
--- src/bin/pg_dump/pg_dump.c    23 Jul 2008 17:04:24 -0000
***************
*** 225,232 ****
      RestoreOptions *ropt;

      static int    disable_triggers = 0;
!     static int  outputNoTablespaces = 0;
      static int    use_setsessauth = 0;

      static struct option long_options[] = {
          {"data-only", no_argument, NULL, 'a'},
--- 229,238 ----
      RestoreOptions *ropt;

      static int    disable_triggers = 0;
!     static int outputNoTablespaces = 0;
      static int    use_setsessauth = 0;
+     static int    use_schemaBeforeData;
+     static int    use_schemaAfterData;

      static struct option long_options[] = {
          {"data-only", no_argument, NULL, 'a'},
***************

This hunk appears to have a bit of gratuitous whitespace change, not a
big deal tho.

***************
*** 464,474 ****
[...]
+     if (dataOnly)
+         dumpObjFlags = REQ_DATA;
+
+     if (use_schemaBeforeData == 1)
+         dumpObjFlags = REQ_SCHEMA_BEFORE_DATA;
+
+     if (use_schemaAfterData == 1)
+         dumpObjFlags = REQ_SCHEMA_AFTER_DATA;
+
+     if (schemaOnly)
+         dumpObjFlags = (REQ_SCHEMA_BEFORE_DATA | REQ_SCHEMA_AFTER_DATA);
***************

It wouldn't kill to be consistent between testing for '== 1' and just
checking for non-zero.  Again, not really a big deal, and I wouldn't
mention these if there weren't other issues.

***************
*** 646,652 ****
       * Dumping blobs is now default unless we saw an inclusion switch or -s
       * ... but even if we did see one of these, -b turns it back on.
       */
!     if (include_everything && !schemaOnly)
          outputBlobs = true;

      /*
--- 689,695 ----
       * Dumping blobs is now default unless we saw an inclusion switch or -s
       * ... but even if we did see one of these, -b turns it back on.
       */
!     if (include_everything && WANT_PRE_SCHEMA(dumpObjFlags))
          outputBlobs = true;

      /*
***************

Shouldn't this change be to "WANT_DATA(dumpObjFlags)"?  That's what most
of the '!schemaOnly' tests get translated to.  Otherwise I think you would be
getting blobs when you've asked for just schema-before-data, which
doesn't seem like it'd make much sense.

***************
*** 712,718 ****
      dumpStdStrings(g_fout);

      /* The database item is always next, unless we don't want it at all */
!     if (include_everything && !dataOnly)
          dumpDatabase(g_fout);

      /* Now the rearrangeable objects. */
--- 755,761 ----
      dumpStdStrings(g_fout);

      /* The database item is always next, unless we don't want it at all */
!     if (include_everything && WANT_DATA(dumpObjFlags))
          dumpDatabase(g_fout);

      /* Now the rearrangeable objects. */
***************

Shouldn't this be 'WANT_PRE_SCHEMA(dumpObjFlags)'?

***************
*** 3414,3420 ****
              continue;

          /* Ignore indexes of tables not to be dumped */
!         if (!tbinfo->dobj.dump)
              continue;

          if (g_verbose)
--- 3459,3465 ----
              continue;

          /* Ignore indexes of tables not to be dumped */
!         if (!tbinfo->dobj.dump || !WANT_POST_SCHEMA(dumpObjFlags))
              continue;

          if (g_verbose)
***************

I didn't test this, but it strikes me as an unnecessary addition?  If
anything, wouldn't this check make more sense being done right after
dropping into getIndexes()?  No sense going through the loop just for
fun..  Technically, it's a behavioral change for --data-only since it
used to gather index information anyway, but it's a good optimization if
done in the right place.

Also around here, there doesn't appear to be any checking in
dumpEnumType(), which strikes me as odd.  Wouldn't that deserve a

if (!WANT_PRE_SCHEMA(dumpObjFlags))
    return;

check?  If not, even some kind of equivalent ->dobj.dump check..

***************
*** 9803,9809 ****
                      tbinfo->dobj.catId, 0, tbinfo->dobj.dumpId);
      }

!     if (!schemaOnly)
      {
          resetPQExpBuffer(query);
          appendPQExpBuffer(query, "SELECT pg_catalog.setval(");
--- 9848,9854 ----
                      tbinfo->dobj.catId, 0, tbinfo->dobj.dumpId);
      }

!     if (WANT_PRE_SCHEMA(dumpObjFlags))
      {
          resetPQExpBuffer(query);
          appendPQExpBuffer(query, "SELECT pg_catalog.setval(");
***************

This is a mistaken logic inversion, which results in setval's not being
dumped at all when pulling out each piece separately.  It should be:

if (WANT_DATA(dumpObjFlags))

so that setval's are correctly included on the --data-only piece.  As
--data-only previously existed, this would be a regression too.

Index: src/bin/pg_dump/pg_restore.c
===================================================================
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/bin/pg_dump/pg_restore.c,v
retrieving revision 1.88
diff -c -r1.88 pg_restore.c
*** src/bin/pg_dump/pg_restore.c    13 Apr 2008 03:49:22 -0000    1.88
--- src/bin/pg_dump/pg_restore.c    23 Jul 2008 17:06:59 -0000
+     if (dataOnly)
+         dumpObjFlags = REQ_DATA;
+
+     if (use_schemaBeforeData == 1)
+         dumpObjFlags = REQ_SCHEMA_BEFORE_DATA;
+
+     if (use_schemaAfterData == 1)
+         dumpObjFlags = REQ_SCHEMA_AFTER_DATA;
+
+     if (schemaOnly)
+         dumpObjFlags = (REQ_SCHEMA_BEFORE_DATA | REQ_SCHEMA_AFTER_DATA);
+
***************

Ditto previous comment on this, but in pg_restore.c.

***************
*** 405,410 ****
--- 455,462 ----
               "                           do not restore data of tables that could not be\n"
               "                           created\n"));
      printf(_("  --no-tablespaces         do not dump tablespace assignments\n"));
+     printf(_("  --schema-before-data     dump only the part of schema before table data\n"));
+     printf(_("  --schema-after-data         dump only the part of schema after table data\n"));
      printf(_("  --use-set-session-authorization\n"
               "                           use SESSION AUTHORIZATION commands instead of\n"
               "                           OWNER TO commands\n"));
***************

I forgot to mention this on pg_dump.c, but it applies to both pg_dump
and pg_restore; I hate to be the bearer of bad news, but your
command-line help doesn't line up properly in the output.  You shouldn't be
using tabs there but instead should use spaces as the other help text
does, so everything lines up nicely.

    Thanks,

        Stephen


Re: pg_dump additional options for performance

From: Simon Riggs
On Wed, 2008-07-23 at 23:20 -0400, Stephen Frost wrote:

> * Simon Riggs (simon@2ndquadrant.com) wrote:
> > ...and with command line help also.
>
> The documentation and whatnot looks good to me now.  There are a couple
> of other issues I found while looking through and testing the patch
> though:

Thanks for a good review.

> Index: src/bin/pg_dump/pg_dump.c
> ===================================================================
> RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/bin/pg_dump/pg_dump.c,v
> retrieving revision 1.497
> diff -c -r1.497 pg_dump.c
> *** src/bin/pg_dump/pg_dump.c    20 Jul 2008 18:43:30 -0000    1.497
> --- src/bin/pg_dump/pg_dump.c    23 Jul 2008 17:04:24 -0000
> ***************
> *** 225,232 ****
>       RestoreOptions *ropt;
>
>       static int    disable_triggers = 0;
> !     static int  outputNoTablespaces = 0;
>       static int    use_setsessauth = 0;
>
>       static struct option long_options[] = {
>           {"data-only", no_argument, NULL, 'a'},
> --- 229,238 ----
>       RestoreOptions *ropt;
>
>       static int    disable_triggers = 0;
> !     static int outputNoTablespaces = 0;
>       static int    use_setsessauth = 0;
> +     static int    use_schemaBeforeData;
> +     static int    use_schemaAfterData;
>
>       static struct option long_options[] = {
>           {"data-only", no_argument, NULL, 'a'},
> ***************
>
> This hunk appears to have a bit of gratuitous whitespace change, not a
> big deal tho.

OK

> ***************
> *** 464,474 ****
> [...]
> +     if (dataOnly)
> +         dumpObjFlags = REQ_DATA;
> +
> +     if (use_schemaBeforeData == 1)
> +         dumpObjFlags = REQ_SCHEMA_BEFORE_DATA;
> +
> +     if (use_schemaAfterData == 1)
> +         dumpObjFlags = REQ_SCHEMA_AFTER_DATA;
> +
> +     if (schemaOnly)
> +         dumpObjFlags = (REQ_SCHEMA_BEFORE_DATA | REQ_SCHEMA_AFTER_DATA);
> ***************
>
> It wouldn't kill to be consistent between testing for '== 1' and just
> checking for non-zero.  Again, not really a big deal, and I wouldn't
> mention these if there weren't other issues.

OK

> ***************
> *** 646,652 ****
>        * Dumping blobs is now default unless we saw an inclusion switch or -s
>        * ... but even if we did see one of these, -b turns it back on.
>        */
> !     if (include_everything && !schemaOnly)
>           outputBlobs = true;
>
>       /*
> --- 689,695 ----
>        * Dumping blobs is now default unless we saw an inclusion switch or -s
>        * ... but even if we did see one of these, -b turns it back on.
>        */
> !     if (include_everything && WANT_PRE_SCHEMA(dumpObjFlags))
>           outputBlobs = true;
>
>       /*
> ***************
>
> Shouldn't this change be to "WANT_DATA(dumpObjFlags)"?  That's what most
> of the '!schemaOnly' tests get translated to.  Otherwise I think you would be
> getting blobs when you've asked for just schema-before-data, which
> doesn't seem like it'd make much sense.

Yes, fixed

> ***************
> *** 712,718 ****
>       dumpStdStrings(g_fout);
>
>       /* The database item is always next, unless we don't want it at all */
> !     if (include_everything && !dataOnly)
>           dumpDatabase(g_fout);
>
>       /* Now the rearrangeable objects. */
> --- 755,761 ----
>       dumpStdStrings(g_fout);
>
>       /* The database item is always next, unless we don't want it at all */
> !     if (include_everything && WANT_DATA(dumpObjFlags))
>           dumpDatabase(g_fout);
>
>       /* Now the rearrangeable objects. */
> ***************
>
> Shouldn't this be 'WANT_PRE_SCHEMA(dumpObjFlags)'?

Yes, fixed

> ***************
> *** 3414,3420 ****
>               continue;
>
>           /* Ignore indexes of tables not to be dumped */
> !         if (!tbinfo->dobj.dump)
>               continue;
>
>           if (g_verbose)
> --- 3459,3465 ----
>               continue;
>
>           /* Ignore indexes of tables not to be dumped */
> !         if (!tbinfo->dobj.dump || !WANT_POST_SCHEMA(dumpObjFlags))
>               continue;
>
>           if (g_verbose)
> ***************
>
> I didn't test this, but it strikes me as an unnecessary addition?  If
> anything, wouldn't this check make more sense being done right after
> dropping into getIndexes()?  No sense going through the loop just for
> fun..  Technically, it's a behavioral change for --data-only since it
> used to gather index information anyway, but it's a good optimization if
> done in the right place.

Agreed. I've just removed this. Patch not about optimising logic.

> Also around here, there doesn't appear to be any checking in
> dumpEnumType(), which strikes me as odd.  Wouldn't that deserve a
>
> if (!WANT_PRE_SCHEMA(dumpObjFlags))
>     return;
>
> check?  If not, even some kind of equivalent ->dobj.dump check..

Agreed. That appears to be a bug in pg_dump, since this would currently
dump enums if --data-only was specified, which in my understanding would
be wrong.

Have included this:

    /* Skip if not to be dumped */
    if (!tinfo->dobj.dump || !WANT_BEFORE_SCHEMA(dumpObjFlags))
        return;



> ***************
> *** 9803,9809 ****
>                       tbinfo->dobj.catId, 0, tbinfo->dobj.dumpId);
>       }
>
> !     if (!schemaOnly)
>       {
>           resetPQExpBuffer(query);
>           appendPQExpBuffer(query, "SELECT pg_catalog.setval(");
> --- 9848,9854 ----
>                       tbinfo->dobj.catId, 0, tbinfo->dobj.dumpId);
>       }
>
> !     if (WANT_PRE_SCHEMA(dumpObjFlags))
>       {
>           resetPQExpBuffer(query);
>           appendPQExpBuffer(query, "SELECT pg_catalog.setval(");
> ***************
>
> This is a mistaken logic inversion, which results in setval's not being
> dumped at all when pulling out each piece separately.  It should be:
>
> if (WANT_DATA(dumpObjFlags))
>
> so that setval's are correctly included on the --data-only piece.  As
> --data-only previously existed, this would be a regression too.

OK, fixed.


> Index: src/bin/pg_dump/pg_restore.c
> ===================================================================
> RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/bin/pg_dump/pg_restore.c,v
> retrieving revision 1.88
> diff -c -r1.88 pg_restore.c
> *** src/bin/pg_dump/pg_restore.c    13 Apr 2008 03:49:22 -0000    1.88
> --- src/bin/pg_dump/pg_restore.c    23 Jul 2008 17:06:59 -0000
> +     if (dataOnly)
> +         dumpObjFlags = REQ_DATA;
> +
> +     if (use_schemaBeforeData == 1)
> +         dumpObjFlags = REQ_SCHEMA_BEFORE_DATA;
> +
> +     if (use_schemaAfterData == 1)
> +         dumpObjFlags = REQ_SCHEMA_AFTER_DATA;
> +
> +     if (schemaOnly)
> +         dumpObjFlags = (REQ_SCHEMA_BEFORE_DATA | REQ_SCHEMA_AFTER_DATA);
> +
> ***************
>
> Ditto previous comment on this, but in pg_restore.c.

OK

>
> ***************
> *** 405,410 ****
> --- 455,462 ----
>                "                           do not restore data of tables that could not be\n"
>                "                           created\n"));
>       printf(_("  --no-tablespaces         do not dump tablespace assignments\n"));
> +     printf(_("  --schema-before-data     dump only the part of schema before table data\n"));
> +     printf(_("  --schema-after-data         dump only the part of schema after table data\n"));
>       printf(_("  --use-set-session-authorization\n"
>                "                           use SESSION AUTHORIZATION commands instead of\n"
>                "                           OWNER TO commands\n"));
> ***************
>
> I forgot to mention this on pg_dump.c, but it applies to both pg_dump
> and pg_restore; I hate to be the bearer of bad news, but your
> command-line help doesn't line up properly in the output.  You shouldn't be
> using tabs there but instead should use spaces as the other help text
> does, so everything lines up nicely.

OK

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


Re: pg_dump additional options for performance

From: Tom Lane
Simon Riggs <simon@2ndquadrant.com> writes:
> [80k patch]

Surely there is a whole lot of unintended noise in this patch?
I certainly don't believe that you meant to change keywords.c
for instance.

            regards, tom lane

Re: pg_dump additional options for performance

From: Simon Riggs
On Thu, 2008-07-24 at 03:54 -0400, Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > [80k patch]
>
> Surely there is a whole lot of unintended noise in this patch?
> I certainly don't believe that you meant to change keywords.c
> for instance.

Removed, thanks.

Unrelated to this patch, it seems I have some issues with my repository,
judging by this and another issue reported by Martin Zaun.

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


Re: pg_dump additional options for performance

From: Tom Lane
Simon Riggs <simon@2ndquadrant.com> writes:
> [ pg_dump_beforeafter.v6.patch ]

I looked over this patch a bit.  I have a proposal for a slightly
different way of defining the new switches:

* --schema-before-data, --data-only, and --schema-after-data can be
specified in any combination to obtain any subset of the full dump.
If none are specified (which would in itself be a useless combination)
then the default is to dump all three sections, just as if all three
were specified.

* --schema-only is defined as equivalent to specifying both
--schema-before-data and --schema-after-data.

The patch as submitted enforces what seem largely arbitrary restrictions
on combining these switches.  It made some sense before to treat
specifying both --schema-only and --data-only as an error, but it's not
clear to me why you shouldn't be able to write both --schema-before-data
and --schema-after-data, especially when there's a switch right beside
them that appears to be equivalent to that combination.  So let's just
allow all the combinations.
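
Concretely, the selection logic then reduces to OR-ing the requested
sections together, along these lines (a sketch using the patch's flag
names; the submitted code assigns dumpObjFlags instead of OR-ing):

    dumpObjFlags = 0;
    if (use_schemaBeforeData)
        dumpObjFlags |= REQ_SCHEMA_BEFORE_DATA;
    if (dataOnly)
        dumpObjFlags |= REQ_DATA;
    if (use_schemaAfterData)
        dumpObjFlags |= REQ_SCHEMA_AFTER_DATA;
    if (schemaOnly)
        dumpObjFlags |= (REQ_SCHEMA_BEFORE_DATA | REQ_SCHEMA_AFTER_DATA);

    /* no selection switches given: dump all three sections */
    if (dumpObjFlags == 0)
        dumpObjFlags = (REQ_SCHEMA_BEFORE_DATA | REQ_DATA |
                        REQ_SCHEMA_AFTER_DATA);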

The attached updated patch implements and documents this behavior,
and gets rid of the special linkage between --disable-triggers and
--data-only as previously discussed.

Unfortunately there's still a lot of work to do, and I don't feel like
doing it so I'm bouncing this patch back for further work.

The key problem is that pg_restore is broken: it emits nearly the same
output for --schema-before-data and --schema-after-data, because it
doesn't have any way to distinguish which objects in a full dump file
belong where.  This is because the filtering logic was put in the wrong
place, namely in the ArchiveEntry creation routines in pg_dump.c, when
where it really needs to happen is while scanning the TocEntry list in
RestoreArchive().  (Note: it is perhaps worth keeping the pg_dump.c
filters so as to avoid doing extra server queries for objects that we
aren't going to dump anyway, but the core logic has to be in
RestoreArchive.)

Another issue is that the rules for deciding which objects are "before
data" and which are "after data" are wrong.  In particular ACLs are after
data not before data, which is relatively easy to fix.  Not so easy to fix
is that COMMENTs might be either before or after data depending on what
kind of object they are attached to.

(BTW, what about BLOB COMMENTS?  They definitely can't be "before data".
ISTM you could make a case for them being "after data", if you think that
comments are always schema.  But there is also a case for considering
them as data, because the objects they are attached to are data.  I kind
of like the latter approach because it would create an invariant that
comments appear in the same dump section as the object commented on.
Thoughts?)

Implementing the filtering by examining the type of a TocEntry in
RestoreArchive is a bit of a PITA, but it's probably possible.  The
main bad thing about that is the need to make an explicit list of every
type of TocEntry that exists now or ever has been emitted by any past
version of pg_dump.  The design concept was that the type tags are
mainly documentation, and while we've had to bend that in places (mostly
for backward-compatibility reasons) this would be the first place we'd
have to throw it overboard completely.
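
That would mean something like this in RestoreArchive() (a sketch only;
te->desc holds the type tag, and the real list would have to cover every
tag pg_dump has ever emitted):

    /* hypothetical classification of a TocEntry by its type tag */
    if (strcmp(te->desc, "INDEX") == 0 ||
        strcmp(te->desc, "CONSTRAINT") == 0 ||
        strcmp(te->desc, "FK CONSTRAINT") == 0 ||
        strcmp(te->desc, "TRIGGER") == 0)
    {
        /* treat as "after data" */
    }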

And there's yet another issue here, which is that it's not entirely clear
that the type of an object uniquely determines whether it's before or
after data.  This might be an emergent property of the object sorting
rules, but there is certainly not anything positively guaranteeing that
the dependency-driven topological sort will produce such a result, and
especially not that that will always be true in the future.  So the
approach seems a bit fragile.

We could perhaps get rid of that problem, as well as the need to implement
object-type-determination logic, if we were to make RestoreArchive define
the groupings according to position in the TocEntry list: everything
before the first TABLE DATA or BLOB (and BLOB COMMENT?) entry is "before"
data, everything after the last one is "after" data, everything in between
is data.  Then we only need to identify object types that are considered
"data", which we already have a rule for (whether hadDumper is true).
This is pretty attractive until you stop to consider the possibility
that there aren't any data entries in an archive (ie, it was made with
--schema-only): then there's no way to identify the boundary points.
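
The scan itself would be simple enough (hypothetical sketch of
RestoreArchive() logic; "data" entries are those with hadDumper set):

    TocEntry   *te;
    TocEntry   *firstData = NULL;
    TocEntry   *lastData = NULL;

    for (te = AH->toc->next; te != AH->toc; te = te->next)
    {
        if (te->hadDumper)
        {
            if (firstData == NULL)
                firstData = te;
            lastData = te;
        }
    }

    /*
     * Entries before firstData are "before data" and entries after
     * lastData are "after data" ... but if firstData is NULL, as in a
     * --schema-only archive, there are no boundaries to find.
     */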

We could solve that problem by inserting a "dummy data" TOC entry where
the data would have appeared, but this will only work in new archive
files.  With an implementation like this, pg_restore with
--schema-before-data or --schema-after-data won't behave very nicely on a
pre-8.4 --schema-only archive file.  (Presumably it would act as though
all the objects were "before" data.)  Is that a small enough corner case
to live with in order to gain implementation simplicity and robustness?

BTW, another incomplete item is that pg_dumpall should probably be taught
to accept and pass down --schema-before-data and --schema-after-data
switches.

            regards, tom lane



Re: pg_dump additional options for performance

From: "Greg Sabino Mullane"

Tom Lane wrote:
> * --schema-before-data, --data-only, and --schema-after-data can be

I thought you were arguing for some better names at one point? Those seem
very confusing to me, especially "--schema-after-data". I know it means
"the parts of the schema that come after the data" but it could
also be read other ways, including "put the schema after the data" - which
makes no sense, but the name is not exactly intuitive either. "Pre" and "Post"
at least are slightly better, IMO. How about --pre-data-schema
and --post-data-schema? Or --pre-data-section and --post-data-section?
Or (my favorites) --pre-data-commands and --post-data-commands? As the
existing docs say, "commands" are what we are generating.

> them as data, because the objects they are attached to are data.  I kind
> of like the latter approach because it would create an invariant that
> comments appear in the same dump section as the object commented on.
> Thoughts?)

+1 on putting them next to the object commented on.

> And there's yet another issue here, which is that it's not entirely clear
> that the type of an object uniquely determines whether it's before or
> after data.

Wouldn't that be a problem with current dumps as well then?

> We could solve that problem by inserting a "dummy data" TOC entry where
> the data would have appeared, but this will only work in new archive
> files.  With an implementation like this, pg_restore with
> --schema-before-data or --schema-after-data won't behave very nicely on a
> pre-8.4 --schema-only archive file.  (Presumably it would act as though
> all the objects were "before" data.)  Is that a small enough corner case
> to live with in order to gain implementation simplicity and robustness?

I'm not comfortable with corner cases for pg_restore backwards compatibility.
What exactly would happen (worst case) in that scenario?


--
Greg Sabino Mullane greg@turnstep.com
PGP Key: 0x14964AC8 200807252235
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8



Re: pg_dump additional options for performance

From: Simon Riggs
On Fri, 2008-07-25 at 19:16 -0400, Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > [ pg_dump_beforeafter.v6.patch ]

> Unfortunately there's still a lot of work to do, and I don't feel like
> doing it so I'm bouncing this patch back for further work.

Fair enough. Thanks for the review.

> The key problem is that pg_restore is broken: it emits nearly the same
> output for --schema-before-data and --schema-after-data, because it
> doesn't have any way to distinguish which objects in a full dump file
> belong where.  This is because the filtering logic was put in the wrong
> place, namely in the ArchiveEntry creation routines in pg_dump.c, when
> where it really needs to happen is while scanning the TocEntry list in
> RestoreArchive().  (Note: it is perhaps worth keeping the pg_dump.c
> filters so as to avoid doing extra server queries for objects that we
> aren't going to dump anyway, but the core logic has to be in
> RestoreArchive.)

My feeling is that this would take the patch off-track.

The key capability here is being able to split the dump into multiple
pieces. The equivalent capability on restore is *not* required, because
once the dump has been split the restore never needs to be. It might
seem that the patch should be symmetrical with respect to pg_dump and
pg_restore, but I see no use case for the pg_restore case.

The title of this email confirms that as the original intention.

> I looked over this patch a bit.  I have a proposal for a slightly
> different way of defining the new switches:
>
> * --schema-before-data, --data-only, and --schema-after-data can be
> specified in any combination to obtain any subset of the full dump.
> If none are specified (which would in itself be a useless combination)
> then the default is to dump all three sections, just as if all three
> were specified.
>
> * --schema-only is defined as equivalent to specifying both
> --schema-before-data and --schema-after-data.
>
> The patch as submitted enforces what seem largely arbitrary restrictions
> on combining these switches.  It made some sense before to treat
> specifying both --schema-only and --data-only as an error, but it's not
> clear to me why you shouldn't be able to write both --schema-before-data
> and --schema-after-data, especially when there's a switch right beside
> them that appears to be equivalent to that combination.  So let's just
> allow all the combinations.

I had it both ways at various points in development. I'm happy with what
you propose.

> The attached updated patch implements and documents this behavior,
> and gets rid of the special linkage between --disable-triggers and
> --data-only as previously discussed.

OK

> Another issue is that the rules for deciding which objects are "before
> data" and which are "after data" are wrong.  In particular ACLs are after
> data not before data, which is relatively easy to fix.

OK

> Not so easy to fix
> is that COMMENTs might be either before or after data depending on what
> kind of object they are attached to.

Is there anything to fix? Comments are added by calls to dumpComment,
which are always made in conjunction with the dump of an object. So if
you dump the object you dump the comment. As long as objects are
correctly split out then comments will be also.

> (BTW, what about BLOB COMMENTS?  They definitely can't be "before data".
> ISTM you could make a case for them being "after data", if you think that
> comments are always schema.  But there is also a case for considering
> them as data, because the objects they are attached to are data.  I kind
> of like the latter approach because it would create an invariant that
> comments appear in the same dump section as the object commented on.
> Thoughts?)

Yes, data. I'll look at this.

> Implementing the filtering by examining the type of a TocEntry in
> RestoreArchive is a bit of a PITA, but it's probably possible.  The
> main bad thing about that is the need to make an explicit list of every
> type of TocEntry that exists now or ever has been emitted by any past
> version of pg_dump.  The design concept was that the type tags are
> mainly documentation, and while we've had to bend that in places (mostly
> for backward-compatibility reasons) this would be the first place we'd
> have to throw it overboard completely.
>
> And there's yet another issue here, which is that it's not entirely clear
> that the type of an object uniquely determines whether it's before or
> after data.  This might be an emergent property of the object sorting
> rules, but there is certainly not anything positively guaranteeing that
> the dependency-driven topological sort will produce such a result, and
> especially not that that will always be true in the future.  So the
> approach seems a bit fragile.

Don't understand that. Objects are sorted in well-defined order,
specified in pg_dump_sort.c. Essentially we are saying that (according
to current numbering)

--schema-before-data priority 1-8
--data-only          priority 9-11
--schema-after-data  priority 12+

So the sort is explicitly defined, not implicit. I can add comments to
warn that changing the priority of objects across those boundaries
would cause problems.
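
Something like this, next to the priority table in pg_dump_sort.c (a
sketch only; the object lists in parentheses are illustrative, not
taken from the actual table):

    /*
     * These priorities also define the section boundaries used by
     * --schema-before-data, --data-only and --schema-after-data.
     * Don't move an object type across a boundary without considering
     * the effect on those switches:
     *
     *   priority  1-8  : before-data (e.g. schemas, types, functions, tables)
     *   priority  9-11 : data        (e.g. table data, blobs)
     *   priority 12+   : after-data  (e.g. indexes, constraints, triggers)
     */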

> We could perhaps get rid of that problem, as well as the need to implement
> object-type-determination logic, if we were to make RestoreArchive define
> the groupings according to position in the TocEntry list: everything
> before the first TABLE DATA or BLOB (and BLOB COMMENT?) entry is "before"
> data, everything after the last one is "after" data, everything in between
> is data.  Then we only need to identify object types that are considered
> "data", which we already have a rule for (whether hadDumper is true).
> This is pretty attractive until you stop to consider the possibility
> that there aren't any data entries in an archive (ie, it was made with
> --schema-only): then there's no way to identify the boundary points.
>
> We could solve that problem by inserting a "dummy data" TOC entry where
> the data would have appeared, but this will only work in new archive
> files.  With an implementation like this, pg_restore with
> --schema-before-data or --schema-after-data won't behave very nicely on a
> pre-8.4 --schema-only archive file.  (Presumably it would act as though
> all the objects were "before" data.)  Is that a small enough corner case
> to live with in order to gain implementation simplicity and robustness?

All of the above makes me certain I want to remove these options from
pg_restore.

> BTW, another incomplete item is that pg_dumpall should probably be taught
> to accept and pass down --schema-before-data and --schema-after-data
> switches.

OK


I'm conscious that the major work proposed will take weeks to complete
and we don't know what other problems it will cause (but I'm pretty
certain it will cause some). With regard to the use case, I see little
or no benefit from either of us doing that work, and I regret that I
won't be able to complete it.

Can we prune down to the base use case to avoid this overhead? i.e. have
these options on pg_dump only?

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


Re: pg_dump additional options for performance

From
Stephen Frost
Date:
* Simon Riggs (simon@2ndquadrant.com) wrote:
> The key capability here is being able to split the dump into multiple
> pieces. The equivalent capability on restore is *not* required, because
> once the dump has been split the restore never needs to be. It might
> seem that the patch should be symmetrical with respect to pg_dump and
> pg_restore, but I see no use case for the pg_restore case.

I'm inclined to agree with this.  It might have been nice to provide a
way to split out already-created dumps, but I suspect that people who
need that probably have already figured out a way to do it (I know I
have..).  We should probably ensure that pg_restore doesn't *break* when
fed a partial dump.

> > The patch as submitted enforces what seem largely arbitrary restrictions
> > on combining these switches.
>
> I had it both ways at various points in development. I'm happy with what
> you propose.

I agree with removing the restrictions.  I don't see much in the way of
use cases, but it reduces code and doesn't cause problems.

> > Another issue is that the rules for deciding which objects are "before
> > data" and which are "after data" are wrong.  In particular ACLs are after
> > data not before data, which is relatively easy to fix.
>
> OK

This was partially why I was complaining about having documentation, and
a policy for that matter, which goes into more detail about why X is before
or after the data.  I agree that they're after today, but I don't see
any particular reason why they should be one or the other.  If we
adopted a policy like I proposed (--schema-post-data is essentially that
which uses the data and is faster done in bulk) then ACLs would be
before, and I tend to feel like it makes more sense that way.

> > Not so easy to fix
> > is that COMMENTs might be either before or after data depending on what
> > kind of object they are attached to.
>
> Is there anything to fix? Comments are added by calls to dumpComment,
> which are always made in conjunction with the dump of an object. So if
> you dump the object you dump the comment. As long as objects are
> correctly split out then comments will be also.

I agree with this, and it follows for BLOB comments - in any case, they
go with the object being dumped at the time of that object getting
dumped.  Comments make sense as an extension of the object, not as a
separate set of objects to be explicitly placed before or after the
data.

> All of the above makes me certain I want to remove these options from
> pg_restore.

I'm in agreement with this.

> > BTW, another incomplete item is that pg_dumpall should probably be taught
> > to accept and pass down --schema-before-data and --schema-after-data
> > switches.
>
> OK

I could go either way on this.

> Can we prune down to the base use case to avoid this overhead? i.e. have
> these options on pg_dump only?

Makes sense to me.

    Thanks,

        Stephen

Attachment

Re: pg_dump additional options for performance

From
Tom Lane
Date:
Simon Riggs <simon@2ndquadrant.com> writes:
> On Fri, 2008-07-25 at 19:16 -0400, Tom Lane wrote:
>> The key problem is that pg_restore is broken:

> The key capability here is being able to split the dump into multiple
> pieces. The equivalent capability on restore is *not* required, because
> once the dump has been split the restore never needs to be.

This argument is nonsense.  The typical usage of this capability, IMHO,
will be

    pg_dump -Fc >whole.dump
    pg_restore --schema-before-data whole.dump >before.sql
    pg_restore --data-only whole.dump >data.sql
    pg_restore --schema-after-data whole.dump >after.sql

followed by editing the schema pieces and then loading.  One reason
is that this gives you a consistent dump, whereas three successive
pg_dump runs could never guarantee any such thing.  Another reason
is that you may well not know when you prepare the dump that you
will need split output, because the requirement to edit the dump
is likely to be realized only when you go to load it.

In any case, why did you put the switches into pg_restore.c if you
thought it wasn't useful for pg_restore to handle them?

>> Not so easy to fix
>> is that COMMENTs might be either before or after data depending on what
>> kind of object they are attached to.

> Is there anything to fix?

Well, yeah.  If you attach a comment to an after-data object and test
--schema-after-data, you'll notice the comment is lost.

>> And there's yet another issue here, which is that it's not entirely clear
>> that the type of an object uniquely determines whether it's before or
>> after data.

> Don't understand that. Objects are sorted in well-defined order,
> specified in pg_dump_sort.

After which we do a topological sort that enforces dependency ordering.
The question to worry about is whether there can ever be a dependency
from a normally-"before" object to a normally-"after" object, which
would cause the dependency sort to move the latter in front of the
former (in one way or another).  I'm not saying that any such case can
occur today, but I don't think it's an impossibility for it to arise in
future.  I don't want this relatively minor feature to be putting limits
on what kinds of dependencies the system can have.

> I'm conscious that the major work proposed will take weeks to complete

I don't think that what I am proposing is that complicated; I would
anticipate it requiring somewhere on the order of two dozen lines of
code.  I was thinking of doing a preliminary loop through the TocEntry
list to identify the ordinal numbers of the first and last data items,
and then the main loop could compare a counter to those numbers to
decide which of the three sections it was in.  Plus you'd need another
ArchiveEntry call someplace to prepare the "dummy data" item if one was
needed.
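
In rough outline, something like this (a sketch from memory; field and
variable names may not match the actual structs):

    /* preliminary pass: find the ordinals of the first and last data items */
    TocEntry   *te;
    int         i;
    int         firstDataItem = 0;
    int         lastDataItem = 0;

    for (te = AH->toc->next, i = 1; te != AH->toc; te = te->next, i++)
    {
        if (te->hadDumper)          /* the existing rule for "data" entries */
        {
            if (firstDataItem == 0)
                firstDataItem = i;
            lastDataItem = i;
        }
    }

    /*
     * The main loop can then treat entries numbered below firstDataItem
     * as "before" data, entries above lastDataItem as "after" data, and
     * everything in between as data.
     */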

            regards, tom lane

Re: pg_dump additional options for performance

From
Tom Lane
Date:
Stephen Frost <sfrost@snowman.net> writes:
>>> Another issue is that the rules for deciding which objects are "before
>>> data" and which are "after data" are wrong.  In particular ACLs are after
>>> data not before data, which is relatively easy to fix.
>>
>> OK

> This was partially why I was complaining about having documentation, and
> a policy for that matter, which goes into more detail about why X is before
> or after the data.  I agree that they're after today, but I don't see
> any particular reason why they should be one or the other.

If a table's ACL revokes owner insert privilege, and was placed before
the data load steps, those steps would fail.  We are relying on the
default table privileges until we are done doing everything we need to
do to the tables (and perhaps to other objects; I'm not sure whether
there are any other comparable problems).

            regards, tom lane

Re: pg_dump additional options for performance

From
Stephen Frost
Date:
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > On Fri, 2008-07-25 at 19:16 -0400, Tom Lane wrote:
> >> The key problem is that pg_restore is broken:
>
> > The key capability here is being able to split the dump into multiple
> > pieces. The equivalent capability on restore is *not* required, because
> > once the dump has been split the restore never needs to be.
>
> This argument is nonsense.  The typical usage of this capability, IMHO,
> will be
>
>     pg_dump -Fc >whole.dump
>     pg_restore --schema-before-data whole.dump >before.sql
>     pg_restore --data-only whole.dump >data.sql
>     pg_restore --schema-after-data whole.dump >after.sql
>
> followed by editing the schema pieces and then loading.

I dislike, and doubt that I'd use, this approach.  At the end of the
day, it ends up processing the same (very large amount of data) multiple
times.  We have >60G dump files sometimes, and there's no way I'm going
to dump that into a single file first if I can avoid it.  What we end up
doing today is --schema-only followed by vi'ing it and splitting it up
by hand, etc., then doing a separate --data-only dump.

> One reason
> is that this gives you a consistent dump, whereas three successive
> pg_dump runs could never guarantee any such thing.

While this is technically true, in most cases people have control over
the schema bits and would likely be able to ensure that the schema
doesn't change during that time.  At that point it's only the data, which
is still done in a transactional way.

> Another reason is that you may well not know when you prepare the
> dump that you will need split output, because the requirement to edit
> the dump is likely to be realized only when you go to load it.

This is a good point.  My gut reaction is that, at least in my usage, it
would be more about "if it's larger than a gig, I might as well split it
out, just in case I need to touch something".  Honestly, it's rare that
I don't have to make *some* change.  Often that's the point of dumping
it out.

    Thanks,

        Stephen

Attachment

Re: pg_dump additional options for performance

From
Simon Riggs
Date:
On Sat, 2008-07-26 at 12:20 -0400, Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > On Fri, 2008-07-25 at 19:16 -0400, Tom Lane wrote:
> >> The key problem is that pg_restore is broken:
>
> > The key capability here is being able to split the dump into multiple
> > pieces. The equivalent capability on restore is *not* required, because
> > once the dump has been split the restore never needs to be.
>
> This argument is nonsense.
> The typical usage of this capability, IMHO, will be

Arghh! That's not my stated use case!?#*!

I want to dump tables separately for performance reasons. There are
documented tests showing 100% gains using this method. There is no gain
adding this to pg_restore. There is a gain to be had - parallelising
index creation, but this patch doesn't provide parallelisation.

Anyway, clearly time for me to stop and have a break.

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


Re: pg_dump additional options for performance

From
Tom Lane
Date:
Stephen Frost <sfrost@snowman.net> writes:
> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
>> This argument is nonsense.  The typical usage of this capability, IMHO,
>> will be
>>
>> pg_dump -Fc >whole.dump
>> pg_restore --schema-before-data whole.dump >before.sql
>> pg_restore --data-only whole.dump >data.sql
>> pg_restore --schema-after-data whole.dump >after.sql
>>
>> followed by editing the schema pieces and then loading.

> I dislike, and doubt that I'd use, this approach.  At the end of the
> day, it ends up processing the same (very large amount of data) multiple
> times.

Well, that's easily avoided: just replace the third step by restoring
directly to the target database.

pg_restore --schema-before-data whole.dump >before.sql
edit before.sql
pg_restore --schema-after-data whole.dump >after.sql
edit after.sql
psql -f before.sql target_db
pg_restore --data-only -d target_db whole.dump
psql -f after.sql target_db

            regards, tom lane

Re: pg_dump additional options for performance

From
Tom Lane
Date:
Simon Riggs <simon@2ndquadrant.com> writes:
> I want to dump tables separately for performance reasons. There are
> documented tests showing 100% gains using this method. There is no gain
> adding this to pg_restore. There is a gain to be had - parallelising
> index creation, but this patch doesn't provide parallelisation.

Right, but the parallelization is going to happen sometime, and it is
going to happen in the context of pg_restore.  So I think it's pretty
silly to argue that no one will ever want this feature to work in
pg_restore.

To extend the example I just gave to Stephen, I think a fairly probable
scenario is where you only need to tweak some "before" object
definitions, and then you could do

pg_restore --schema-before-data whole.dump >before.sql
edit before.sql
psql -f before.sql target_db
pg_restore --data-only --schema-after-data -d target_db whole.dump

which (given a parallelizing pg_restore) would do all the time-consuming
steps in a fully parallelized fashion.

            regards, tom lane

Re: pg_dump additional options for performance

From
"Joshua D. Drake"
Date:
On Sat, 2008-07-26 at 13:43 -0400, Tom Lane wrote:
> Stephen Frost <sfrost@snowman.net> writes:

> > I dislike, and doubt that I'd use, this approach.  At the end of the
> > day, it ends up processing the same (very large amount of data) multiple
> > times.
>
> Well, that's easily avoided: just replace the third step by restoring
> directly to the target database.
>
> pg_restore --schema-before-data whole.dump >before.sql
> edit before.sql
> pg_restore --schema-after-data whole.dump >after.sql
> edit after.sql
> psql -f before.sql target_db
> pg_restore --data-only -d target_db whole.dump
> psql -f after.sql target_db

It seems to me we continue to hack a solution without a clear idea of
the problems involved. There are a number of what I would consider
significant issues with the backup / restore facilities as a whole with
PostgreSQL.

1. We use text based backups, even with custom format. We need a fast
binary representation as well.

2. We have no concurrency, which means anyone with any database over 50G
has unacceptable restore times.

3. We have to continue to develop hacks to define custom utilization. Why
am I passing pre-data anything? It should be automatic. For example:

pg_backup (not dump, we aren't dumping. Dumping is usually associated
with some sort of crash or foul human behavior. We are backing up).
   pg_backup -U <user> -D database -F -f mybackup.sqlc

If I were to extract <mybackup.sqlc> I would get:

  mybackup.datatypes
  mybackup.tables
  mybackup.data
  mybackup.primary_keys
  mybackup.indexes
  mybackup.constraints
  mybackup.grants

All would be the SQL representation.

Further I could do this:

   pg_restore -U <user> -D <database> --data-types -f mybackup.sqlc

Which would restore just the SQL representation of the data types.

Or:

   pg_restore -U <user> -D <database> --tables -f mybackup.sqlc

Which would restore *only* the tables. Yes it would error if I didn't
also specify --data-types.

Further we need to have concurrency capability. Once we have restored
datatypes and tables, there is zero reason not to launch connections on
data (and then primary keys and indexes) so:

   pg_restore -U <user> -D <database> -C 4 --full -f mybackup.sqlc

Which would launch four connections to the database, and perform a full
restore per mybackup.sqlc.

Oh and pg_dumpall? It should have been removed right around the release
of 7.2, pg_dump -A please.

Anyway, I leave other peeps to flame me into oblivion.

Sincerely,

Joshua D. Drake

--
The PostgreSQL Company since 1997: http://www.commandprompt.com/
PostgreSQL Community Conference: http://www.postgresqlconference.org/
United States PostgreSQL Association: http://www.postgresql.us/
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate




Re: pg_dump additional options for performance

From
daveg
Date:
On Sat, Jul 26, 2008 at 01:56:14PM -0400, Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > I want to dump tables separately for performance reasons. There are
> > documented tests showing 100% gains using this method. There is no gain
> > adding this to pg_restore. There is a gain to be had - parallelising
> > index creation, but this patch doesn't provide parallelisation.
>
> Right, but the parallelization is going to happen sometime, and it is
> going to happen in the context of pg_restore.  So I think it's pretty
> silly to argue that no one will ever want this feature to work in
> pg_restore.
>
> To extend the example I just gave to Stephen, I think a fairly probable
> scenario is where you only need to tweak some "before" object
> definitions, and then you could do
>
> pg_restore --schema-before-data whole.dump >before.sql
> edit before.sql
> psql -f before.sql target_db
> pg_restore --data-only --schema-after-data -d target_db whole.dump
>
> which (given a parallelizing pg_restore) would do all the time-consuming
> steps in a fully parallelized fashion.

A few thoughts about pg_restore performance:

To take advantage of non-logged copy, the create and load should be in
the same transaction.
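
For instance (a libpq sketch, not pg_restore's actual code; the table,
connection string and error handling are made up or omitted):

    #include <libpq-fe.h>

    int
    main(void)
    {
        PGconn *conn = PQconnectdb("dbname=target_db");

        /* CREATE and COPY in one transaction lets the COPY skip WAL
         * (assuming WAL archiving is not enabled) */
        PQexec(conn, "BEGIN");
        PQexec(conn, "CREATE TABLE t (id int, val text)");
        PQexec(conn, "COPY t FROM STDIN");      /* enters COPY_IN state */
        PQputCopyData(conn, "1\thello\n", 8);   /* stream the rows */
        PQputCopyEnd(conn, NULL);
        PQclear(PQgetResult(conn));             /* collect the COPY's result */
        PQexec(conn, "COMMIT");
        PQfinish(conn);
        return 0;
    }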

To take advantage of file and buffer cache, it would be good to do
indexes immediately after table data. Many tables will be small enough
to fit in cache and this will avoid re-reading them for index builds. This
effect becomes stronger with more indexes on one table. There may also be
some filesystem placement benefit to building the indexes for a table
immediately after loading the data.

The buffer and file cache advantage also applies to constraint creation,
but this is complicated by the need for indexes and data in the referenced
tables.

It seems that a high-performance restore will want to proceed in a different
order than the current sort order or that proposed by the before/data/after
patch.

 - The simplest unit of work for parallelism may be the table and its
   "decorations", e.g. indexes and relational constraints.

 - Sort tables by foreign key dependency so that referenced tables are
   loaded before referencing tables.

 - Do table creation and data load together in one transaction to use
   non-logged copy. Index builds, and constraint creation should follow
   immediately, either as part of the same transaction, or possibly
   parallelized themselves.

Table creation, data load, index builds, and constraint creation could
be packaged up as the unit of work to be done in a subprocess which either
completes or fails as a unit. The worker process would be called with
connection info, a file pointer to the data, and the DDL for the table.
pg_restore would keep a work queue of tables to be restored in FK dependency
order and also do the other schema operations such as functions and types.
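
Very roughly, the worker might look like this (all names hypothetical;
error handling, the COPY step and the queue itself are elided):

    #include <libpq-fe.h>

    /* one unit of work: a table plus its "decorations" */
    typedef struct TableWork
    {
        const char *create_sql;     /* CREATE TABLE ... */
        const char *data_path;      /* where the COPY data lives */
        const char *post_sql;       /* index builds and constraints */
    } TableWork;

    static void
    restore_one_table(const char *conninfo, const TableWork *w)
    {
        PGconn *conn = PQconnectdb(conninfo);

        PQexec(conn, "BEGIN");      /* create + load in one transaction */
        PQexec(conn, w->create_sql);
        /* ... COPY from w->data_path here ... */
        PQexec(conn, w->post_sql);  /* indexes while the table is still cached */
        PQexec(conn, "COMMIT");
        PQfinish(conn);
    }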

-dg

--
David Gould       daveg@sonic.net      510 536 1443    510 282 0869
If simplicity worked, the world would be overrun with insects.

Re: pg_dump additional options for performance

From
"Joshua D. Drake"
Date:
Simon Riggs wrote:
> On Sat, 2008-07-26 at 11:03 -0700, Joshua D. Drake wrote:
>
>> 2. We have no concurrency, which means anyone with any database over 50G
>> has unacceptable restore times.
>
> Agreed.

> Sounds good.
>
> Doesn't help with the main element of dump time: one table at a time to
> one output file. We need a way to dump multiple tables concurrently,
> ending in multiple files/filesystems.

Agreed, but that is a problem I understand with a solution I don't. I am
all eyes on a way to fix that. One thought I had (and please, be gentle
in response) was some sort of async transaction capability. I know that
libpq has the ability to send async queries. Is it possible to do this:

send async(copy table to foo)
send async(copy table to bar)
send async(copy table to baz)

Where all three copies are happening in the background?

Sincerely,

Joshua D. Drake


Re: pg_dump additional options for performance

From
Simon Riggs
Date:
On Sat, 2008-07-26 at 13:56 -0400, Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > I want to dump tables separately for performance reasons. There are
> > documented tests showing 100% gains using this method. There is no gain
> > adding this to pg_restore. There is a gain to be had - parallelising
> > index creation, but this patch doesn't provide parallelisation.
>
> Right, but the parallelization is going to happen sometime, and it is
> going to happen in the context of pg_restore.

I honestly think there is less benefit that way than if we consider
things more as a whole:

To do the data dump quickly we need to dump different tables to different
disks simultaneously. By its very nature, that cannot end with just a
single file. So the starting point for any restore must be potentially
more than one file.

There are two ways of dumping: either multi-thread pg_dump, or allow
multiple pg_dumps to work together. Second option much less work, same
result. (Either way we also need a way for multiple concurrent sessions
to share a snapshot.)

When restoring, we can then just use multiple pg_restore sessions to
restore the individual data files. Or again we can write a
multi-threaded pg_restore to do the same thing - but why bother writing
that when multiple sessions can already do it? It gains us nothing.

Parallelising the index creation seems best done using concurrent psql.
We've agreed on some mods to psql to put multi-sessions in there. If we do
that right, then we can make pg_restore generate a psql script with
multi-session commands scattered appropriately throughout.

Parallel pg_restore is a lot of work for a narrow use case. Concurrent
psql provides a much wider set of use cases.

So fully parallelising dump/restore can be achieved by

* splitting dump into pieces (this patch)
* allowing sessions to share a common snapshot
* concurrent psql
* changes to pg_restore/psql/pg_dump to allow commands to be inserted
which will use concurrent psql features

If we do things this way then we have some useful tools that can be used
in a range of use cases, not just restore.

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


Re: pg_dump additional options for performance

From
Simon Riggs
Date:
On Sat, 2008-07-26 at 11:03 -0700, Joshua D. Drake wrote:

> 2. We have no concurrency, which means anyone with any database over 50G
> has unacceptable restore times.

Agreed.

Also the core reason for wanting -w

> 3. We have to continue to develop hacks to define custom utilization. Why
> am I passing pre-data anything? It should be automatic. For example:
>
> pg_backup (not dump, we aren't dumping. Dumping is usually associated
> with some sort of crash or foul human behavior. We are backing up).
>    pg_backup -U <user> -D database -F -f mybackup.sqlc
>
> If I were to extract <mybackup.sqlc> I would get:
>
>   mybackup.datatypes
>   mybackup.tables
>   mybackup.data
>   mybackup.primary_keys
>   mybackup.indexes
>   mybackup.constraints
>   mybackup.grants

Sounds good.

Doesn't help with the main element of dump time: one table at a time to
one output file. We need a way to dump multiple tables concurrently,
ending in multiple files/filesystems.

> Oh and pg_dumpall? It should have been removed right around the release
> of 7.2, pg_dump -A please.

Good idea

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


Re: pg_dump additional options for performance

From
Andrew Dunstan
Date:

Joshua D. Drake wrote:
>
> Agreed, but that is a problem I understand with a solution I don't. I
> am all eyes on a way to fix that. One thought I had (and please, be
> gentle in response) was some sort of async transaction capability. I
> know that libpq has the ability to send async queries. Is it possible
> to do this:
>
> send async(copy table to foo)
> send async(copy table to bar)
> send async(copy table to baz)
>
> Where all three copies are happening in the background?
>
>

IIRC, libpq doesn't let you have more than one async query active at one
time.

cheers

andrew

Re: pg_dump additional options for performance

From
"Joshua D. Drake"
Date:
Andrew Dunstan wrote:
>
>
> Joshua D. Drake wrote:
>>
>> Agreed, but that is a problem I understand with a solution I don't. I
>> am all eyes on a way to fix that. One thought I had (and please, be
>> gentle in response) was some sort of async transaction capability. I
>> know that libpq has the ability to send async queries. Is it possible
>> to do this:
>>
>> send async(copy table to foo)
>> send async(copy table to bar)
>> send async(copy table to baz)
>>
>> Where all three copies are happening in the background?
>>
>>
>
> IIRC, libpq doesn't let you have more than one async query active at one
> time.

Now that I think on it harder, this isn't even a libpq problem (although
it's involved); we need the postmaster to be able to run a background
async query. Which is (I am guessing) why libpq can only do one at a time.
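
In the meantime, the only way I can see to get all three copies going
at once is one connection apiece. A rough sketch (names made up):

    /* one async COPY per connection, since each connection allows only
     * a single query in flight */
    const char *cmds[] = {"COPY foo TO STDOUT",
                          "COPY bar TO STDOUT",
                          "COPY baz TO STDOUT"};
    PGconn     *conns[3];
    int         i;

    for (i = 0; i < 3; i++)
    {
        conns[i] = PQconnectdb("dbname=mydb");
        PQsendQuery(conns[i], cmds[i]);         /* returns without waiting */
    }

    /* then poll the three sockets, draining each stream with
     * PQgetCopyData(conns[i], &buf, 1) into its own output file */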

Sincerely,

Joshua D. Drake



Re: pg_dump additional options for performance

From
Stephen Frost
Date:
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > I dislike, and doubt that I'd use, this approach.  At the end of the
> > day, it ends up processing the same (very large amount of data) multiple
> > times.
>
> Well, that's easily avoided: just replace the third step by restoring
> directly to the target database.
>
> pg_restore --schema-before-data whole.dump >before.sql
> edit before.sql
> pg_restore --schema-after-data whole.dump >after.sql
> edit after.sql
> psql -f before.sql target_db
> pg_restore --data-only -d target_db whole.dump
> psql -f after.sql target_db

This would depend on the dump being in the custom format, though I
suppose that ends up being true for any usage of these options.  I've
never really been a fan of the custom format, in large part because it
doesn't really buy you all that much and makes changing things more
difficult (by having to extract out what you want to change, and then
omit it from the restore).

I can see some advantage to having the entire dump contained in a single
file and still being able to pull out pieces based on before/after.
Should we get a binary format which is much faster, I could see myself
being more likely to use pg_restore.  Same for parallelization or, in my
fantasies, the ability to copy schema, tables, indexes, etc, in 'raw' PG
format between servers.  Worse than having to vi an insanely large file,
or split it up to be able to modify the pieces you want, is having to
rebuild indexes, especially GiST ones.  That's another topic though.

    Thanks,

        Stephen

Attachment

Re: pg_dump additional options for performance

From
Stephen Frost
Date:
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Right, but the parallelization is going to happen sometime, and it is
> going to happen in the context of pg_restore.  So I think it's pretty
> silly to argue that no one will ever want this feature to work in
> pg_restore.

I think you've about convinced me on this, and it annoys me. ;)  Worse
is that it sounds like this might cause the options to not make it in
for 8.4, which would be quite frustrating.

> To extend the example I just gave to Stephen, I think a fairly probable
> scenario is where you only need to tweak some "before" object
> definitions, and then you could do
>
> pg_restore --schema-before-data whole.dump >before.sql
> edit before.sql
> psql -f before.sql target_db
> pg_restore --data-only --schema-after-data -d target_db whole.dump
>
> which (given a parallelizing pg_restore) would do all the time-consuming
> steps in a fully parallelized fashion.

Alright, this has been mulling around in the back of my head a bit and
has now finally surfaced- I like having the whole dump contained in a
single file, but I hate having what ends up being "out-dated" or "wrong"
or "not what was loaded" in the dump file.  Doesn't seem likely to be
possible, but it'd be neat to be able to modify objects in the dump
file.

Also, something which often happens to me is that I need to change the
search_path or the role at the top of a .sql from pg_dump before
restoring it.  Seems like using the custom format would make that
difficult without some pipe/cat/sed magic.  Parallelization would make
using that kind of magic more difficult too, I would guess.  Might be
something to think about.

    Thanks,

        Stephen

Attachment

Re: pg_dump additional options for performance

From
"Joshua D. Drake"
Date:
Stephen Frost wrote:
> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
>> Stephen Frost <sfrost@snowman.net> writes:
>>> I dislike, and doubt that I'd use, this approach.  At the end of the
>>> day, it ends up processing the same (very large amount of data) multiple
>>> times.

> This would depend on the dump being in the custom format, though I
> suppose that ends up being true for any usage of these options.  I've
> never really been a fan of the custom format, in large part because it
> doesn't really buy you all that much and makes changing things more
> difficult (by having to extract out what you want to change, and then
> omit it from the restore).

Custom format rocks for partial set restores from a whole dump. See the
TOC option :)

Joshua D. Drake

Re: pg_dump additional options for performance

From
Stephen Frost
Date:
* Joshua D. Drake (jd@commandprompt.com) wrote:
> Custom format rocks for partial set restores from a whole dump. See the
> TOC option :)

I imagine it does, but that's very rarely what I need.  Most of the time
we're dumping out a schema to load it into a separate schema (usually on
another host).  Sometimes that can be done by simply vi'ing the file to
change the search_path and whatnot, though more often we end up pipe'ing
the whole thing through sed.  Since we don't allow regular users to do
much, and you have to 'set role postgres;' to do anything as superuser,
we also often end up adding 'set role postgres;' to the top of the .sql
files.

    Thanks,

        Stephen

Attachment

Re: pg_dump additional options for performance

From
chris
Date:
tgl@sss.pgh.pa.us (Tom Lane) writes:
> Simon Riggs <simon@2ndquadrant.com> writes:
>> I want to dump tables separately for performance reasons. There are
>> documented tests showing 100% gains using this method. There is no gain
>> adding this to pg_restore. There is a gain to be had - parallelising
>> index creation, but this patch doesn't provide parallelisation.
>
> Right, but the parallelization is going to happen sometime, and it is
> going to happen in the context of pg_restore.  So I think it's pretty
> silly to argue that no one will ever want this feature to work in
> pg_restore.

"Never" is a long time, agreed.

> To extend the example I just gave to Stephen, I think a fairly probable
> scenario is where you only need to tweak some "before" object
> definitions, and then you could do
>
> pg_restore --schema-before-data whole.dump >before.sql
> edit before.sql
> psql -f before.sql target_db
> pg_restore --data-only --schema-after-data -d target_db whole.dump
>
> which (given a parallelizing pg_restore) would do all the time-consuming
> steps in a fully parallelized fashion.

Do we need to wait until a fully-parallelizing pg_restore is
implemented before adding this functionality to pg_dump?

The particular extension I'm interested in from pg_dump, here, is the
ability to dump multiple tables concurrently.  I've got disk arrays
with enough I/O bandwidth that this form of parallelization does
provide a performance benefit.

The result of that will be that *many* files are generated, and I
don't imagine we want to change pg_restore to try to make it read from
multiple files concurrently.

Further, it's actually not obvious that we *necessarily* care about
parallelizing loading data.  The thing that happens every day is
backups.  I care rather a lot about optimizing that; we do backups
each and every day, and optimizations to that process will accrue
benefits each and every day.

In contrast, restoring databases does not take place every day.  When
it happens, yes, there's considerable value to making *that* go as
quickly as possible, but I'm quite willing to consider optimizing that
to be separate from optimizing backups.

I daresay I haven't used pg_restore any time recently, either.  Any
time we have thought about using it, we've concluded that the
perceivable benefits were actually more of a mirage.
--
select 'cbbrowne' || '@' || 'linuxfinances.info';
http://cbbrowne.com/info/lsf.html
Rules  of the  Evil Overlord  #145. "My  dungeon cell  decor  will not
feature exposed pipes.  While they add to the  gloomy atmosphere, they
are good  conductors of vibrations and  a lot of  prisoners know Morse
code." <http://www.eviloverlord.com/>

Re: pg_dump additional options for performance

From
Tom Lane
Date:
chris <cbbrowne@ca.afilias.info> writes:
> Do we need to wait until a fully-parallelizing pg_restore is
> implemented before adding this functionality to pg_dump?

They're independent problems ... and I would venture that parallel
dump is harder.

> Further, it's actually not obvious that we *necessarily* care about
> parallelizing loading data.  The thing that happens every day is
> backups.

Maybe so, but I would say that routine backups shouldn't be designed
to eat 100% of your disk bandwidth anyway --- they'd be more like
background tasks.

            regards, tom lane