Thread: Re: [GENERAL] pg_dump incredibly slow dumping a single schema from a large db

Mike Roest <mike.roest@replicon.com> writes:
> The file is 6 megs so I've dropped it here.
> That was doing perf for the length of the pg_dump command and then a perf
> report -n
> http://dl.dropbox.com/u/13153/output.txt

Hmm ... that's a remarkably verbose output format, but the useful part
of this info seems to be just

# Events: 2M cpu-clock
#
# Overhead  Samples            Command                  Shared Object                  Symbol
# ........ ..........  ...............  .............................  .................................................
#
    65.96%    1635392          pg_dump  pg_dump                        [.] findLoop
              |
              --- findLoop
                 |
                 |--69.66%-- findDependencyLoops
                 |          sortDumpableObjects
                 |          main
                 |          __libc_start_main
                 |
                 |--30.34%-- findLoop
                 |          findDependencyLoops
                 |          sortDumpableObjects
                 |          main
                 |          __libc_start_main
                  --0.00%-- [...]

    10.16%     251955          pg_dump  pg_dump                        [.] getTables
              |
              --- getTables
                  getSchemaData
                  main
                  __libc_start_main


The file also shows that findObjectByDumpId() accounts for only a
negligible percentage of runtime, which means that the recursion path in
findLoop() is being executed hardly at all.  After staring at that for
a while I realized that the O(N^2) part is the initial "is the
object in the workspace?" test.  The actual dependency chains are
probably never very long, but as we run through the collection of
objects we gradually add all of them to the front of the workspace.
So this is dumb; we should manage the "is the object already processed"
component of that with an O(1) check, like a bool array or some such,
rather than an O(N) search loop.
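
In rough code terms the idea is something like this (illustrative names
only, not the actual pg_dump source): swap the linear workspace scan for
a flag array indexed by dump ID, so the membership test becomes O(1).

    /*
     * Illustrative sketch only; DumpId and processed[] here are
     * stand-ins, not pg_dump's real data structures.
     */
    #include <stdbool.h>
    #include <stdlib.h>

    typedef int DumpId;

    /* O(N) version: scan the whole workspace on every call */
    static bool
    in_workspace_slow(DumpId id, const DumpId *workspace, int nUsed)
    {
        for (int i = 0; i < nUsed; i++)
            if (workspace[i] == id)
                return true;
        return false;
    }

    /* O(1) version: one flag per possible dump ID, allocated once up front */
    static bool *processed;

    static void
    init_processed(DumpId maxDumpId)
    {
        processed = calloc(maxDumpId + 1, sizeof(bool));
    }

    static bool
    in_workspace_fast(DumpId id)
    {
        return processed[id];
    }

    static void
    mark_processed(DumpId id)
    {
        processed[id] = true;
    }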

As for the getTables slowdown, the only part of that I can see that
looks to be both significant and entirely contained in getTables()
itself is the nested loop near the end that's trying to copy
the dumpable flags for owned sequences from their owning tables.
Do you have a whole lot of owned sequences?  Maybe we could postpone
that work until we have the fast table lookup array constructed,
which should reduce this from O(M*N) to O(M*logN).
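
As an illustration of that last point, a rough sketch (again with
invented names, not the real pg_dump structures): instead of scanning
all N tables for each of the M owned sequences, sort the table array by
OID once and binary-search it for each sequence's owning table.

    /*
     * Illustrative sketch only; TableInfo/SeqInfo are simplified
     * stand-ins for the real pg_dump structures.
     */
    #include <stdbool.h>
    #include <stdlib.h>

    typedef unsigned int Oid;

    typedef struct
    {
        Oid  oid;
        bool dump;          /* dumpable flag to copy to owned sequences */
    } TableInfo;

    typedef struct
    {
        Oid  owning_tab;    /* OID of the table that owns this sequence */
        bool dump;
    } SeqInfo;

    static int
    cmp_table_by_oid(const void *a, const void *b)
    {
        const TableInfo *ta = a;
        const TableInfo *tb = b;

        if (ta->oid < tb->oid)
            return -1;
        if (ta->oid > tb->oid)
            return 1;
        return 0;
    }

    static void
    copy_seq_dump_flags(TableInfo *tables, int ntables,
                        SeqInfo *seqs, int nseqs)
    {
        /* one O(N log N) sort of the table array ... */
        qsort(tables, ntables, sizeof(TableInfo), cmp_table_by_oid);

        /* ... then an O(log N) lookup per owned sequence */
        for (int i = 0; i < nseqs; i++)
        {
            TableInfo   key;
            TableInfo  *owner;

            key.oid = seqs[i].owning_tab;
            owner = bsearch(&key, tables, ntables, sizeof(TableInfo),
                            cmp_table_by_oid);
            if (owner)
                seqs[i].dump = owner->dump;
        }
    }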

            regards, tom lane

I wrote:
> So this is dumb; we should manage the "is the object already processed"
> component of that with an O(1) check, like a bool array or some such,
> rather than an O(N) search loop.

> As for the getTables slowdown, the only part of that I can see that
> looks to be both significant and entirely contained in getTables()
> itself is the nested loop near the end that's trying to copy
> the dumpable flags for owned sequences from their owning tables.
> Do you have a whole lot of owned sequences?  Maybe we could postpone
> that work until we have the fast table lookup array constructed,
> which should reduce this from O(M*N) to O(M*logN).

I've committed fixes for both these issues.  If you are in a position to
test with 9.1 branch tip from git, it'd be nice to have confirmation
that these patches actually cure your problem.  For both of them, the
issue seems to only show up in a subset of cases, which may explain why
we'd not identified the problem before.

            regards, tom lane

Cool,
  I've grabbed your changes and it seems to have completely cured the issue.

Reading user-defined tables has gone down to 9 seconds from 2.5 minutes.
Reading dependency data has gone down to 20 seconds from 5.5 minutes.

Thanks so much, Tom, this is perfect.  I'm just pulling another backup using the stock 9.1.1 pg_dump to ensure the backups are equivalent.

Any idea when 9.1.4 with this change will be out so we can pull the cluster up?


With regard to your previous question about sequences: there are 61K in the DB; it looks like our schema currently has about 115 sequences per tenant.


--Mike



> I'm just pulling another backup using the stock 9.1.1 pg_dump to ensure the backups are equivalent.

Schema & data are identical between the 2 backups.  The new backup passes all our tests for validating a tenant.

Thank you again for the quick response!

--Mike
 
Mike Roest <mike.roest@replicon.com> writes:
> Any idea when 9.1.4 with this change will be out so we can pull the cluster
> up?

Well, we just did some releases last month, so unless somebody finds a
really nasty security or data-loss issue, I'd think it will be a couple
of months before the next set.

            regards, tom lane

:)  Yeah, that makes sense, no big deal.  I'll probably just push this head build of pg_dump onto the production machines till it comes out.

Thanks again!
