Re: Further pg_upgrade analysis for many tables - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: Further pg_upgrade analysis for many tables
Date
Msg-id 20121129175919.GB31741@momjian.us
Whole thread Raw
In response to Re: Further pg_upgrade analysis for many tables  (Bruce Momjian <bruce@momjian.us>)
Responses Re: Further pg_upgrade analysis for many tables  (Bruce Momjian <bruce@momjian.us>)
Re: Further pg_upgrade analysis for many tables  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Wed, Nov 28, 2012 at 03:22:32PM -0500, Bruce Momjian wrote:
> On Tue, Nov 27, 2012 at 09:35:10PM -0800, Jeff Janes wrote:
> > > I tested custom format with pg_restore -j and -1, as well as text
> > > restore.  The winner was pg_dump -Fc | pg_restore -1;
> >
> > I don't have the numbers at hand, but if my relcache patch is
> > accepted, then "-1" stops being faster.
> >
> > -1 gets rid of the AtOEXAct relcache N^2 behavior, but at the cost of
> > invoking a different N^2, that one in the stats system.
>
> OK, here are the testing results:
>
>     #tbls       git     -1    AtOEXAct  both
>         1      11.06   13.06   10.99   13.20
>      1000      21.71   22.92   22.20   22.51
>      2000      32.86   31.09   32.51   31.62
>      4000      55.22   49.96   52.50   49.99
>      8000     105.34   82.10   95.32   82.94
>     16000     223.67  164.27  187.40  159.53
>     32000     543.93  324.63  366.44  317.93
>     64000    1697.14  791.82  767.32  752.57
>
> Up to 2k, they are all similar.  4k & 8k have the -1 patch as a win, and
> 16k+ really need both patches.
>
> I will continue working on the -1 patch, and hopefully we can get your
> AtOEXAct patch in soon.  Is someone reviewing that?

I have polished up the patch (attached) and it is ready for application
to 9.3.

Since there is no pg_dump/pg_restore pipe parallelism, I had the old
cluster create per-database dump files, so I don't need to have the old
and new clusters running at the same time, which would have required two
port numbers and make shared memory exhaustion more likely.

We now create a dump file per database, so thousands of database dump
files might cause a performance problem.

This also adds status output so you can see the database names as their
schemas are dumped and restored.  This was requested by users.

I retained custom mode for pg_dump because it is measurably faster than
text mode (not sure why, psql overhead?):

                git     -Fc     -Fp
        1      11.04   11.08   11.02
     1000      22.37   19.68   21.64
     2000      32.39   28.62   31.40
     4000      56.18   48.53   51.15
     8000     105.15   81.23   91.84
    16000     227.64  156.72  177.79
    32000     542.80  323.19  371.81
    64000    1711.77  789.17  865.03

Text dump files are slightly easier to debug, but probably not by much.

Single-transaction restores were recommended to me over a year ago (by
Magnus?), but I wanted to get pg_upgrade rock-solid before doing
optimization, and now is the right time to optimize.

One risk of single-transaction restores is max_locks_per_transaction
exhaustion, but you will need to increase that on the old cluster for
pg_dump anyway because that is done a single transaction, so the only
new thing is that the new cluster might also need to adjust
max_locks_per_transaction.

I was able to remove split_old_dump() because pg_dumpall now produces a
full global restore file and we do database dumps separately.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +

Attachment

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Bugs in CREATE/DROP INDEX CONCURRENTLY
Next
From: Merlin Moncure
Date:
Subject: Re: json accessors