Home > mailing lists

Re: Further pg_upgrade analysis for many tables - Mailing list pgsql-hackers

From	Bruce Momjian
Subject	Re: Further pg_upgrade analysis for many tables
Date	November 14, 2012 00:04:05
Msg-id	20121114000351.GA13888@momjian.us Whole thread Raw
In response to	Re: Further pg_upgrade analysis for many tables (Ants Aasma <ants@cybertec.at>)
Responses	Re: Further pg_upgrade analysis for many tables Re: Further pg_upgrade analysis for many tables Re: Further pg_upgrade analysis for many tables
List	pgsql-hackers

Tree view

On Tue, Nov 13, 2012 at 05:44:54AM +0200, Ants Aasma wrote:
> On Mon, Nov 12, 2012 at 10:59 PM, Bruce Momjian <bruce@momjian.us> wrote:
> > You can see a significant speedup with those loops removed.  The 16k
> > case is improved, but still not linear.  The 16k dump/restore scale
> > looks fine, so it must be something in pg_upgrade, or in the kernel.
>
> I can confirm the speedup. Profiling results for 9.3 to 9.3 upgrade
> for 8k and 64k tables are attached. pg_upgrade itself is now taking
> negligible time.

I generated these timings from the attached test script.

                 -------------------------- 9.3 ------------------------
                 ---- normal ----  -- binary_upgrade --  -- pg_upgrade -
                - dmp -   - res -   - dmp -   - res -    git       patch
        1         0.12      0.07      0.13      0.07     11.06     11.02
     1000         2.20      2.46      3.57      2.82     19.15     18.61
     2000         4.51      5.01      8.22      5.80     29.12     26.89
     4000         8.97     10.88     14.76     12.43     45.87     43.08
     8000        15.30     24.72     30.57     27.10    100.31     79.75
    16000        36.14     54.88     62.27     61.69    248.03    167.94
    32000        55.29    162.20    115.16    179.15    695.05    376.84
    64000       149.86    716.46    265.77    724.32   2323.73   1122.38

You can see the speedup of the patch, particularly for a greater number
of tables, e.g. 2x faster for 64k tables.

> The 64k profile shows the AtEOXact_RelationCache scaling problem. For
> the 8k profile nothing really pops out as a clear bottleneck. CPU time
> distributes 83.1% to postgres, 4.9% to pg_dump, 7.4% to psql and 0.7%
> to pg_upgrade.

At 64k I see pg_upgrade taking 12% of the duration time, if I subtract
out the dump/restore times.

I am attaching an updated pg_upgrade patch, which I believe is ready for
application for 9.3.

> Postgres time itself breaks down with 10% for shutdown checkpoint and
> 90% for regular running, consisting of 16% parsing, 13% analyze, 20%
> plan, 30% execute, 11% commit (AtEOXact_RelationCache) and 6% network.

That SVG graph was quite impressive.

> It looks to me that most benefit could be had from introducing more
> parallelism. Are there any large roadblocks to pipelining the dump and
> restore to have them happen in parallel?

I talked to Andrew Dustan about parallelization in pg_restore.  First,
we currently use pg_dumpall, which isn't in the custom format required
for parallel restore, but if we changed to custom format, create table
isn't done in parallel, only create index/check constraints, and trigger
creation, etc.  Not sure if it worth perusing this just for pg_upgrade.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +

Attachment

pgsql-hackers by date:

From: "David Johnston"
Date: 13 November 2012, 23:57:12
Subject: Re: Add contrib module functions to docs' function index

From: Craig Ringer
Date: 14 November 2012, 00:10:42
Subject: Re: Add contrib module functions to docs' function index

Re: Further pg_upgrade analysis for many tables - Mailing list pgsql-hackers

Attachment

Previous

Next