Re: Further pg_upgrade analysis for many tables - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: Further pg_upgrade analysis for many tables
Msg-id 20121109230641.GC26605@momjian.us
In response to Re: Further pg_upgrade analysis for many tables  (Jeff Janes <jeff.janes@gmail.com>)
Responses Re: Further pg_upgrade analysis for many tables  (Jeff Janes <jeff.janes@gmail.com>)
List pgsql-hackers
On Thu, Nov  8, 2012 at 08:59:21PM -0800, Jeff Janes wrote:
> On Thu, Nov 8, 2012 at 4:33 PM, Bruce Momjian <bruce@momjian.us> wrote:
> > On Thu, Nov  8, 2012 at 03:46:09PM -0800, Jeff Janes wrote:
> >> On Wed, Nov 7, 2012 at 6:17 PM, Bruce Momjian <bruce@momjian.us> wrote:
> >> > As a followup to Magnus's report that pg_upgrade was slow for many
> >> > tables, I did some more testing with many tables, e.g.:
> >> >
> >> ...
> >> >
> >> > Any ideas?  I am attaching my test script.
> >>
> >> Have you reviewed the thread at:
> >> http://archives.postgresql.org/pgsql-performance/2012-09/msg00003.php
> >> ?
> >>
> >> There is a known N^2 behavior when using pg_dump against pre-9.3 servers.
> >
> > I am actually now dumping git head/9.3, so I assume all the problems we
> > know about should be fixed.
>
> Are you sure the server you are dumping out of is head?

Well, I tested again with 9.2 dumping/loading 9.2 and the same for git
head, and got these results:

                   pg_dump            restore
               9.2       git       9.2       git

        1      0.13      0.11      0.07      0.07
     1000      4.37      3.98      4.32      5.28
     2000     12.98     12.19     13.64     14.25
     4000     47.85     50.14     61.31     70.97
     8000    210.39    183.00    302.67    294.20
    16000    901.53    769.83   1399.25   1359.09

As you can see, there is very little difference between 9.2 and git
head, except maybe at the 16k level for pg_dump.

Is there some slowdown with a mismatched version dump/reload?  I am
attaching my test script.
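(The attachment is not reproduced in the archive; a minimal sketch of that kind of test — hypothetical table names, N tables each with a SERIAL column, generated into a SQL file and then dumped schema-only — might look like this:)

```shell
# Hypothetical sketch of the kind of test script described above; the
# actual attachment is not shown here.  Generates N CREATE TABLE
# statements, each with a SERIAL column, into a SQL file.
N=1000
OUT=/tmp/many_tables.sql
: > "$OUT"
for i in $(seq 1 "$N"); do
    echo "CREATE TABLE t$i (id SERIAL PRIMARY KEY);" >> "$OUT"
done
# Against a scratch cluster one would then load and time, e.g.:
#   createdb test && psql -q -f "$OUT" test
#   time pg_dump --schema-only test > /dev/null
```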

> Using head's pg_dump, but 9.2.1 server, it takes me 179.11 seconds to
> dump 16,000 tables (schema only) like your example, and it is
> definitely quadratic.

Are you using a SERIAL column for the tables?  I am, and Peter
Eisentraut reported that was a big slowdown.
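(For context, a SERIAL column implicitly creates and attaches a sequence, so a test schema of N such tables really gives pg_dump roughly 2N objects to process.  It behaves approximately like:

```sql
CREATE TABLE t1 (id SERIAL PRIMARY KEY);
-- is roughly shorthand for:
CREATE SEQUENCE t1_id_seq;
CREATE TABLE t1 (id integer PRIMARY KEY DEFAULT nextval('t1_id_seq'));
ALTER SEQUENCE t1_id_seq OWNED BY t1.id;
```
)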

> But using head's pg_dump to dump tables out of head's server, it only
> took 24.95 seconds, and the quadratic term is not yet important,
> things still look linear.

Again, using SERIAL?

> But even the 179.11 seconds is several times faster than your report
> of 757.8, so I'm not sure what is going on there.  I don't think my
> laptop is particularly fast:
>
> Intel(R) Pentium(R) CPU B960 @ 2.20GHz

I am using server-grade hardware, Xeon E5620 2.4GHz:

    http://momjian.us/main/blogs/pgblog/2012.html#January_20_2012

> Is the next value, increment, etc. for a sequence stored in a catalog,
> or are they stored in the 8kb file associated with each sequence?  If

Each sequence is stored in its own 1-row 8k table:

    test=> CREATE SEQUENCE seq;
    CREATE SEQUENCE

    test=> SELECT * FROM seq;
    -[ RECORD 1 ]-+--------------------
    sequence_name | seq
    last_value    | 1
    start_value   | 1
    increment_by  | 1
    max_value     | 9223372036854775807
    min_value     | 1
    cache_value   | 1
    log_cnt       | 0
    is_cycled     | f
    is_called     | f

> they are stored in the file, then it is a shame that pg_dump goes to the
> effort of extracting that info if pg_upgrade is just going to
> overwrite it anyway.

Actually, pg_upgrade needs pg_dump to restore all those sequence values.
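(Roughly speaking, the dump restores each sequence's state with a setval() call, something like:

```sql
CREATE SEQUENCE seq
    START WITH 1
    INCREMENT BY 1
    NO MINVALUE
    NO MAXVALUE
    CACHE 1;

SELECT pg_catalog.setval('seq', 1, false);
```
)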

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +

