Re: Further pg_upgrade analysis for many tables - Mailing list pgsql-hackers

From: Jeff Janes
Subject: Re: Further pg_upgrade analysis for many tables
Msg-id: CAMkU=1yFfcfVDsX4qK0aOJocsMEDqE+=w=LQDzUvRmD4PNKGXQ@mail.gmail.com
In response to: Re: Further pg_upgrade analysis for many tables (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: Further pg_upgrade analysis for many tables (Jeff Janes <jeff.janes@gmail.com>)
           Re: Further pg_upgrade analysis for many tables (Simon Riggs <simon@2ndQuadrant.com>)
List: pgsql-hackers
On Thu, Nov 8, 2012 at 9:50 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Jeff Janes <jeff.janes@gmail.com> writes:
>> Are you sure the server you are dumping out of is head?
>
> I experimented a bit with dumping/restoring 16000 tables matching
> Bruce's test case (ie, one serial column apiece).  The pg_dump profile
> seems fairly flat, without any easy optimization targets.  But
> restoring the dump script shows a rather interesting backend profile:
>
> samples  %        image name               symbol name
> 30861    39.6289  postgres                 AtEOXact_RelationCache
> 9911     12.7268  postgres                 hash_seq_search
...
>
> There are at least three ways we could whack that mole:
>
> * Run the psql script in --single-transaction mode, as I was mumbling
> about the other day.  If we were doing AtEOXact_RelationCache only once,
> rather than once per CREATE TABLE statement, it wouldn't be a problem.
> Easy but has only a narrow scope of applicability.

That is effective when loading into 9.3 (assuming you make
max_locks_per_transaction large enough).  But when loading into a
pre-9.3 server, using --single-transaction will trigger the quadratic
behavior in the resource owner/lock table and make things worse rather
than better.
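
For concreteness, a minimal sketch of that mode (dump.sql, newdb, and
the max_locks_per_transaction value are placeholders):

    # In postgresql.conf on the target, raise max_locks_per_transaction first;
    # a single restore transaction must hold a lock on every created relation.
    #   max_locks_per_transaction = 2048    # default is 64
    psql --single-transaction -f dump.sql newdb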


But there is still the question of how people can start using 9.3 if
they can't use pg_upgrade, or use the pg_dump half of a dump/restore,
in order to get there.

It seems to me that pg_upgrade takes some pains to ensure that no one
else attaches to the database during its operation.  In that case, is
it necessary to run the entire dump in a single transaction in order
to get a consistent picture?  The attached crude patch adds a
--pg_upgrade flag that lets pg_dump forgo the single transaction (and
thus avoid accumulating a huge number of locks).
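
Roughly, the invocation would look like this with the patch applied
(the --pg_upgrade flag exists only in the attached patch; the other
options and the database name are illustrative):

    # dump the schema without holding all table locks in one long transaction;
    # safe only because pg_upgrade prevents concurrent access to the cluster
    pg_dump --pg_upgrade --schema-only olddb > dump.sql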

This seems to remove the quadratic behavior of running pg_dump against
pre-9.3 servers.  Run time is linear up to 30,000 tables, each with a
single serial column, at about 1.5 msec per table.

I have no evidence other than a gut feeling that this is a safe thing to do.

I've also tested Tatsuo-san's group-"LOCK TABLE" patch against this
case, and it is of minimal help.  The problem is that there is no
syntax for locking sequences, so they cannot be explicitly locked as a
group; instead they are implicitly locked one by one and so still
suffer from the quadratic behavior.
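
To illustrate the asymmetry, here is a sketch (table and sequence
names are made up, and the exact error wording may differ by version):

    LOCK TABLE t1, t2, t3 IN ACCESS SHARE MODE;   -- tables: lockable as one batch
    LOCK TABLE t1_id_seq IN ACCESS SHARE MODE;    -- a sequence: rejected,
                                                  -- "t1_id_seq" is not a table

So the table locks can be batched into one statement, but each
sequence is still locked implicitly, one at a time.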

Cheers,

Jeff

