Re: Further pg_upgrade analysis for many tables - Mailing list pgsql-hackers
From: Bruce Momjian
Subject: Re: Further pg_upgrade analysis for many tables
Msg-id: 20121110171558.GB31383@momjian.us
In response to: Re: Further pg_upgrade analysis for many tables (Jeff Janes <jeff.janes@gmail.com>)
Responses: Re: Further pg_upgrade analysis for many tables
List: pgsql-hackers
On Fri, Nov 9, 2012 at 04:23:40PM -0800, Jeff Janes wrote:
> On Fri, Nov 9, 2012 at 3:06 PM, Bruce Momjian <bruce@momjian.us> wrote:
> > On Thu, Nov 8, 2012 at 08:59:21PM -0800, Jeff Janes wrote:
> >> On Thu, Nov 8, 2012 at 4:33 PM, Bruce Momjian <bruce@momjian.us> wrote:
> >> >
> >> > I am actually now dumping git head/9.3, so I assume all the problems we
> >> > know about should be fixed.
> >>
> >> Are you sure the server you are dumping out of is head?
> >
> > Well, I tested again with 9.2 dumping/loading 9.2 and the same for git
> > head, and got these results:
> >
> >                pg_dump              restore
> >              9.2      git          9.2       git
> >
> >      1       0.13     0.11         0.07      0.07
> >   1000       4.37     3.98         4.32      5.28
> >   2000      12.98    12.19        13.64     14.25
> >   4000      47.85    50.14        61.31     70.97
> >   8000     210.39   183.00       302.67    294.20
> >  16000     901.53   769.83      1399.25   1359.09
>
> For pg_dump, there are 4 possible combinations, not just two.  You can
> use 9.2's pg_dump to dump from a 9.2 server, use git's pg_dump to dump
> from a 9.2 server, use git's pg_dump to dump from a git server, or use
> 9.2's pg_dump to dump from a git server (although that last one isn't
> very relevant).

True, but I thought doing matching versions was a sufficient test.

> >> Using head's pg_dump, but a 9.2.1 server, it takes me 179.11 seconds to
> >> dump 16,000 tables (schema only) like your example, and it is
> >> definitely quadratic.
> >
> > Are you using a SERIAL column for the tables?  I am, and Peter
> > Eisentraut reported that was a big slowdown.
>
> Yes, I'm using the same table definition as your example.

OK.

> >> But using head's pg_dump to dump tables out of head's server, it only
> >> took 24.95 seconds, and the quadratic term is not yet important;
> >> things still look linear.
> >
> > Again, using SERIAL?
>
> Yep.

Odd why yours is so much faster.

> >> they are stored in the file, then it is a shame that pg_dump goes to the
> >> effort of extracting that info if pg_upgrade is just going to
> >> overwrite it anyway.
> >
> > Actually, pg_upgrade needs pg_dump to restore all those sequence values.
>
> I did an experiment where I had pg_dump just output dummy values
> rather than hitting the database.  Once pg_upgrade moves the relation
> files over, the dummy values disappear and are set back to their
> originals.  So I think that pg_upgrade depends on pg_dump only in a
> trivial way -- they need to be there, but it doesn't matter what they
> are.

Oh, wow, I had not thought of that.  Once we move the sequence files
into place from the old cluster, whatever was assigned to the sequence
counter by the pg_dump restore is thrown away.  Good point.

I am hesitant to add an optimization to pg_dump to fix this unless we
decide that pg_dump uses sequences in some non-optimal way that would
not warrant us improving general sequence creation performance.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +
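[Editor's note: for readers who want to reproduce the kind of measurement discussed above, here is a minimal sketch of such a test: create many tables, each with a SERIAL column, then time a schema-only pg_dump. The database name, table shape, and script structure are illustrative assumptions, not the exact definitions used in the thread.]

    #!/bin/bash
    # Sketch only: create N tables with a SERIAL column and time a
    # schema-only dump.  Names (many_tables, t$i, columns x/y) are
    # hypothetical, chosen just to mimic the scenario in the thread.

    N=${1:-1000}          # number of tables to create
    DB=many_tables

    createdb "$DB"

    # Generate all CREATE TABLE statements and feed them to one psql
    # session, so setup cost is paid once and the dump is what we time.
    for i in $(seq 1 "$N"); do
        echo "CREATE TABLE t$i (x SERIAL, y TEXT);"
    done | psql -q -d "$DB"

    # Time the schema-only dump; per-table and per-sequence overhead in
    # pg_dump shows up here as N grows.
    time pg_dump --schema-only -f /dev/null "$DB"

Running it with increasing N (1000, 2000, 4000, ...) and plotting the times is enough to see whether the dump scales linearly or quadratically with the table count.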