Re: Further pg_upgrade analysis for many tables - Mailing list pgsql-hackers
From | Bruce Momjian |
---|---|
Subject | Re: Further pg_upgrade analysis for many tables |
Date | |
Msg-id | 20121129175919.GB31741@momjian.us Whole thread Raw |
In response to | Re: Further pg_upgrade analysis for many tables (Bruce Momjian <bruce@momjian.us>) |
Responses |
Re: Further pg_upgrade analysis for many tables
(Bruce Momjian <bruce@momjian.us>)
Re: Further pg_upgrade analysis for many tables (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-hackers |
On Wed, Nov 28, 2012 at 03:22:32PM -0500, Bruce Momjian wrote: > On Tue, Nov 27, 2012 at 09:35:10PM -0800, Jeff Janes wrote: > > > I tested custom format with pg_restore -j and -1, as well as text > > > restore. The winner was pg_dump -Fc | pg_restore -1; > > > > I don't have the numbers at hand, but if my relcache patch is > > accepted, then "-1" stops being faster. > > > > -1 gets rid of the AtOEXAct relcache N^2 behavior, but at the cost of > > invoking a different N^2, that one in the stats system. > > OK, here are the testing results: > > #tbls git -1 AtOEXAct both > 1 11.06 13.06 10.99 13.20 > 1000 21.71 22.92 22.20 22.51 > 2000 32.86 31.09 32.51 31.62 > 4000 55.22 49.96 52.50 49.99 > 8000 105.34 82.10 95.32 82.94 > 16000 223.67 164.27 187.40 159.53 > 32000 543.93 324.63 366.44 317.93 > 64000 1697.14 791.82 767.32 752.57 > > Up to 2k, they are all similar. 4k & 8k have the -1 patch as a win, and > 16k+ really need both patches. > > I will continue working on the -1 patch, and hopefully we can get your > AtOEXAct patch in soon. Is someone reviewing that? I have polished up the patch (attached) and it is ready for application to 9.3. Since there is no pg_dump/pg_restore pipe parallelism, I had the old cluster create per-database dump files, so I don't need to have the old and new clusters running at the same time, which would have required two port numbers and make shared memory exhaustion more likely. We now create a dump file per database, so thousands of database dump files might cause a performance problem. This also adds status output so you can see the database names as their schemas are dumped and restored. This was requested by users. I retained custom mode for pg_dump because it is measurably faster than text mode (not sure why, psql overhead?): git -Fc -Fp 1 11.04 11.08 11.02 1000 22.37 19.68 21.64 2000 32.39 28.62 31.40 4000 56.18 48.53 51.15 8000 105.15 81.23 91.84 16000 227.64 156.72 177.79 32000 542.80 323.19 371.81 64000 1711.77 789.17 865.03 Text dump files are slightly easier to debug, but probably not by much. Single-transaction restores were recommended to me over a year ago (by Magnus?), but I wanted to get pg_upgrade rock-solid before doing optimization, and now is the right time to optimize. One risk of single-transaction restores is max_locks_per_transaction exhaustion, but you will need to increase that on the old cluster for pg_dump anyway because that is done a single transaction, so the only new thing is that the new cluster might also need to adjust max_locks_per_transaction. I was able to remove split_old_dump() because pg_dumpall now produces a full global restore file and we do database dumps separately. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Attachment
pgsql-hackers by date: