Thread: pg_restore > 10 million large objects
Hi all,

There have been some questions about pg_dump and huge numbers of large objects recently. I have a query about the opposite. How can restoring a database with a lot of large objects run faster?

My database has a relatively piddling 13 million large objects, so dumping it isn't a problem. Restoring it is a problem though. This is for a migration from 8.4 to 9.3. The dump is taken using pg_dump from 9.3.

I've run a test on a significantly smaller test system: ~4GB overall, and 1.1 million large objects. It took 2 hours, give or take. The server it's on isn't especially fast though.

It seems that each "SELECT pg_catalog.lo_create('xxxxx');" is run independently and sequentially, despite having --jobs=8 specified. Is there any magic incantation, or animal sacrifice, I can make to get those lo_create() calls to run in parallel? Our 9.3 production servers have 12 cores (plus HT) and SSDs, so can do many queries at the same time.

Thanks

--
Mike Williams
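
P.S. For reference, the dump and restore are invoked more or less like this; the host and database names below are just placeholders (--jobs needs a custom or directory format archive, hence the -Fc):

    pg_dump -Fc -h old-84-server -f mydb.dump mydb
    pg_restore --jobs=8 -h new-93-server -d mydb mydb.dump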
On Mon, Dec 23, 2013 at 7:19 AM, Mike Williams <mike.williams@comodo.com> wrote:
> How can restoring a database with a lot of large objects run faster?
> It seems that each "SELECT pg_catalog.lo_create('xxxxx');" is run
> independently and sequentially, despite having --jobs=8 specified.
I don't have an answer for why the restore seems to be serialized, but have you considered creating your pg_dump (-Fc) while excluding all the lobs, then dumping or COPYing the large objects out separately so you can import them with a manually-specified number of processes? By "manually specified", I mean executing a number of COPY FROM commands in separate threads, along the lines of the sketch below.
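
Very roughly, and completely untested, something like this. The host and database names and the 4-way split are made up, and it assumes you're a superuser on both servers; large object ownership and ACLs won't come across this way, so you'd have to fix those up afterwards.

    # 1. Schema and table data without the blobs: with 9.3's pg_dump, naming
    #    schemas with -n makes it skip large objects unless -b is also given.
    pg_dump -Fc -h old-84-server -n public -f nolobs.dump mydb
    pg_restore --jobs=8 -h new-93-server -d mydb nolobs.dump

    # 2. Pull the large objects off the old server: the list of OIDs, plus the
    #    raw pages from pg_largeobject split into 4 files by OID.  The hop
    #    through text/bigint is just to get a type with a modulo operator that
    #    stays positive for big OIDs.
    psql -At -h old-84-server -c "SELECT DISTINCT loid FROM pg_largeobject" mydb > loids.txt
    for i in 0 1 2 3; do
        psql -h old-84-server mydb \
          -c "\copy (SELECT loid, pageno, data FROM pg_largeobject WHERE (loid::text)::bigint % 4 = $i) TO 'lobs.$i.dat'" &
    done
    wait

    # 3. Recreate the large objects on the 9.3 server, one chunk of the OID
    #    list per session, each chunk wrapped in a single transaction (-1).
    split -l $(( $(wc -l < loids.txt) / 4 + 1 )) loids.txt loids.part.
    for f in loids.part.*; do
        sed 's/.*/SELECT lo_create(&);/' "$f" | psql -q -1 -h new-93-server mydb &
    done
    wait

    # 4. Load the pages back in, again one slice per session; COPY directly
    #    into pg_largeobject needs superuser.
    for i in 0 1 2 3; do
        psql -h new-93-server mydb \
          -c "\copy pg_largeobject (loid, pageno, data) FROM 'lobs.$i.dat'" &
    done
    wait

The -1 in step 3 also batches all those lo_create() calls into one transaction per worker, which should help quite a bit even before the parallelism does.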