Re: pg_restore seems very slow - Mailing list pgsql-performance

From Adrian Myers
Subject Re: pg_restore seems very slow
Date
Msg-id CALAd6+0CS_uaknFwdwRC-dDovhcuErcPU2ygX9vgELDdLsZH4w@mail.gmail.com
Whole thread Raw
In response to Re: pg_restore seems very slow  ("David G. Johnston" <david.g.johnston@gmail.com>)
Responses Re: pg_restore seems very slow  (Adam Scott <adam.c.scott@gmail.com>)
List pgsql-performance
Hi David,

Thank you for your reply. Yes, there is quite a lot of feedback in the terminal. I can see a small flurry of table operations followed by hours of table contents being printed, presumably as they are inserted. I didn't use the --verbose option, but it seems to be echoing everything it is doing.

I haven't seen any errors, and I was able to restore a couple very small tables successfully, so it seems like the process is valid. The problem is that pg_restore is running for extremely long periods of time on even modestly large tables and I can't tell if the optimizations I am trying, such as the -j concurrency option, are having any effect.

Thanks,
Adrian

On Wed, Jun 15, 2016 at 6:08 PM, David G. Johnston <david.g.johnston@gmail.com> wrote:
On Wed, Jun 15, 2016 at 6:00 PM, Adrian Myers <hadrianmyers@gmail.com> wrote:
This is my first post to the mailing list, so I apologize for any etiquette issues.

I have a few databases that I am trying to move from one system to another.  Both systems are running Windows 7 and Postgres 8.4, and they are pretty powerful machines (40-core Xeon workstations with decent hardware across the board). While the DBs vary in size, I'm working right now with one that is roughly 50 tables and probably 75M rows, and is about 300MB on disk when exported via pg_dump. 

I am exporting and restoring using these commands (on separate sytems):
pg_dump -F c mydb > mydb.dump
pg_restore -C -j 10 mydb.dump

The dump process runs in about a minute and seems fine. The restore process has already been running for around 7 hours.

Yesterday, I tried restoring a larger DB that is roughly triple the dimensions listed above, and it ran for over 16 hours without completing.

I followed the advice given at http://www.databasesoup.com/2014/09/settings-for-fast-pgrestore.html and set the conf settings as directed and restarted the server.

You can see in the command line that I am trying to use the -j parameter for parallelism, but I don't see much evidence of that in Task Manager. CPU load is consistently 1 or 2% and only a couple cores seem to be doing anything, there certainly aren't 10 cpu-bound cores. I'm not sure where to look for pg_restore's disk I/O, but there is an entry for pg_restore in Task Manager/Processes which shows almost no I/O Read Bytes and 0 I/O Write Bytes. Since that's just the parent process that might make sense but I don't see much activity elsewhere either.

Is there something simple that I am missing here? Does the -j flag not work in 8.4 and I should use --jobs? It just seems like none of the CPU or RAM usage I'd expect from this process are evident, it's taking many times longer than I would expect, and I don't know how to verify if the things I'm trying are working or not.

Any insight would be appreciated!


​Did any databases restore properly?

Are there any message in logs or on the terminal​?  You should add the "--verbose" option to your pg_restore command to help provoke this.

-C can be problematic at times.  Consider manually ensuring the desired target database exists and is setup correctly (matches the original) and then do a non-create restoration to it specifically.

-j should work fine in 8.4 (according to the docs)

You need to get to a point where you are seeing feedback from the pg_restore process.  Once you get it telling you what it is doing (or trying to do) then diagnosing can begin.

​David J.


pgsql-performance by date:

Previous
From: "David G. Johnston"
Date:
Subject: Re: pg_restore seems very slow
Next
From: Adam Scott
Date:
Subject: Re: pg_restore seems very slow