Thread: How to get parallel restore in PG 8.4 to work?

How to get parallel restore in PG 8.4 to work?

From
henk de wit
Date:
Hi,
For performance reasons (obviously ;)) I'm experimenting with parallel restore in PG 8.4. I grabbed the latest source snapshot (of today, March 30) and compiled this with zlib support. I dumped a DB from PG 8.3.5 (using maximum compression). I got this message however:

postgres@mymachine:/home/henk/postgresql-8.4/bin$ time
./pg_restore -p 5434 -h localhost -U henk -d db_test -j 8 -Fc
/home/henk/test-databases/dumps/db_test.custom
pg_restore: [archiver] WARNING: archive is compressed, but this
installation does not support compression -- no data will be available
pg_restore: [archiver] cannot restore from compressed archive (compression
not supported in this installation)


So initially it seemed it's only possible to do a pg_restore using the uncompressed pg_dump custom format. So I tried the uncompressed dump, but this too failed. This last part was a little problematic anyway, since pg_restore absolutely wants to read its input from a file and does not accept any input from stdin. I assume reading from a file is necessary for the multiple parallel processes to each read their own part of the file, something which might be difficult to do when reading from stdin.

Apart from the fact that it simply doesn't work for me at the moment, I see a major problem with this approach though. Dumping in the custom format (option -Fc) is far slower than dumping in the plain format. Even if parallel restore speeds things up, the combined time of a dump and restore would still be worse than with a plain dump and restore. I'm aware that I might be hitting some bugs, as a development snapshot is by definition not stable. Also, perhaps I'm missing something.

My question is thus: could someone advise me how to get parallel restore to work and how to speed up a dump in the custom file format?

Many thanks in advance




Re: How to get parallel restore in PG 8.4 to work?

From
Tom Lane
Date:
henk de wit <henk53602@hotmail.com> writes:
> For performance reasons (obviously ;)) I'm experimenting with parallel restore in PG 8.4. I grabbed the latest source
> snapshot (of today, March 30) and compiled this with zlib support. I dumped a DB from PG 8.3.5 (using maximum
> compression). I got this message however:
> postgres@mymachine:/home/henk/postgresql-8.4/bin$ time
> ./pg_restore -p 5434 -h localhost -U henk -d db_test -j 8 -Fc
> /home/henk/test-databases/dumps/db_test.custom
> pg_restore: [archiver] WARNING: archive is compressed, but this
> installation does not support compression -- no data will be available
> pg_restore: [archiver] cannot restore from compressed archive (compression
> not supported in this installation)

As far as one can tell from here, you built *without* zlib support.
This is unrelated to parallel restore as such.

            regards, tom lane
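
One way to check this on a given build is to look at the options it was configured with, which `pg_config --configure` prints. The following is a minimal sketch, not a definitive diagnostic: `check_zlib` is a hypothetical helper name, and the sample configure line is made up for illustration, not taken from the poster's build.

```shell
# Report whether a configure-options string disables zlib.
# The text is passed as an argument so the function can be tried
# without a PostgreSQL installation at hand.
check_zlib() {
    case "$1" in
        *--without-zlib*) echo "built without zlib" ;;
        *)                echo "zlib not disabled" ;;
    esac
}

# Example with a made-up configure line (an assumption, not real output):
check_zlib "'--prefix=/home/henk/postgresql-8.4' '--without-readline' '--without-zlib'"

# Against a real installation (assumes pg_config sits next to pg_restore):
# check_zlib "$(./pg_config --configure)"
```

If the output reports that zlib was disabled, rebuilding without `--without-zlib` should let pg_restore read compressed archives.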

Re: How to get parallel restore in PG 8.4 to work?

From
henk de wit
Date:
Hi,

> henk de wit <henk53602@hotmail.com> writes:
>> For performance reasons (obviously ;)) I'm experimenting with parallel restore in PG 8.4. [...] I got this message however:
>> [...]
>> pg_restore: [archiver] WARNING: archive is compressed, but this
>> installation does not support compression -- no data will be available

> As far as one can tell from here, you built *without* zlib support.
> This is unrelated to parallel restore as such.

I see. Thanks for the confirmation. I would have sworn I built with zlib support, but obviously I did something wrong. For some reason that I can't remember now, I did omit support for readline. Could that have anything to do with it, or are those completely unrelated?

To continue testing, I imported a PG 8.3 dump in the plain format into PG 8.4, dumped this again in the custom format and imported that again into PG 8.4 using the parallel restore feature. This proved to be very beneficial. Top shows that all the cores are being used:

./pg_restore -p 5433 -h localhost -d db_test -j 8 -Fc
/ssd/tmp/test_dump.custom

top - 11:33:37 up 1 day, 18:07,  5 users,  load average: 5.63, 2.12, 0.97
Tasks: 187 total,   7 running, 180 sleeping,   0 stopped,   0 zombie
Cpu0  : 91.7%us,  8.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.3%si, 0.0%st
Cpu1  : 90.0%us,  9.3%sy,  0.0%ni,  0.7%id,  0.0%wa,  0.0%hi,  0.0%si, 0.0%st
Cpu2  : 81.5%us, 15.9%sy,  0.0%ni,  2.3%id,  0.0%wa,  0.0%hi,  0.3%si, 0.0%st
Cpu3  : 87.0%us, 10.3%sy,  0.0%ni,  2.3%id,  0.0%wa,  0.0%hi,  0.3%si, 0.0%st
Cpu4  : 91.4%us,  8.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.3%hi,  0.3%si, 0.0%st
Cpu5  : 66.8%us, 16.3%sy,  0.0%ni,  4.3%id, 11.0%wa,  0.0%hi,  1.7%si, 0.0%st
Cpu6  : 76.0%us, 12.7%sy,  0.0%ni,  0.0%id, 10.7%wa,  0.0%hi,  0.7%si, 0.0%st
Cpu7  : 97.3%us,  2.3%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.3%si, 0.0%st
Mem:  33021204k total, 32861900k used,   159304k free,       40k buffers
Swap:  7811064k total,     2164k used,  7808900k free, 29166332k cached


The performance numbers are quite amazing. The dump is approximately 19GB in size and the filesystem I use is xfs on Debian Lenny. Using the normal restore (with a single process) the time it takes to do a full restore is 45 minutes, when using 8 processes this drops to just 14 minutes and 23 seconds. Using 16 processes it drops further to just 11 minutes and 46 seconds.
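
As a rough check on those timings, the speedup over the single-process restore can be worked out with a small shell sketch. The helper names `to_seconds` and `speedup` are made up for illustration; the times are the ones quoted above.

```shell
# Convert "minutes seconds" to a number of seconds.
to_seconds() {
    echo $(( $1 * 60 + $2 ))
}

# Speedup of a parallel run relative to a baseline, both in seconds.
speedup() {
    awk -v base="$1" -v par="$2" 'BEGIN { printf "%.2f\n", base / par }'
}

base=$(to_seconds 45 0)    # single-process restore
j8=$(to_seconds 14 23)     # pg_restore -j 8
j16=$(to_seconds 11 46)    # pg_restore -j 16

echo "-j 8:  $(speedup "$base" "$j8")x"
echo "-j 16: $(speedup "$base" "$j16")x"
```

That works out to roughly 3.1x with 8 processes and 3.8x with 16, so doubling the process count beyond 8 buys comparatively little.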

I still have some work to do to find out why dumping in the custom format is so much slower. Unfortunately I forgot to time this exactly, but my feeling was that it was 'very slow'. I'll try to get some exact numbers though.

Kind regards,
Henk






Re: How to get parallel restore in PG 8.4 to work?

From
Tom Lane
Date:
henk de wit <henk53602@hotmail.com> writes:
> I still have some work to do to find out why dumping in the custom
> format is so much slower.

Offhand the only reason I can see for it to be much different from
plain-text output is that -Fc compresses by default.  If you don't
care about that, try -Fc -Z0.

            regards, tom lane
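
For reference, a custom-format dump without compression could be invoked as sketched below. `dump_command` is a hypothetical helper that only prints the command line rather than running it, and the output path is borrowed from earlier in the thread purely as an example; `-Fc`, `-Z` and `-f` are the usual pg_dump options for format, compression level and output file.

```shell
# Print (not run) a pg_dump command line for a custom-format dump
# with a chosen compression level; level 0 disables compression.
dump_command() {
    db="$1"
    out="$2"
    level="${3:-0}"    # default to no compression
    printf 'pg_dump -Fc -Z%s -f %s %s\n' "$level" "$out" "$db"
}

dump_command db_test /ssd/tmp/test_dump.custom 0
```

Printing the command first makes it easy to compare timings for different `-Z` levels by wrapping each printed line in `time`.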

Re: How to get parallel restore in PG 8.4 to work?

From
henk de wit
Date:
>> I still have some work to do to find out why dumping in the custom
>> format is so much slower.
>
> Offhand the only reason I can see for it to be much different from
> plain-text output is that -Fc compresses by default. If you don't
> care about that, try -Fc -Z0.

Ok, I did some performance testing today and I appeared to be wrong after all. My apologies for the noise.

Here are some test results:

Scenario                                        xfs            jfs patched    jfs
cat backup | gunzip | psql                      45 min         -              -
pg_dump > hdd (uncompressed) (== pg_dump -Fp)   -              -              10 min 15 sec
pg_dump -Fc > hdd (uncompressed)                10 min 20 sec  10 min 21 sec  10 min 28 sec
pg_dump -Fc | gzip > hdd                        11 min 20 sec  11 min 25 sec  12 min 04 sec
pg_restore 8 threads                            14 min 23 sec  11 min 40 sec  11 min 20 sec
pg_restore 16 threads                           11 min 46 sec  12 min 40 sec  12 min 33 sec
pg_restore 32 threads                           11 min 42 sec  12 min 30 sec  12 min 30 sec

As can be seen in the table (hope this renders correctly on the mailing list), there is barely a difference between a plain dump and a custom format dump. For those interested, xfs performs a little better than jfs here, but the difference is marginal. More on topic, beyond 16 processes there isn't any notable speed improvement for the parallel restore (as expected).

Kind regards,
Henk

