Re: [PING] [PATCH v2] parallel pg_restore: avoid disk seeks when jumping short distance forward - Mailing list pgsql-hackers
From | Chao Li |
---|---|
Subject | Re: [PING] [PATCH v2] parallel pg_restore: avoid disk seeks when jumping short distance forward |
Date | |
Msg-id | CBF70398-279C-4BA7-9FA1-2879A49F4EC9@gmail.com Whole thread Raw |
In response to | Re: [PING] [PATCH v2] parallel pg_restore: avoid disk seeks when jumping short distance forward (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: [PATCH v1] parallel pg_restore: avoid disk seeks when jumping short distance forward
Re: [PING] [PATCH v2] parallel pg_restore: avoid disk seeks when jumping short distance forward |
List | pgsql-hackers |
On Oct 12, 2025, at 09:25, Tom Lane <tgl@sss.pgh.pa.us> wrote:While playing around with the test cases for pg_dump compression,
I was startled to discover that the performance of compress_lz4's
"stream API" code is absolutely abysmal. Here is a simple test
case to demonstrate, using the regression database as test data:
$ pg_dump -Fd --compress=lz4 -f rlz4.dir regression
$ time pg_restore -f /dev/null rlz4.dir
real 0m0.023s
user 0m0.017s
sys 0m0.006s
So far so good, but now let's compress the toc.dat file:
$ lz4 -f -m --rm rlz4.dir/toc.dat
$ time pg_restore -f /dev/null rlz4.dir
real 0m1.335s
user 0m1.326s
sys 0m0.008s
Considering that lz4 prides itself on fast decompression speed,
that is not a sane result. Decompressing the file only requires
a couple ms on my machine:
$ time lz4cat rlz4.dir/toc.dat.lz4 >/dev/null
real 0m0.002s
user 0m0.000s
sys 0m0.002s
So on this example, pg_restore is something more than 600x slower
to read the TOC data than it ought to be.
I tested the patch on my MacBook M4 (Sequoia 15.6.1), compiled with -O2 optimization:
- For LZ4 performance improvement:
Without the fix (latest master):
```
chaol@ChaodeMacBook-Air tmp % pg_dump -Fd --compress=lz4 -f rlz4.dir regression
chaol@ChaodeMacBook-Air tmp % time pg_restore -f /dev/null rlz4.dir
pg_restore -f /dev/null rlz4.dir 0.03s user 0.03s system 11% cpu 0.463 total
chaol@ChaodeMacBook-Air tmp % lz4 -f -m --rm rlz4.dir/toc.dat
# It took 1.59s to restore from compressed data, much slower than from uncompressed data.
chaol@ChaodeMacBook-Air tmp % time pg_restore -f /dev/null rlz4.dir
pg_restore -f /dev/null rlz4.dir 1.59s user 0.02s system 97% cpu 1.653 total
With the fix:
```
chaol@ChaodeMacBook-Air tmp % pg_dump -Fd --compress=lz4 -f rlz4.dir regression
chaol@ChaodeMacBook-Air tmp % time pg_restore -f /dev/null rlz4.dir
pg_restore -f /dev/null rlz4.dir 0.02s user 0.03s system 16% cpu 0.305 total
chaol@ChaodeMacBook-Air tmp % lz4 -f -m --rm rlz4.dir/toc.dat
# Much faster !!!
chaol@ChaodeMacBook-Air tmp % time pg_restore -f /dev/null rlz4.dir
pg_restore -f /dev/null rlz4.dir 0.02s user 0.02s system 93% cpu 0.043 total
```
- For Gzip zero read bug:
Without the fix:
```
chaol@ChaodeMacBook-Air tmp % pg_dump -Fd --compress=gzip -f rgzip.dir regression
chaol@ChaodeMacBook-Air tmp % gzip rgzip.dir/toc.dat
# Failed
chaol@ChaodeMacBook-Air tmp % pg_restore -f /dev/null rgzip.dir
pg_restore: error: could not read from input file:
```
With the fix:
```
chaol@ChaodeMacBook-Air tmp % pg_dump -Fd --compress=gzip -f rgzip.dir regression
chaol@ChaodeMacBook-Air tmp % gzip rgzip.dir/toc.dat
# Ran successfully
chaol@ChaodeMacBook-Air tmp % pg_restore -f /dev/null rgzip.dir
```
I also reviewed the code change, basically LGTM. Just a couple of trivial comments:
1 - 0001
In WriteDataToArchiveLZ4()
```
+ required = LZ4F_compressBound(chunk, &state->prefs);
+ if (required > state->buflen - state->bufdata)
+ {
+ cs->writeF(AH, state->buffer, state->bufdata);
+ state->bufdata = 0;
+ }
```
And in EndCompressorLZ4()
```
+ required = LZ4F_compressBound(0, &state->prefs);
+ if (required > state->buflen - state->bufdata)
+ {
+ cs->writeF(AH, state->buffer, state->bufdata);
+ state->bufdata = 0;
+ }
```
These two code pieces are similar, only difference is the first parameter passed to LZ4F_compressBound().
I think we can create an inline function for it. But these are just 5 lines, I don’t have a strong option on that.
Same thing for the other code pieces in LZ4Stream_write() and LZ4Stream_close().
2 - 0003
```
/*
* LZ4 equivalent to feof() or gzeof(). Return true iff there is no
- * decompressed output in the overflow buffer and the end of the backing file
```
This doesn’t belong to the current patch. But “iff” seems a typo of “if”. You may fix it as you are touching this piece of code.
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
HighGo Software Co., Ltd.
https://www.highgo.com/
pgsql-hackers by date: