Re: Add LZ4 compression in pg_dump - Mailing list pgsql-hackers
From | gkokolatos@pm.me |
---|---|
Subject | Re: Add LZ4 compression in pg_dump |
Date | |
Msg-id | srwGkjEwoQg3cdO9k1mMuIu9eEqkOA27Ikvf0cNR9mvBLVZPidKMk4fmKeVIsxVkgbVk2yeIL-Neyjb2Mkz9_ULGfk2qeYW8ZnM1iKIyklY=@pm.me Whole thread Raw |
In response to | Re: Add LZ4 compression in pg_dump (Justin Pryzby <pryzby@telsasoft.com>) |
Responses |
Re: Add LZ4 compression in pg_dump
(Justin Pryzby <pryzby@telsasoft.com>)
|
List | pgsql-hackers |
------- Original Message ------- On Monday, December 19th, 2022 at 6:27 PM, Justin Pryzby <pryzby@telsasoft.com> wrote: > > > On Mon, Dec 19, 2022 at 05:03:21PM +0000, gkokolatos@pm.me wrote: > > > > > 001 still doesn't compile on freebsd, and 002 doesn't compile on > > > > windows. Have you checked test results from cirrusci on your private > > > > github account ? > > > > There are still known gaps in 0002 and 0003, for example documentation, > > and I have not been focusing too much on those. You are right, it is helpful > > and kind to try to reduce the noise. The attached should have hopefully > > tackled the ci errors. > > > Yep. Are you using cirrusci under your github account ? Thank you. To be very honest, I am not using github exclusively to post patches. Sometimes I do, sometimes I do not. Is github a requirement? To answer your question, some of my github accounts are integrated with cirrusci, others are not. The current cfbot build is green for what is worth. https://cirrus-ci.com/build/5934319840002048 > > > > FYI, I have re-added an entry to the CF app to get some automated > > > coverage: > > > https://commitfest.postgresql.org/41/3571/ > > > > Much obliged. Should I change the state to "ready for review" when post a > > new version or should I leave that to the senior personnel? > > > It's better to update it to reflect what you think its current status > is. If you think it's ready for review. Thank you. > > > > > 002 breaks "pg_dump -Fc -Z2" because (I think) AllocateCompressor() > > > > doesn't store the passed-in compression_spec. > > > > I am afraid I have not been able to reproduce this error. I tried both > > debian and freebsd after I addressed the compilation warnings. Which > > error did you get? Is it still present in the attached? > > > It's not that there's an error - it's that compression isn't working. > > $ ./tmp_install/usr/local/pgsql/bin/pg_dump -h /tmp -Z1 -Fp regression |wc -c > 659956 > $ ./tmp_install/usr/local/pgsql/bin/pg_dump -h /tmp -Z2 -Fp regression |wc -c > 637192 > > $ ./tmp_install/usr/local/pgsql/bin/pg_dump -h /tmp -Z1 -Fc regression |wc -c > 1954890 > $ ./tmp_install/usr/local/pgsql/bin/pg_dump -h /tmp -Z2 -Fc regression |wc -c > 1954890 > Thank you. Now I understand what you mean. Trying the same on top of v18-0003 on Ubuntu 22.04 yields: $ for compression in none gzip:1 gzip:6 gzip:9; do \ pg_dump --format=custom --compress="$compression" -f regression."$compression".dump -d regression; \ wc -c regression."$compression".dump; \ done; 14963753 regression.none.dump 3600183 regression.gzip:1.dump 3223755 regression.gzip:6.dump 3196903 regression.gzip:9.dump and on FreeBSD 13.1 $ for compression in none gzip:1 gzip:6 gzip:9; do \ pg_dump --format=custom --compress="$compression" -f regression."$compression".dump -d regression; \ wc -c regression."$compression".dump; \ done; 14828822 regression.none.dump 3584304 regression.gzip:1.dump 3208548 regression.gzip:6.dump 3182044 regression.gzip:9.dump Although there are some variations between the installations, within the same installation the size of the dump file is shrinking as expected. Investigating a bit further on the issue, you are correct in identifying an issue in v17. Up until v16, the compressor function looked like: +InitCompressorGzip(CompressorState *cs, int compressionLevel) +{ + GzipCompressorState *gzipcs; + + cs->readData = ReadDataFromArchiveGzip; + cs->writeData = WriteDataToArchiveGzip; + cs->end = EndCompressorGzip; + + gzipcs = (GzipCompressorState *) pg_malloc0(sizeof(GzipCompressorState)); + gzipcs->compressionLevel = compressionLevel; V17 considered that more options could become available in the future and changed the signature of the relevant Init functions to: +InitCompressorGzip(CompressorState *cs, const pg_compress_specification compression_spec) +{ + GzipCompressorState *gzipcs; + + cs->readData = ReadDataFromArchiveGzip; + cs->writeData = WriteDataToArchiveGzip; + cs->end = EndCompressorGzip; + + gzipcs = (GzipCompressorState *) pg_malloc0(sizeof(GzipCompressorState)); + V18 reinstated the assignment in similar fashion to InitCompressorNone and InitCompressorLz4: +void +InitCompressorGzip(CompressorState *cs, const pg_compress_specification compression_spec) +{ + GzipCompressorState *gzipcs; + + cs->readData = ReadDataFromArchiveGzip; + cs->writeData = WriteDataToArchiveGzip; + cs->end = EndCompressorGzip; + + cs->compression_spec = compression_spec; + + gzipcs = (GzipCompressorState *) pg_malloc0(sizeof(GzipCompressorState)); A test case can be added which performs a check similar to the loop above. Create a custom dump with the least and most compression for each method. Then verify that the output sizes differ as expected. This addition could become 0001 in the current series. Thoughts? Cheers, //Georgios > -- > Justin
pgsql-hackers by date: