Re: Teach pg_receivewal to use lz4 compression - Mailing list pgsql-hackers
From | Michael Paquier |
---|---|
Subject | Re: Teach pg_receivewal to use lz4 compression |
Date | |
Msg-id | YOe5LouYuxXOrL0H@paquier.xyz Whole thread Raw |
In response to | Re: Teach pg_receivewal to use lz4 compression (gkokolatos@pm.me) |
Responses |
Re: Teach pg_receivewal to use lz4 compression
Re: Teach pg_receivewal to use lz4 compression |
List | pgsql-hackers |
On Thu, Jul 08, 2021 at 02:18:40PM +0000, gkokolatos@pm.me wrote: > please find v2 of the patch which tries to address the commends > received so far. Thanks! > Michael Paquier wrote: >> + system_or_bail('lz4', '-t', $lz4_wals[0]); >> I think that you should just drop this part of the test. The only >> part of LZ4 that we require to be present when Postgres is built with >> --with-lz4 is its library liblz4. Commands associated to it may not >> be around, causing this test to fail. The test checking that one .lz4 >> file has been created is good to have. It may be worth adding a test >> with a .lz4.partial segment generated and --endpos pointing to a LSN >> that does not finish the segment that gets switched. > > I humbly disagree with the need for the test. It is rather easily possible > to generate a file that can not be decoded, thus becoming useless. Having the > test will provide some guarantee for the fact. In the current patch, there > is code to find out if the program lz4 is available in the system. If it is > not available, then that specific test is skipped. The rest remains as it > were. Also `system_or_bail` is not used anymore in favour of the `system_log` > so that the test counted and the execution of tests continues upon failure. Check. I can see what you are using in the new patch. We could live with that. >> It seems to me that you are missing some logic in >> FindStreamingStart() to handle LZ4-compressed segments, in relation >> with IsCompressXLogFileName() and IsPartialCompressXLogFileName(). > > Very correct. The logic is now added. Given the lz4 api, one has to fill > in the uncompressed size during creation time. Then one can read the > headers and verify the expectations. Yeah, I read that as well with lz4 --list and the kind. That's weird compared to how ZLIB gives an easy access to it. We may want to do an effort and tell about more lz4 --content-size/--list, telling that we add the size in the segment compressed because we have to and LZ4 does not do it by default? >> Should we have more tests for ZLIB, while on it? That seems like a >> good addition as long as we can skip the tests conditionally when >> that's not supported. > > Agreed. Please allow me to provide a distinct patch for this code. Thanks. Looking forward to seeing it. That may be better on a separate thread, even if I digressed in this thread :) >> I think we can somehow use "acceleration" parameter of lz4 compression >> to map on compression level, It is not direct mapping but >> can't we create some internal mapping instead of completely ignoring >> this option for lz4, or we can provide another option for lz4? > > We can, though I am not in favour of doing so. There is seemingly > little benefit for added complexity. Agreed. >> What I think is important for the user when it comes to this >> option is the consistency of its naming across all the tools >> supporting it. pg_dump and pg_basebackup, where we could plug LZ4, >> already use most of the short options you could use for pg_receivewal, >> having only a long one gives a bit more flexibility. > > Done. * http://www.zlib.org/rfc-gzip.html. + * For lz4 compressed segments */ This comment is incomplete. +#define IsLZ4CompressXLogFileName(fname) \ + (strlen(fname) == XLOG_FNAME_LEN + strlen(".lz4") && \ + strspn(fname, "0123456789ABCDEF") == XLOG_FNAME_LEN && \ + strcmp((fname) + XLOG_FNAME_LEN, ".lz4") == 0) +#define IsLZ4PartialCompressXLogFileName(fname) \ + (strlen(fname) == XLOG_FNAME_LEN + strlen(".lz4.partial") && \ + strspn(fname, "0123456789ABCDEF") == XLOG_FNAME_LEN && \ + strcmp((fname) + XLOG_FNAME_LEN, ".lz4.partial") == 0) This is getting complicated. Would it be better to change this stuff and switch to a routine that checks if a segment has a valid name, is partial, and the type of compression that applied to it? It seems to me that we should group iszlibcompress and islz4compress together with the options available through compression_method. + if (compresslevel != 0) + { + if (compression_method == COMPRESSION_NONE) + { + compression_method = COMPRESSION_ZLIB; + } + if (compression_method != COMPRESSION_ZLIB) + { + pg_log_error("cannot use --compress when " + "--compression-method is not gzip"); + fprintf(stderr, _("Try \"%s --help\" for more information.\n"), + progname); + exit(1); + } + } For compatibility where --compress enforces the use of zlib that would work, but this needs a comment explaining the goal of this block. I am wondering if it would be better to break the will and just complain when specifying --compress without --compression-method though. That would cause compatibility issues, but this would make folks aware of the presence of LZ4, which does not sound bad to me either as ZLIB is slower than LZ4 on all fronts. + else if (compression_method == COMPRESSION_ZLIB) + { + pg_log_error("cannot use --compression-method gzip when " + "--compression is 0"); + fprintf(stderr, _("Try \"%s --help\" for more information.\n"), + progname); + exit(1); + } Hmm. It would be more natural to enforce a default compression level in this case? The user is asking for a compression with zlib here. + my $lz4 = $ENV{LZ4}; [...] + # Verify that the stored file is readable if program lz4 is available + skip "program lz4 is not found in your system", 1 + if (!defined $lz4 || $lz4 eq ''); Okay, this is acceptable. Didn't know the existing trick with TAR either. + /* + * XXX: this is crap... lz4preferences now does show the uncompressed + * size via lz4 --list <filename> but the compression goes down the + * window... also it is not very helpfull to have it at the startm, does + * it? + */ What do you mean here by "the compression goes out the window"? -- Michael
Attachment
pgsql-hackers by date: