Thread: zstd compression for pg_dump
This is a draft patch - review is welcome and would help to get this ready to be considered for v16, if desired. I'm going to add this thread to the old CF entry. https://commitfest.postgresql.org/31/2888/ -- Justin
Attachment
On Fri, Feb 24, 2023 at 01:18:40PM -0600, Justin Pryzby wrote: > This is a draft patch - review is welcome and would help to get this > ready to be considered for v16, if desired. > > I'm going to add this thread to the old CF entry. > https://commitfest.postgresql.org/31/2888/ Patch 0003 adds support for the --long option of zstd, meaning that it "enables long distance matching with #windowLog". What's the benefit of that when it is applied to dumps and base backup contents? -- Michael
Attachment
On Sat, Feb 25, 2023 at 01:44:36PM +0900, Michael Paquier wrote: > On Fri, Feb 24, 2023 at 01:18:40PM -0600, Justin Pryzby wrote: > > This is a draft patch - review is welcome and would help to get this > > ready to be considered for v16, if desired. > > > > I'm going to add this thread to the old CF entry. > > https://commitfest.postgresql.org/31/2888/ > > Patch 0003 adds support for the --long option of zstd, meaning that it > "enables long distance matching with #windowLog". What's the benefit > of that when it is applied to dumps and base backup contents? It can make it smaller. + The <literal>long</literal> keyword enables long-distance matching + mode, for improved compression ratio, at the expense of higher memory + use. Long-distance mode is supported only for + With zstd compression, <literal>long</literal> mode may allow dumps + to be significantly smaller, but it might not reduce the size of + custom or directory format dumps, whose fields are separately compressed. Note that I included that here as 003, but I also have a pre-existing patch for adding that just to basebackup. -- Justin
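For reference, "long mode" boils down to two context parameters in libzstd's public advanced API. A minimal sketch (not code from the patch; error handling omitted):

#include <zstd.h>

/*
 * Minimal sketch of enabling zstd long-distance matching on a
 * compression context - the API-level equivalent of "zstd --long".
 * Both parameters are part of the public libzstd advanced API.
 */
static ZSTD_CStream *
create_long_mode_cstream(int level, int windowLog)
{
    ZSTD_CStream *cstream = ZSTD_createCStream();

    ZSTD_CCtx_setParameter(cstream, ZSTD_c_compressionLevel, level);
    ZSTD_CCtx_setParameter(cstream, ZSTD_c_enableLongDistanceMatching, 1);
    if (windowLog != 0)         /* 0 keeps zstd's default window */
        ZSTD_CCtx_setParameter(cstream, ZSTD_c_windowLog, windowLog);

    return cstream;
}

(ZSTD_CStream is a typedef of ZSTD_CCtx, so the ZSTD_CCtx_* setters apply to it directly.)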
On 2/24/23 20:18, Justin Pryzby wrote: > This is a draft patch - review is welcome and would help to get this > ready to be considered for v16, if desired. > > I'm going to add this thread to the old CF entry. > https://commitfest.postgresql.org/31/2888/ > Thanks. Sadly cfbot is unhappy - the windows and cplusplus builds failed because of some issue in pg_backup_archiver.h. But it's a bit bizarre because the patch does not modify that file at all ... regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Sat, Feb 25, 2023, at 7:31 AM, Tomas Vondra wrote:
> On 2/24/23 20:18, Justin Pryzby wrote:
>> This is a draft patch - review is welcome and would help to get this
>> ready to be considered for v16, if desired.
>>
>> I'm going to add this thread to the old CF entry.
>
> Thanks. Sadly cfbot is unhappy - the windows and cplusplus builds failed
> because of some issue in pg_backup_archiver.h. But it's a bit bizarre
> because the patch does not modify that file at all ...
cpluspluscheck says
# pg_dump is not C++-clean because it uses "public" and "namespace"
# as field names, which is unfortunate but we won't change it now.
Hence, the patch should exclude the new header file from it.
--- a/src/tools/pginclude/cpluspluscheck
+++ b/src/tools/pginclude/cpluspluscheck
@@ -153,6 +153,7 @@ do
test "$f" = src/bin/pg_dump/compress_gzip.h && continue
test "$f" = src/bin/pg_dump/compress_io.h && continue
test "$f" = src/bin/pg_dump/compress_lz4.h && continue
+ test "$f" = src/bin/pg_dump/compress_zstd.h && continue
test "$f" = src/bin/pg_dump/compress_none.h && continue
test "$f" = src/bin/pg_dump/parallel.h && continue
test "$f" = src/bin/pg_dump/pg_backup_archiver.h && continue
On Fri, Feb 24, 2023 at 01:18:40PM -0600, Justin Pryzby wrote: > This is a draft patch - review is welcome and would help to get this > ready to be considered for v16, if desired. > > I'm going to add this thread to the old CF entry. > https://commitfest.postgresql.org/31/2888/ This resolves cfbot warnings: windows and cppcheck. And refactors zstd routines. And updates docs. And includes some fixes for earlier patches that these patches conflict with/depend on.
Attachment
On Sat, Feb 25, 2023 at 5:22 PM Justin Pryzby <pryzby@telsasoft.com> wrote: > This resolves cfbot warnings: windows and cppcheck. > And refactors zstd routines. > And updates docs. > And includes some fixes for earlier patches that these patches conflict > with/depend on. This'll need a rebase (cfbot took a while to catch up). The patchset includes basebackup modifications, which are part of a different CF entry; was that intended? I tried this on a local, 3.5GB, mostly-text table (from the UK Price Paid dataset [1]) and the comparison against the other methods was impressive. (I'm no good at constructing compression benchmarks, so this is a super naive setup. Client's on the same laptop as the server.) $ time ./src/bin/pg_dump/pg_dump -d postgres -t pp_complete -Z zstd > /tmp/zstd.dump real 1m17.632s user 0m35.521s sys 0m2.683s $ time ./src/bin/pg_dump/pg_dump -d postgres -t pp_complete -Z lz4 > /tmp/lz4.dump real 1m13.125s user 0m19.795s sys 0m3.370s $ time ./src/bin/pg_dump/pg_dump -d postgres -t pp_complete -Z gzip > /tmp/gzip.dump real 2m24.523s user 2m22.114s sys 0m1.848s $ ls -l /tmp/*.dump -rw-rw-r-- 1 jacob jacob 1331493925 Mar 3 09:45 /tmp/gzip.dump -rw-rw-r-- 1 jacob jacob 2125998939 Mar 3 09:42 /tmp/lz4.dump -rw-rw-r-- 1 jacob jacob 1215834718 Mar 3 09:40 /tmp/zstd.dump Default gzip was the only method that bottlenecked on pg_dump rather than the server, and default zstd outcompressed it at a fraction of the CPU time. So, naively, this looks really good. With this particular dataset, I don't see much improvement with zstd:long. (At nearly double the CPU time, I get a <1% improvement in compression size.) I assume it's heavily data dependent, but from the notes on --long [2] it seems like they expect you to play around with the window size to further tailor it to your data. Does it make sense to provide the long option without the windowLog parameter? Thanks, --Jacob [1] https://landregistry.data.gov.uk/ [2] https://github.com/facebook/zstd/releases/tag/v1.3.2
On Fri, Mar 03, 2023 at 10:32:53AM -0800, Jacob Champion wrote: > On Sat, Feb 25, 2023 at 5:22 PM Justin Pryzby <pryzby@telsasoft.com> wrote: > > This resolves cfbot warnings: windows and cppcheck. > > And refactors zstd routines. > > And updates docs. > > And includes some fixes for earlier patches that these patches conflict > > with/depend on. > > This'll need a rebase (cfbot took a while to catch up). Soon. > The patchset includes basebackup modifications, which are part of a > different CF entry; was that intended? Yes, it's intentional - if zstd:long mode were to be merged first, then this patch should include long mode from the start. Or, if pgdump+zstd were merged first, then long mode could be added to both places. > I tried this on a local, 3.5GB, mostly-text table (from the UK Price Thanks for looking. If your zstd library is compiled with thread support, could you also try with :workers=N? I believe this is working correctly, but I'm going to ask for help verifying that... It'd be especially useful to test under windows, where pgdump/restore use threads instead of forking... If you have a windows environment but not set up for development, I think it's possible to get cirrusci to compile a patch for you and then retrieve the binaries provided as an "artifact" (credit/blame for this idea should be directed to Thomas Munro). > With this particular dataset, I don't see much improvement with > zstd:long. Yeah. I think this could be because either 1) you already got very good compression without looking at more data; and/or 2) the neighboring data is already very similar, maybe equally or more similar than the further data, from which there's nothing to gain. > (At nearly double the CPU time, I get a <1% improvement in > compression size.) I assume it's heavily data dependent, but from the > notes on --long [2] it seems like they expect you to play around with > the window size to further tailor it to your data. Does it make sense > to provide the long option without the windowLog parameter? I don't want to start exposing lots of fine-grained parameters at this point. In the immediate case, it looks like it may require more than just adding another parameter: Note: If windowLog is set to larger than 27, --long=windowLog or --memory=windowSize needs to be passed to the decompressor. -- Justin
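For completeness, that note's decompression-side counterpart in the library API: a decoder refuses frames whose window exceeds ZSTD_WINDOWLOG_LIMIT_DEFAULT (27) until its limit is raised. A hedged sketch against the public libzstd API (not the patch's code):

#include <zstd.h>

/*
 * Sketch: accepting frames produced with windowLog > 27 requires
 * raising the decoder's window limit - which is what the zstd CLI's
 * --long=windowLog / --memory=windowSize options do internally.
 */
static ZSTD_DStream *
create_long_mode_dstream(int windowLogMax)
{
    ZSTD_DStream *dstream = ZSTD_createDStream();

    ZSTD_DCtx_setParameter(dstream, ZSTD_d_windowLogMax, windowLogMax);

    return dstream;
}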
On Fri, Mar 3, 2023 at 10:55 AM Justin Pryzby <pryzby@telsasoft.com> wrote: > Thanks for looking. If your zstd library is compiled with thread > support, could you also try with :workers=N? I believe this is working > correctly, but I'm going to ask for help verifying that... Unfortunately not (Ubuntu 20.04): pg_dump: error: could not set compression parameter: Unsupported parameter But that lets me review the error! I think these error messages should say which options caused them. > It'd be especially useful to test under windows, where pgdump/restore > use threads instead of forking... If you have a windows environment but > not set up for development, I think it's possible to get cirrusci to > compile a patch for you and then retrieve the binaries provided as an > "artifact" (credit/blame for this idea should be directed to Thomas > Munro). I should be able to do that next week. > > With this particular dataset, I don't see much improvement with > > zstd:long. > > Yeah. I think this could be because either 1) you already got very good > compression without looking at more data; and/or 2) the neighboring data > is already very similar, maybe equally or more similar than the further > data, from which there's nothing to gain. What kinds of improvements do you see with your setup? I'm wondering when we would suggest that people use it. > I don't want to start exposing lots of fine-grained parameters at this > point. In the immediate case, it looks like it may require more than > just adding another parameter: > > Note: If windowLog is set to larger than 27, > --long=windowLog or --memory=windowSize needs to be passed to the > decompressor. Hm. That would complicate things. Thanks, --Jacob
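Concretely, Jacob's suggestion amounts to checking each parameter-setting call and naming the failed option. A sketch of the shape such a check could take (wrapper name and message wording are illustrative, not the patch's):

#include <zstd.h>

/*
 * Sketch: report which option failed when a zstd parameter can't be
 * set (e.g. "workers" against a libzstd built without threading).
 * The wrapper name and message wording here are illustrative only;
 * pg_fatal() is the frontend's usual fatal-error helper.
 */
static void
set_zstd_param_or_die(ZSTD_CStream *cstream,
                      ZSTD_cParameter param, int value, const char *name)
{
    size_t      res = ZSTD_CCtx_setParameter(cstream, param, value);

    if (ZSTD_isError(res))
        pg_fatal("could not set compression parameter \"%s\": %s",
                 name, ZSTD_getErrorName(res));
}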
On Fri, Mar 03, 2023 at 01:38:05PM -0800, Jacob Champion wrote: > > > With this particular dataset, I don't see much improvement with > > > zstd:long. > > > > Yeah. I think this could be because either 1) you already got very good > > compression without looking at more data; and/or 2) the neighboring data > > is already very similar, maybe equally or more similar than the further > > data, from which there's nothing to gain. > > What kinds of improvements do you see with your setup? I'm wondering > when we would suggest that people use it. On customer data, I see small improvements - below 10%. But on my first two tries, I made synthetic data sets where it's a lot: $ ./src/bin/pg_dump/pg_dump -d pryzbyj -Fp -Z zstd:long |wc -c 286107 $ ./src/bin/pg_dump/pg_dump -d pryzbyj -Fp -Z zstd:long=0 |wc -c 1709695 That's just 6 identical tables like: pryzbyj=# CREATE TABLE t1 AS SELECT generate_series(1,999999); In this case, "custom" format doesn't see that benefit, because the greatest similarity is across tables, which don't share compressor state. But I think the note that I wrote in the docs about that should be removed - custom format could see a big benefit, as long as the table is big enough, and there's more similarity/repetition at longer distances. Here's one where custom format *does* benefit, due to long-distance repetition within a single table. The data is contrived, but the schema of ID => data is not. What's notable isn't how compressible the data is, but how much *more* compressible it is with long-distance matching. pryzbyj=# CREATE TABLE t1 AS SELECT i,array_agg(j) FROM generate_series(1,444)i,generate_series(1,99999)j GROUP BY 1; $ ./src/bin/pg_dump/pg_dump -d pryzbyj -Fc -Z zstd:long=1 |wc -c 82023 $ ./src/bin/pg_dump/pg_dump -d pryzbyj -Fc -Z zstd:long=0 |wc -c 1048267 -- Justin
On Sat, Feb 25, 2023 at 07:22:27PM -0600, Justin Pryzby wrote: > On Fri, Feb 24, 2023 at 01:18:40PM -0600, Justin Pryzby wrote: > > This is a draft patch - review is welcome and would help to get this > > ready to be considered for v16, if desired. > > > > I'm going to add this thread to the old CF entry. > > https://commitfest.postgresql.org/31/2888/ > > This resolves cfbot warnings: windows and cppcheck. > And refactors zstd routines. > And updates docs. > And includes some fixes for earlier patches that these patches conflict > with/depend on. This rebases over the TAP and doc fixes to LZ4. And adds necessary ENV to makefile and meson. And adds an annoying boilerplate header. And removes supports_compression(), which is what I think Tomas meant when referring to "annoying unsupported cases". And updates zstd.c: fix an off-by-one, allocate in init depending on readF/writeF, do not reset the input buffer on each iteration, and show parameter name in errors. I'd appreciate help checking that this is doing the right things and works correctly with zstd threaded workers. The zstd API says: "use one different context per thread for parallel execution" and "For parallel execution, use one separate ZSTD_CStream per thread". https://github.com/facebook/zstd/blob/dev/lib/zstd.h I understand that to mean that, if pg_dump *itself* were using threads, then each thread would need to call ZSTD_createCStream(). pg_dump isn't threaded, so there's nothing special needed, right? Except that, under windows, pg_dump -Fd -j actually uses threads instead of forking. I *think* that's still safe, since the pgdump threads are created *before* calling zstd functions (see _PrintTocData and _StartData of the custom and directory formats), so it happens naturally that there's a separate zstd stream for each thread of pgdump. -- Justin
Attachment
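To restate the invariant being relied on here: a ZSTD_CStream must be created inside the worker (thread or process) that uses it, and never shared. A minimal sketch of that pattern (both helper names are hypothetical, not the patch's):

#include <zstd.h>

/*
 * Sketch of the "one separate ZSTD_CStream per thread" rule quoted
 * above: the stream is created after the worker starts, so it cannot
 * be shared with the parent or with sibling workers.
 */
static void
worker_compress_one_file(const char *path)
{
    /* created inside the worker => private to this thread/process */
    ZSTD_CStream *cstream = ZSTD_createCStream();

    compress_file_with(cstream, path);  /* hypothetical helper */
    ZSTD_freeCStream(cstream);
}

This is the property Justin describes: on every platform the per-worker streams are created after the workers exist, whether the workers are processes (fork) or threads (Windows).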
On Sat, Mar 4, 2023 at 8:57 AM Justin Pryzby <pryzby@telsasoft.com> wrote: > pryzbyj=# CREATE TABLE t1 AS SELECT i,array_agg(j) FROM generate_series(1,444)i,generate_series(1,99999)j GROUP BY 1; > $ ./src/bin/pg_dump/pg_dump -d pryzbyj -Fc -Z zstd:long=1 |wc -c > 82023 > $ ./src/bin/pg_dump/pg_dump -d pryzbyj -Fc -Z zstd:long=0 |wc -c > 1048267 Nice! I did some smoke testing against zstd's GitHub release on Windows. To build against it, I had to construct an import library, and put that and the DLL into the `lib` folder expected by the MSVC scripts... which makes me wonder if I've chosen a harder way than necessary? Parallel zstd dumps seem to work as expected, in that the resulting pg_restore output is identical to uncompressed dumps and nothing explodes. I haven't inspected the threading implementation for safety yet, as you mentioned. And I still wasn't able to test :workers, since it looks like the official libzstd for Windows isn't built for multithreading. That'll be another day's project. --Jacob
Hi, This'll need another rebase over the meson ICU changes. On Wed, Mar 8, 2023 at 10:59 AM Jacob Champion <jchampion@timescale.com> wrote: > I did some smoke testing against zstd's GitHub release on Windows. To > build against it, I had to construct an import library, and put that > and the DLL into the `lib` folder expected by the MSVC scripts... > which makes me wonder if I've chosen a harder way than necessary? A meson wrap made this much easier! It looks like pg_dump's meson.build is missing dependencies on zstd (meson couldn't find the headers in the subproject without them). > Parallel zstd dumps seem to work as expected, in that the resulting > pg_restore output is identical to uncompressed dumps and nothing > explodes. I haven't inspected the threading implementation for safety > yet, as you mentioned. Hm. Best I can tell, the CloneArchive() machinery is supposed to be handling safety for this, by isolating each thread's state. I don't feel comfortable pronouncing this new addition safe or not, because I'm not sure I understand what the comments in the format-specific _Clone() callbacks are saying yet. > And I still wasn't able to test :workers, since > it looks like the official libzstd for Windows isn't built for > multithreading. That'll be another day's project. The wrapped installation enabled threading too, so I was able to try with :workers=8. Everything seems to work, but I didn't have a dataset that showed speed improvements at the time. It did seem to affect the overall compressibility negatively -- which makes sense, I think, assuming each thread is looking at a separate and/or smaller window. On to code (not a complete review): > if (hasSuffix(fname, ".gz")) > compression_spec.algorithm = PG_COMPRESSION_GZIP; > else > { > bool exists; > > exists = (stat(path, &st) == 0); > /* avoid unused warning if it is not built with compression */ > if (exists) > compression_spec.algorithm = PG_COMPRESSION_NONE; > -#ifdef HAVE_LIBZ > - if (!exists) > - { > - free_keep_errno(fname); > - fname = psprintf("%s.gz", path); > - exists = (stat(fname, &st) == 0); > - > - if (exists) > - compression_spec.algorithm = PG_COMPRESSION_GZIP; > - } > -#endif > -#ifdef USE_LZ4 > - if (!exists) > - { > - free_keep_errno(fname); > - fname = psprintf("%s.lz4", path); > - exists = (stat(fname, &st) == 0); > - > - if (exists) > - compression_spec.algorithm = PG_COMPRESSION_LZ4; > - } > -#endif > + else if (check_compressed_file(path, &fname, "gz")) > + compression_spec.algorithm = PG_COMPRESSION_GZIP; > + else if (check_compressed_file(path, &fname, "lz4")) > + compression_spec.algorithm = PG_COMPRESSION_LZ4; > + else if (check_compressed_file(path, &fname, "zst")) > + compression_spec.algorithm = PG_COMPRESSION_ZSTD; > } This function lost some coherence, I think. Should there be a hasSuffix check at the top for ".zstd" (and, for that matter, ".lz4")? And the comment references an unused warning, which is only possible with the #ifdef blocks that were removed. I'm a little suspicious of the replacement of supports_compression() with parse_compress_specification(). 
For example: > - errmsg = supports_compression(AH->compression_spec); > - if (errmsg) > + parse_compress_specification(AH->compression_spec.algorithm, > + NULL, &compress_spec); > + if (compress_spec.parse_error != NULL) > { > pg_log_warning("archive is compressed, but this installation does not support compression (%s > - errmsg); > - pg_free(errmsg); > + compress_spec.parse_error); > + pg_free(compress_spec.parse_error); > } The top-level error here is "does not support compression", but wouldn't a bad specification option with a supported compression method trip this path too? > +static void > +ZSTD_CCtx_setParam_or_die(ZSTD_CStream *cstream, > + ZSTD_cParameter param, int value, char *paramname) IMO we should avoid stepping on the ZSTD_ namespace with our own internal function names. > + if (cs->readF != NULL) > + { > + zstdcs->dstream = ZSTD_createDStream(); > + if (zstdcs->dstream == NULL) > + pg_fatal("could not initialize compression library"); > + > + zstdcs->input.size = ZSTD_DStreamInSize(); > + zstdcs->input.src = pg_malloc(zstdcs->input.size); > + > + zstdcs->output.size = ZSTD_DStreamOutSize(); > + zstdcs->output.dst = pg_malloc(zstdcs->output.size + 1); > + } > + > + if (cs->writeF != NULL) > + { > + zstdcs->cstream = ZstdCStreamParams(cs->compression_spec); > + > + zstdcs->output.size = ZSTD_CStreamOutSize(); > + zstdcs->output.dst = pg_malloc(zstdcs->output.size); > + zstdcs->output.pos = 0; > + } This seems to suggest that both cs->readF and cs->writeF could be set, but in that case, the output buffer gets reallocated. I was curious about the extra byte allocated in the decompression case. I see that ReadDataFromArchiveZstd() is null-terminating the buffer before handing it to ahwrite(), but why does it need to do that? > +static const char * > +Zstd_get_error(CompressFileHandle *CFH) > +{ > + return strerror(errno); > +} Seems like this should be using the zstderror stored in the handle? In ReadDataFromArchiveZstd(): > + size_t input_allocated_size = ZSTD_DStreamInSize(); > + size_t res; > + > + for (;;) > + { > + size_t cnt; > + > + /* > + * Read compressed data. Note that readF can resize the buffer; the > + * new size is tracked and used for future loops. > + */ > + input->size = input_allocated_size; > + cnt = cs->readF(AH, (char **) unconstify(void **, &input->src), &input->size); > + input_allocated_size = input->size; > + input->size = cnt; This is pretty complex for what it's doing. I'm a little worried that we let the reallocated buffer escape to the caller while losing track of how big it is. I think that works today, since there's only ever one call per handle, but any future refactoring that allowed cs->readData() to be called more than once would subtly break this code. In ZstdWriteCommon(): > + /* > + * Extra paranoia: avoid zero-length chunks, since a zero length chunk > + * is the EOF marker in the custom format. This should never happen > + * but... > + */ > + if (output->pos > 0) > + cs->writeF(AH, output->dst, output->pos); > + > + output->pos = 0; Elsewhere, output->pos is set to zero before compressing, but here we do it after, which I think leads to subtle differences in the function preconditions. If that's an intentional difference, can the reason be called out in a comment? --Jacob
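For orientation while reading those snippets, the canonical libzstd streaming-decompression loop that ReadDataFromArchiveZstd() appears to be a variation of looks roughly like this (a simplified sketch with hypothetical callbacks standing in for readF/ahwrite; the buffer resizing and null-terminator details under discussion are omitted):

#include <zstd.h>

/*
 * Simplified sketch of libzstd streaming decompression.  read_func and
 * write_func are hypothetical stand-ins for the archive callbacks.
 */
static void
decompress_stream(ZSTD_DStream *dstream,
                  size_t (*read_func) (void *buf, size_t len),
                  void (*write_func) (const void *buf, size_t len))
{
    size_t      in_alloc = ZSTD_DStreamInSize();
    size_t      out_alloc = ZSTD_DStreamOutSize();
    void       *inbuf = pg_malloc(in_alloc);
    void       *outbuf = pg_malloc(out_alloc);

    for (;;)
    {
        size_t      cnt = read_func(inbuf, in_alloc);
        ZSTD_inBuffer input = {inbuf, cnt, 0};

        if (cnt == 0)
            break;              /* EOF */

        /* one chunk of input may yield several chunks of output */
        while (input.pos < input.size)
        {
            ZSTD_outBuffer output = {outbuf, out_alloc, 0};
            size_t      res = ZSTD_decompressStream(dstream, &output, &input);

            if (ZSTD_isError(res))
                pg_fatal("could not decompress data: %s",
                         ZSTD_getErrorName(res));
            write_func(output.dst, output.pos);
        }
    }

    pg_free(inbuf);
    pg_free(outbuf);
}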
On Fri, Mar 10, 2023 at 12:48:13PM -0800, Jacob Champion wrote: > On Wed, Mar 8, 2023 at 10:59 AM Jacob Champion <jchampion@timescale.com> wrote: > > I did some smoke testing against zstd's GitHub release on Windows. To > > build against it, I had to construct an import library, and put that > > and the DLL into the `lib` folder expected by the MSVC scripts... > > which makes me wonder if I've chosen a harder way than necessary? > > It looks like pg_dump's meson.build is missing dependencies on zstd > (meson couldn't find the headers in the subproject without them). I saw that this was added for LZ4, but I hadn't added it for zstd since I didn't run into an issue without it. Could you check that what I've added works for your case? > > Parallel zstd dumps seem to work as expected, in that the resulting > pg_restore output is identical to uncompressed dumps and nothing > explodes. I haven't inspected the threading implementation for safety > yet, as you mentioned. > > Hm. Best I can tell, the CloneArchive() machinery is supposed to be > handling safety for this, by isolating each thread's state. I don't feel > comfortable pronouncing this new addition safe or not, because I'm not > sure I understand what the comments in the format-specific _Clone() > callbacks are saying yet. My line of reasoning for unix is that pg_dump forks before any calls to zstd. Nothing zstd does ought to affect the pg_dump layer. But that doesn't apply to pg_dump under windows. This is an open question. If there's no solid answer, I could disable/ignore the option (maybe only under windows). > On to code (not a complete review): > > > if (hasSuffix(fname, ".gz")) > > compression_spec.algorithm = PG_COMPRESSION_GZIP; > > else > > { > > bool exists; > > > > exists = (stat(path, &st) == 0); > > /* avoid unused warning if it is not built with compression */ > > if (exists) > > compression_spec.algorithm = PG_COMPRESSION_NONE; > > -#ifdef HAVE_LIBZ > > - if (!exists) > > - { > > - free_keep_errno(fname); > > - fname = psprintf("%s.gz", path); > > - exists = (stat(fname, &st) == 0); > > - > > - if (exists) > > - compression_spec.algorithm = PG_COMPRESSION_GZIP; > > - } > > -#endif > > -#ifdef USE_LZ4 > > - if (!exists) > > - { > > - free_keep_errno(fname); > > - fname = psprintf("%s.lz4", path); > > - exists = (stat(fname, &st) == 0); > > - > > - if (exists) > > - compression_spec.algorithm = PG_COMPRESSION_LZ4; > > - } > > -#endif > > + else if (check_compressed_file(path, &fname, "gz")) > > + compression_spec.algorithm = PG_COMPRESSION_GZIP; > > + else if (check_compressed_file(path, &fname, "lz4")) > > + compression_spec.algorithm = PG_COMPRESSION_LZ4; > > + else if (check_compressed_file(path, &fname, "zst")) > > + compression_spec.algorithm = PG_COMPRESSION_ZSTD; > > } > > This function lost some coherence, I think. Should there be a hasSuffix > check at the top for ".zstd" (and, for that matter, ".lz4")? The function is first checking if it was passed a filename which already has a suffix. And if not, it searches through a list of suffixes, testing for an existing file with each suffix. The search with stat() doesn't happen if it has a suffix. I'm having trouble seeing how the hasSuffix() branch isn't dead code. Another open question. > I'm a little suspicious of the replacement of supports_compression() > with parse_compress_specification(). 
For example: > > > - errmsg = supports_compression(AH->compression_spec); > > - if (errmsg) > > + parse_compress_specification(AH->compression_spec.algorithm, > > + NULL, &compress_spec); > > + if (compress_spec.parse_error != NULL) > > { > > pg_log_warning("archive is compressed, but this installation does not support compression (%s > > - errmsg); > > - pg_free(errmsg); > > + compress_spec.parse_error); > > + pg_free(compress_spec.parse_error); > > } > > The top-level error here is "does not support compression", but wouldn't > a bad specification option with a supported compression method trip this > path too? No - since the 2nd argument is passed as NULL, it just checks whether the compression is supported. Maybe there ought to be a more direct/clean way to do it. But up to now evidently nobody needed to do that. > > +static void > > +ZSTD_CCtx_setParam_or_die(ZSTD_CStream *cstream, > > + ZSTD_cParameter param, int value, char *paramname) > > IMO we should avoid stepping on the ZSTD_ namespace with our own > internal function names. done > > + if (cs->readF != NULL) > > + > > + if (cs->writeF != NULL) > > This seems to suggest that both cs->readF and cs->writeF could be set, > but in that case, the output buffer gets reallocated. I put back an assertion that exactly one of them was set, since that's true of how it currently works. > I was curious about the extra byte allocated in the decompression case. > I see that ReadDataFromArchiveZstd() is null-terminating the buffer > before handing it to ahwrite(), but why does it need to do that? I was trying to figure that out, too. I think the unterminated case might be for ExecuteSqlCommandBuf(), and that may only (have) been needed to allow pg_restore to handle ancient/development versions of pg_dump... It's not currently hit. https://coverage.postgresql.org/src/bin/pg_dump/pg_backup_db.c.gcov.html#470 I found that the terminator was added for the uncompressed case was added at e8f69be05 and removed in bf9aa490d. > > +Zstd_get_error(CompressFileHandle *CFH) > > Seems like this should be using the zstderror stored in the handle? Yes - I'd already addressed that locally. > In ReadDataFromArchiveZstd(): > > > + * Read compressed data. Note that readF can resize the buffer; the > > + * new size is tracked and used for future loops. > This is pretty complex for what it's doing. I'm a little worried that we > let the reallocated buffer escape to the caller while losing track of > how big it is. I think that works today, since there's only ever one > call per handle, but any future refactoring that allowed cs->readData() > to be called more than once would subtly break this code. Note that nothing bad happens if we lose track of how big it is (well, assuming that readF doesn't *shrink* the buffer). The previous patch version didn't keep track of its new size, and the only consequence is that readF() might re-resize it again on a future iteration, even if it was already sufficiently large. When I originally wrote it (and up until that patch version), I left this as an XXX comment about reusing the resized buffer. But it seemed easy enough to fix so I did. > In ZstdWriteCommon(): > > Elsewhere, output->pos is set to zero before compressing, but here we do > it after, which I think leads to subtle differences in the function > preconditions. If that's an intentional difference, can the reason be > called out in a comment? It's not deliberate. I think it had no effect, but changed - thanks. -- Justin
Attachment
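For readers following the diff quoted above, the deleted #ifdef blocks imply that the factored-out helper looks roughly like this (a sketch reconstructed from the removed code; the patch's actual implementation may differ):

#include <stdbool.h>
#include <sys/stat.h>

/*
 * Sketch of check_compressed_file(), reconstructed from the removed
 * code: test whether "<path>.<ext>" exists, and if so leave the new
 * name in *fname for the caller.
 */
static bool
check_compressed_file(const char *path, char **fname, const char *ext)
{
    struct stat st;

    free_keep_errno(*fname);
    *fname = psprintf("%s.%s", path, ext);

    return (stat(*fname, &st) == 0);
}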
On 3/16/23 05:50, Justin Pryzby wrote: > On Fri, Mar 10, 2023 at 12:48:13PM -0800, Jacob Champion wrote: >> On Wed, Mar 8, 2023 at 10:59 AM Jacob Champion <jchampion@timescale.com> wrote: >>> I did some smoke testing against zstd's GitHub release on Windows. To >>> build against it, I had to construct an import library, and put that >>> and the DLL into the `lib` folder expected by the MSVC scripts... >>> which makes me wonder if I've chosen a harder way than necessary? >> >> It looks like pg_dump's meson.build is missing dependencies on zstd >> (meson couldn't find the headers in the subproject without them). > > I saw that this was added for LZ4, but I hadn't added it for zstd since > I didn't run into an issue without it. Could you check that what I've > added works for your case? > >>> Parallel zstd dumps seem to work as expected, in that the resulting >>> pg_restore output is identical to uncompressed dumps and nothing >>> explodes. I haven't inspected the threading implementation for safety >>> yet, as you mentioned. >> >> Hm. Best I can tell, the CloneArchive() machinery is supposed to be >> handling safety for this, by isolating each thread's state. I don't feel >> comfortable pronouncing this new addition safe or not, because I'm not >> sure I understand what the comments in the format-specific _Clone() >> callbacks are saying yet. > > My line of reasoning for unix is that pg_dump forks before any calls to > zstd. Nothing zstd does ought to affect the pg_dump layer. But that > doesn't apply to pg_dump under windows. This is an open question. If > there's no solid answer, I could disable/ignore the option (maybe only > under windows). > I may be missing something, but why would the patch affect this? Why would it even affect safety of the parallel dump? And I don't see any changes to the clone stuff ... >> On to code (not a complete review): >> >>> if (hasSuffix(fname, ".gz")) >>> compression_spec.algorithm = PG_COMPRESSION_GZIP; >>> else >>> { >>> bool exists; >>> >>> exists = (stat(path, &st) == 0); >>> /* avoid unused warning if it is not built with compression */ >>> if (exists) >>> compression_spec.algorithm = PG_COMPRESSION_NONE; >>> -#ifdef HAVE_LIBZ >>> - if (!exists) >>> - { >>> - free_keep_errno(fname); >>> - fname = psprintf("%s.gz", path); >>> - exists = (stat(fname, &st) == 0); >>> - >>> - if (exists) >>> - compression_spec.algorithm = PG_COMPRESSION_GZIP; >>> - } >>> -#endif >>> -#ifdef USE_LZ4 >>> - if (!exists) >>> - { >>> - free_keep_errno(fname); >>> - fname = psprintf("%s.lz4", path); >>> - exists = (stat(fname, &st) == 0); >>> - >>> - if (exists) >>> - compression_spec.algorithm = PG_COMPRESSION_LZ4; >>> - } >>> -#endif >>> + else if (check_compressed_file(path, &fname, "gz")) >>> + compression_spec.algorithm = PG_COMPRESSION_GZIP; >>> + else if (check_compressed_file(path, &fname, "lz4")) >>> + compression_spec.algorithm = PG_COMPRESSION_LZ4; >>> + else if (check_compressed_file(path, &fname, "zst")) >>> + compression_spec.algorithm = PG_COMPRESSION_ZSTD; >>> } >> >> This function lost some coherence, I think. Should there be a hasSuffix >> check at the top for ".zstd" (and, for that matter, ".lz4")? > This was discussed in the lz4 thread a couple of days ago, and I think there should be hasSuffix() cases for lz4/zstd too, not just for .gz. > The function is first checking if it was passed a filename which already > has a suffix. And if not, it searches through a list of suffixes, > testing for an existing file with each suffix. 
The search with stat() > doesn't happen if it has a suffix. I'm having trouble seeing how the > hasSuffix() branch isn't dead code. Another open question. > AFAICS it's done this way because of this comment in pg_backup_directory * ... * ".gz" suffix is added to the filenames. The TOC files are never * compressed by pg_dump, however they are accepted with the .gz suffix * too, in case the user has manually compressed them with 'gzip'. I haven't tried, but I believe that if you manually compress the directory, it may hit this branch. And IMO if we support that for gzip, the other compression methods should do that too for consistency. In any case, it's a tiny amount of code and I don't feel like ripping that out when it might break some currently supported use case. >> I'm a little suspicious of the replacement of supports_compression() >> with parse_compress_specification(). For example: >> >>> - errmsg = supports_compression(AH->compression_spec); >>> - if (errmsg) >>> + parse_compress_specification(AH->compression_spec.algorithm, >>> + NULL, &compress_spec); >>> + if (compress_spec.parse_error != NULL) >>> { >>> pg_log_warning("archive is compressed, but this installation does not support compression (%s >>> - errmsg); >>> - pg_free(errmsg); >>> + compress_spec.parse_error); >>> + pg_free(compress_spec.parse_error); >>> } >> >> The top-level error here is "does not support compression", but wouldn't >> a bad specification option with a supported compression method trip this >> path too? > > No - since the 2nd argument is passed as NULL, it just checks whether > the compression is supported. Maybe there ought to be a more > direct/clean way to do it. But up to now evidently nobody needed to do > that. > I don't think the patch can use parse_compress_specification() to replace supports_compression(). The parsing simply determines if the build has the library; it doesn't say if a particular tool was modified to support the algorithm. I might build --with-zstd and yet pg_dump does not support that algorithm yet. Even after we add zstd to pg_dump, it's quite likely other compression algorithms may not be supported by pg_dump from day 1. I haven't looked at / tested the patch yet, but I wonder if you have any thoughts regarding the size_t / int tweaks. I don't know what types the zstd library uses, how it reports errors etc. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Hi, On 3/17/23 03:43, Tomas Vondra wrote: > > ... > >>> I'm a little suspicious of the replacement of supports_compression() >>> with parse_compress_specification(). For example: >>> >>>> - errmsg = supports_compression(AH->compression_spec); >>>> - if (errmsg) >>>> + parse_compress_specification(AH->compression_spec.algorithm, >>>> + NULL, &compress_spec); >>>> + if (compress_spec.parse_error != NULL) >>>> { >>>> pg_log_warning("archive is compressed, but this installation does not support compression (%s >>>> - errmsg); >>>> - pg_free(errmsg); >>>> + compress_spec.parse_error); >>>> + pg_free(compress_spec.parse_error); >>>> } >>> >>> The top-level error here is "does not support compression", but wouldn't >>> a bad specification option with a supported compression method trip this >>> path too? >> >> No - since the 2nd argument is passed as NULL, it just checks whether >> the compression is supported. Maybe there ought to be a more >> direct/clean way to do it. But up to now evidently nobody needed to do >> that. >> > > I don't think the patch can use parse_compress_specification() to > replace supports_compression(). The parsing simply determines if the > build has the library; it doesn't say if a particular tool was modified > to support the algorithm. I might build --with-zstd and yet pg_dump does > not support that algorithm yet. > > Even after we add zstd to pg_dump, it's quite likely other compression > algorithms may not be supported by pg_dump from day 1. > > > I haven't looked at / tested the patch yet, but I wonder if you have any > thoughts regarding the size_t / int tweaks. I don't know what types the zstd > library uses, how it reports errors etc. > Any thoughts regarding my comments on removing supports_compression()? Also, this patch needs a rebase to adapt it to the API changes from last week. The sooner the better, considering we're getting fairly close to the end of the CF and code freeze. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Fri, Mar 17, 2023 at 03:43:31AM +0100, Tomas Vondra wrote: > On 3/16/23 05:50, Justin Pryzby wrote: > > On Fri, Mar 10, 2023 at 12:48:13PM -0800, Jacob Champion wrote: > >> On Wed, Mar 8, 2023 at 10:59 AM Jacob Champion <jchampion@timescale.com> wrote: > >>> I did some smoke testing against zstd's GitHub release on Windows. To > >>> build against it, I had to construct an import library, and put that > >>> and the DLL into the `lib` folder expected by the MSVC scripts... > >>> which makes me wonder if I've chosen a harder way than necessary? > >> > >> It looks like pg_dump's meson.build is missing dependencies on zstd > >> (meson couldn't find the headers in the subproject without them). > > > > I saw that this was added for LZ4, but I hadn't added it for zstd since > > I didn't run into an issue without it. Could you check that what I've > > added works for your case ? > > > >>> Parallel zstd dumps seem to work as expected, in that the resulting > >>> pg_restore output is identical to uncompressed dumps and nothing > >>> explodes. I haven't inspected the threading implementation for safety > >>> yet, as you mentioned. > >> > >> Hm. Best I can tell, the CloneArchive() machinery is supposed to be > >> handling safety for this, by isolating each thread's state. I don't feel > >> comfortable pronouncing this new addition safe or not, because I'm not > >> sure I understand what the comments in the format-specific _Clone() > >> callbacks are saying yet. > > > > My line of reasoning for unix is that pg_dump forks before any calls to > > zstd. Nothing zstd does ought to affect the pg_dump layer. But that > > doesn't apply to pg_dump under windows. This is an opened question. If > > there's no solid answer, I could disable/ignore the option (maybe only > > under windows). > > I may be missing something, but why would the patch affect this? Why > would it even affect safety of the parallel dump? And I don't see any > changes to the clone stuff ... zstd supports using threads during compression, with -Z zstd:workers=N. When unix forks, the child processes can't do anything to mess up the state of the parent processes. But windows pg_dump uses threads instead of forking, so it seems possible that the pg_dump -j threads that then spawn zstd threads could "leak threads" and break the main thread. I suspect there's no issue, but we still ought to verify that before declaring it safe. > > The function is first checking if it was passed a filename which already > > has a suffix. And if not, it searches through a list of suffixes, > > testing for an existing file with each suffix. The search with stat() > > doesn't happen if it has a suffix. I'm having trouble seeing how the > > hasSuffix() branch isn't dead code. Another opened question. > > AFAICS it's done this way because of this comment in pg_backup_directory > > * ... > * ".gz" suffix is added to the filenames. The TOC files are never > * compressed by pg_dump, however they are accepted with the .gz suffix > * too, in case the user has manually compressed them with 'gzip'. > > I haven't tried, but I believe that if you manually compress the > directory, it may hit this branch. That would make sense, but when I tried, it didn't work like that. The filenames were all uncompressed names. Maybe it worked differently in an older release. Or maybe it changed during development of the parallel-directory-dump patch and it's actually dead code. This is rebased over the updated compression API. 
It seems like I misunderstood something you said before, so now I put back "supports_compression()". -- Justin
Attachment
On 3/27/23 19:28, Justin Pryzby wrote: > On Fri, Mar 17, 2023 at 03:43:31AM +0100, Tomas Vondra wrote: >> On 3/16/23 05:50, Justin Pryzby wrote: >>> On Fri, Mar 10, 2023 at 12:48:13PM -0800, Jacob Champion wrote: >>>> On Wed, Mar 8, 2023 at 10:59 AM Jacob Champion <jchampion@timescale.com> wrote: >>>>> I did some smoke testing against zstd's GitHub release on Windows. To >>>>> build against it, I had to construct an import library, and put that >>>>> and the DLL into the `lib` folder expected by the MSVC scripts... >>>>> which makes me wonder if I've chosen a harder way than necessary? >>>> >>>> It looks like pg_dump's meson.build is missing dependencies on zstd >>>> (meson couldn't find the headers in the subproject without them). >>> >>> I saw that this was added for LZ4, but I hadn't added it for zstd since >>> I didn't run into an issue without it. Could you check that what I've >>> added works for your case ? >>> >>>>> Parallel zstd dumps seem to work as expected, in that the resulting >>>>> pg_restore output is identical to uncompressed dumps and nothing >>>>> explodes. I haven't inspected the threading implementation for safety >>>>> yet, as you mentioned. >>>> >>>> Hm. Best I can tell, the CloneArchive() machinery is supposed to be >>>> handling safety for this, by isolating each thread's state. I don't feel >>>> comfortable pronouncing this new addition safe or not, because I'm not >>>> sure I understand what the comments in the format-specific _Clone() >>>> callbacks are saying yet. >>> >>> My line of reasoning for unix is that pg_dump forks before any calls to >>> zstd. Nothing zstd does ought to affect the pg_dump layer. But that >>> doesn't apply to pg_dump under windows. This is an opened question. If >>> there's no solid answer, I could disable/ignore the option (maybe only >>> under windows). >> >> I may be missing something, but why would the patch affect this? Why >> would it even affect safety of the parallel dump? And I don't see any >> changes to the clone stuff ... > > zstd supports using threads during compression, with -Z zstd:workers=N. > When unix forks, the child processes can't do anything to mess up the > state of the parent processes. > > But windows pg_dump uses threads instead of forking, so it seems > possible that the pg_dump -j threads that then spawn zstd threads could > "leak threads" and break the main thread. I suspect there's no issue, > but we still ought to verify that before declaring it safe. > OK. I don't have access to a Windows machine so I can't test that. Is it possible to disable the zstd threading, until we figure this out? >>> The function is first checking if it was passed a filename which already >>> has a suffix. And if not, it searches through a list of suffixes, >>> testing for an existing file with each suffix. The search with stat() >>> doesn't happen if it has a suffix. I'm having trouble seeing how the >>> hasSuffix() branch isn't dead code. Another opened question. >> >> AFAICS it's done this way because of this comment in pg_backup_directory >> >> * ... >> * ".gz" suffix is added to the filenames. The TOC files are never >> * compressed by pg_dump, however they are accepted with the .gz suffix >> * too, in case the user has manually compressed them with 'gzip'. >> >> I haven't tried, but I believe that if you manually compress the >> directory, it may hit this branch. > > That would make sense, but when I tried, it didn't work like that. > The filenames were all uncompressed names. 
Maybe it worked differently > in an older release. Or maybe it changed during development of the > parallel-directory-dump patch and it's actually dead code. > Interesting. Would be good to find out. I wonder if a little bit of git-log digging could tell us more. Anyway, until we confirm it's dead code, we should probably do what .gz does and have the same check for .lz4 and .zst files. > This is rebased over the updated compression API. > > It seems like I misunderstood something you said before, so now I put > back "supports_compression()". > Thanks! I need to do a bit more testing and review, but it seems pretty much RFC to me, assuming we can figure out what to do about threading. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Tue, Mar 28, 2023 at 12:23 PM Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
On 3/27/23 19:28, Justin Pryzby wrote:
> On Fri, Mar 17, 2023 at 03:43:31AM +0100, Tomas Vondra wrote:
>> On 3/16/23 05:50, Justin Pryzby wrote:
>>> On Fri, Mar 10, 2023 at 12:48:13PM -0800, Jacob Champion wrote:
>>>> On Wed, Mar 8, 2023 at 10:59 AM Jacob Champion <jchampion@timescale.com> wrote:
>>>>> I did some smoke testing against zstd's GitHub release on Windows. To
...
OK. I don't have access to a Windows machine so I can't test that. Is it
possible to disable the zstd threading, until we figure this out?
Tomas, since I appear to be one of the few windows users (I use both), can I help?
I can test pg_dump... for you, easy to do. I do about 5-10 pg_dumps a day on windows while developing.
Also, I have an AWS instance I created to build PG/Win with readline back in November.
I could give you access to that... (you are not the only person who has made this statement here).
I've made such instances available for other Open Source developers, to support them.
Obvi I would share connection credentials privately.
Regards, Kirk
On 3/28/23 20:03, Kirk Wolak wrote: > On Tue, Mar 28, 2023 at 12:23 PM Tomas Vondra > <tomas.vondra@enterprisedb.com <mailto:tomas.vondra@enterprisedb.com>> > wrote: > > On 3/27/23 19:28, Justin Pryzby wrote: > > On Fri, Mar 17, 2023 at 03:43:31AM +0100, Tomas Vondra wrote: > >> On 3/16/23 05:50, Justin Pryzby wrote: > >>> On Fri, Mar 10, 2023 at 12:48:13PM -0800, Jacob Champion wrote: > >>>> On Wed, Mar 8, 2023 at 10:59 AM Jacob Champion > <jchampion@timescale.com <mailto:jchampion@timescale.com>> wrote: > >>>>> I did some smoke testing against zstd's GitHub release on > Windows. To > ... > OK. I don't have access to a Windows machine so I can't test that. Is it > possible to disable the zstd threading, until we figure this out? > > Thomas since I appear to be one of the few windows users (I use both), > can I help? > I can test pg_dump... for you, easy to do. I do about 5-10 pg_dumps a > day on windows while developing. > Perhaps. But I'll leave the details up to Justin - it's his patch, and I'm not sure how to verify the threading is OK. I'd try applying this patch, build with --with-zstd and then run the pg_dump TAP tests, and perhaps do some manual tests. And perhaps do the same for --with-lz4 - there's a thread [1] suggesting we don't detect lz4 stuff on Windows, so the TAP tests do nothing. https://www.postgresql.org/message-id/ZAjL96N9ZW84U59p@msg.df7cb.de > Also, I have an AWS instance I created to build PG/Win with readline > back in November. > I could give you access to that... (you are not the only person who has > made this statement here). > I've made such instances available for other Open Source developers, to > support them. > > Obvi I would share connection credentials privately. > I'd rather leave the Windows stuff up to someone with more experience with that platform. I have plenty of other stuff on my plate atm. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Wed, Mar 15, 2023 at 9:50 PM Justin Pryzby <pryzby@telsasoft.com> wrote: > On Fri, Mar 10, 2023 at 12:48:13PM -0800, Jacob Champion wrote: > > It looks like pg_dump's meson.build is missing dependencies on zstd > > (meson couldn't find the headers in the subproject without them). > > I saw that this was added for LZ4, but I hadn't added it for zstd since > I didn't run into an issue without it. Could you check that what I've > added works for your case ? I thought I replied to this, sorry -- your newest patchset builds correctly with subprojects, so the new dependency looks good to me. Thanks! > > Hm. Best I can tell, the CloneArchive() machinery is supposed to be > > handling safety for this, by isolating each thread's state. I don't feel > > comfortable pronouncing this new addition safe or not, because I'm not > > sure I understand what the comments in the format-specific _Clone() > > callbacks are saying yet. > > My line of reasoning for unix is that pg_dump forks before any calls to > zstd. Nothing zstd does ought to affect the pg_dump layer. But that > doesn't apply to pg_dump under windows. This is an opened question. If > there's no solid answer, I could disable/ignore the option (maybe only > under windows). To (maybe?) move this forward a bit, note that pg_backup_custom's _Clone() function makes sure that there is no active compressor state at the beginning of the new thread. pg_backup_directory's implementation has no such provision. And I don't think it can, because the parent thread might have concurrently set one up -- see the directory-specific implementation of _CloseArchive(). Perhaps we should just NULL out those fields after the copy, instead? To illustrate why I think this is tough to characterize: if I've read the code correctly, the _Clone() and CloneArchive() implementations are running concurrently with code that is actively modifying the ArchiveHandle and the lclContext. So safety is only ensured to the extent that we keep track of which fields threads are allowed to touch, and I don't have that mental model. --Jacob
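A sketch of the NULL-out suggestion, modeled on how the directory format's _Clone() copies its private state (the field names are illustrative, not necessarily the real ones):

#include <string.h>

/*
 * Sketch: after copying the parent's format-private state, clear any
 * file/compressor handles so the clone never touches a stream the
 * parent thread may still be using.  Field names are illustrative.
 */
static void
_Clone(ArchiveHandle *AH)
{
    lclContext *ctx = (lclContext *) pg_malloc(sizeof(lclContext));

    memcpy(ctx, AH->formatData, sizeof(lclContext));
    AH->formatData = (void *) ctx;

    /* the clone must open its own handles, not inherit the parent's */
    ctx->dataFH = NULL;         /* illustrative field name */
    ctx->LOsTocFH = NULL;       /* illustrative field name */
}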
On Tue, Mar 28, 2023 at 02:03:49PM -0400, Kirk Wolak wrote: > On Tue, Mar 28, 2023 at 12:23 PM Tomas Vondra <tomas.vondra@enterprisedb.com> wrote: > > On 3/27/23 19:28, Justin Pryzby wrote: > > > On Fri, Mar 17, 2023 at 03:43:31AM +0100, Tomas Vondra wrote: > > >> On 3/16/23 05:50, Justin Pryzby wrote: > > >>> On Fri, Mar 10, 2023 at 12:48:13PM -0800, Jacob Champion wrote: > > >>>> On Wed, Mar 8, 2023 at 10:59 AM Jacob Champion <jchampion@timescale.com> wrote: > > >>>>> I did some smoke testing against zstd's GitHub release on Windows. To > > ... > > OK. I don't have access to a Windows machine so I can't test that. Is it > > possible to disable the zstd threading, until we figure this out? > > Thomas since I appear to be one of the few windows users (I use both), can I help? > I can test pg_dump... for you, easy to do. I do about 5-10 pg_dumps a day > on windows while developing. It'd be great if you'd exercise this and other changes to pg_dump/restore. Tomas just pushed a bugfix, so be sure to "git pull" before testing, or else you might rediscover the bug. If you have a zstd library with thread support, you could test with -Z zstd:workers=3. But I think threads aren't enabled in the common libzstd packages. Jacob figured out how to compile libzstd easily using "meson wraps" - but I don't know the details. -- Justin
On Wed, Mar 29, 2023 at 6:35 AM Justin Pryzby <pryzby@telsasoft.com> wrote: > If you have a zstd library with thread support, you could test with > -Z zstd:workers=3. But I think threads aren't enabled in the common > libzstd packages. Jacob figured out how to compile libzstd easily using > "meson wraps" - but I don't know the details. From the source root, $ mkdir subprojects $ meson wrap install zstd From then on, Meson was pretty automagical about it during the ninja build. The subproject's settings are themselves inspectable and settable via `meson configure`: $ meson configure -Dzstd:<option>=<value> --Jacob
On Tue, Mar 28, 2023 at 06:23:26PM +0200, Tomas Vondra wrote: > On 3/27/23 19:28, Justin Pryzby wrote: > > On Fri, Mar 17, 2023 at 03:43:31AM +0100, Tomas Vondra wrote: > >> On 3/16/23 05:50, Justin Pryzby wrote: > >>> On Fri, Mar 10, 2023 at 12:48:13PM -0800, Jacob Champion wrote: > >>>> On Wed, Mar 8, 2023 at 10:59 AM Jacob Champion <jchampion@timescale.com> wrote: > >>>>> I did some smoke testing against zstd's GitHub release on Windows. To > >>>>> build against it, I had to construct an import library, and put that > >>>>> and the DLL into the `lib` folder expected by the MSVC scripts... > >>>>> which makes me wonder if I've chosen a harder way than necessary? > >>>> > >>>> It looks like pg_dump's meson.build is missing dependencies on zstd > >>>> (meson couldn't find the headers in the subproject without them). > >>> > >>> I saw that this was added for LZ4, but I hadn't added it for zstd since > >>> I didn't run into an issue without it. Could you check that what I've > >>> added works for your case? > >>> > >>>>> Parallel zstd dumps seem to work as expected, in that the resulting > >>>>> pg_restore output is identical to uncompressed dumps and nothing > >>>>> explodes. I haven't inspected the threading implementation for safety > >>>>> yet, as you mentioned. > >>>> > >>>> Hm. Best I can tell, the CloneArchive() machinery is supposed to be > >>>> handling safety for this, by isolating each thread's state. I don't feel > >>>> comfortable pronouncing this new addition safe or not, because I'm not > >>>> sure I understand what the comments in the format-specific _Clone() > >>>> callbacks are saying yet. > >>> > >>> My line of reasoning for unix is that pg_dump forks before any calls to > >>> zstd. Nothing zstd does ought to affect the pg_dump layer. But that > >>> doesn't apply to pg_dump under windows. This is an open question. If > >>> there's no solid answer, I could disable/ignore the option (maybe only > >>> under windows). > >> > >> I may be missing something, but why would the patch affect this? Why > >> would it even affect safety of the parallel dump? And I don't see any > >> changes to the clone stuff ... > > > > zstd supports using threads during compression, with -Z zstd:workers=N. > > When unix forks, the child processes can't do anything to mess up the > > state of the parent processes. > > > > But windows pg_dump uses threads instead of forking, so it seems > > possible that the pg_dump -j threads that then spawn zstd threads could > > "leak threads" and break the main thread. I suspect there's no issue, > > but we still ought to verify that before declaring it safe. > > OK. I don't have access to a Windows machine so I can't test that. Is it > possible to disable the zstd threading, until we figure this out? I think that's what's best. I made it issue a warning if "workers" was specified. It could also be an error, or just ignored. I considered disabling workers only for windows, but realized that I haven't tested with threads myself - my local zstd package is compiled without threading, and I remember having some issue recompiling it with threading. Jacob's recipe for using meson wraps works well, but it still seems better to leave it as a future feature. I used that recipe to enable zstd with threading on CI (except for linux/autoconf). > >>> The function is first checking if it was passed a filename which already > >>> has a suffix. And if not, it searches through a list of suffixes, > >>> testing for an existing file with each suffix. 
The search with stat() > >>> doesn't happen if it has a suffix. I'm having trouble seeing how the > >>> hasSuffix() branch isn't dead code. Another open question. > >> > >> AFAICS it's done this way because of this comment in pg_backup_directory > >> > >> * ... > >> * ".gz" suffix is added to the filenames. The TOC files are never > >> * compressed by pg_dump, however they are accepted with the .gz suffix > >> * too, in case the user has manually compressed them with 'gzip'. > >> > >> I haven't tried, but I believe that if you manually compress the > >> directory, it may hit this branch. > > > > That would make sense, but when I tried, it didn't work like that. > > The filenames were all uncompressed names. Maybe it worked differently > > in an older release. Or maybe it changed during development of the > > parallel-directory-dump patch and it's actually dead code. > > Interesting. Would be good to find out. I wonder if a little bit of > git-log digging could tell us more. Anyway, until we confirm it's dead > code, we should probably do what .gz does and have the same check for > .lz4 and .zst files. I found that hasSuffix() and cfopen() originated in the refactored patch Heikki sent here; there's no history beyond that. https://www.postgresql.org/message-id/4D3954C7.9060503%40enterprisedb.com The patch published there appends the .gz within cfopen(), and the caller writes into the TOC the filename without ".gz". It seems like maybe a few hours prior, Heikki may have been appending the ".gz" suffix in the caller, and then writing the TOC with filename.gz. The only way I've been able to get a "filename.gz" passed to hasSuffix is to write a directory-format dump, with LOs, and without compression, and then compress the blobs with "gzip", and *also* edit the blobs.toc file to say ".gz" (which isn't necessary since, if the original file isn't found, the restore would search for files with compressed suffixes). So ... it's not *technically* unreachable, but I can't see why it'd be useful to support editing the *content* of the blob TOC (other than compressing it). I might give some weight to the idea if it were also possible to edit the non-blob TOC; but, it's a binary file, so no. For now, I made the change to make zstd and lz4 behave the same here as .gz, unless Heikki has a memory or a git reflog going back far enough to further support the idea that the code path isn't useful. I'm going to set the patch as RFC as a hint to anyone who would want to make a final review. -- Justin
Attachment
On 4/1/23 01:16, Justin Pryzby wrote: > On Tue, Mar 28, 2023 at 06:23:26PM +0200, Tomas Vondra wrote: >> On 3/27/23 19:28, Justin Pryzby wrote: >>> On Fri, Mar 17, 2023 at 03:43:31AM +0100, Tomas Vondra wrote: >>>> On 3/16/23 05:50, Justin Pryzby wrote: >>>>> On Fri, Mar 10, 2023 at 12:48:13PM -0800, Jacob Champion wrote: >>>>>> On Wed, Mar 8, 2023 at 10:59 AM Jacob Champion <jchampion@timescale.com> wrote: >>>>>>> I did some smoke testing against zstd's GitHub release on Windows. To >>>>>>> build against it, I had to construct an import library, and put that >>>>>>> and the DLL into the `lib` folder expected by the MSVC scripts... >>>>>>> which makes me wonder if I've chosen a harder way than necessary? >>>>>> >>>>>> It looks like pg_dump's meson.build is missing dependencies on zstd >>>>>> (meson couldn't find the headers in the subproject without them). >>>>> >>>>> I saw that this was added for LZ4, but I hadn't added it for zstd since >>>>> I didn't run into an issue without it. Could you check that what I've >>>>> added works for your case ? >>>>> >>>>>>> Parallel zstd dumps seem to work as expected, in that the resulting >>>>>>> pg_restore output is identical to uncompressed dumps and nothing >>>>>>> explodes. I haven't inspected the threading implementation for safety >>>>>>> yet, as you mentioned. >>>>>> >>>>>> Hm. Best I can tell, the CloneArchive() machinery is supposed to be >>>>>> handling safety for this, by isolating each thread's state. I don't feel >>>>>> comfortable pronouncing this new addition safe or not, because I'm not >>>>>> sure I understand what the comments in the format-specific _Clone() >>>>>> callbacks are saying yet. >>>>> >>>>> My line of reasoning for unix is that pg_dump forks before any calls to >>>>> zstd. Nothing zstd does ought to affect the pg_dump layer. But that >>>>> doesn't apply to pg_dump under windows. This is an opened question. If >>>>> there's no solid answer, I could disable/ignore the option (maybe only >>>>> under windows). >>>> >>>> I may be missing something, but why would the patch affect this? Why >>>> would it even affect safety of the parallel dump? And I don't see any >>>> changes to the clone stuff ... >>> >>> zstd supports using threads during compression, with -Z zstd:workers=N. >>> When unix forks, the child processes can't do anything to mess up the >>> state of the parent processes. >>> >>> But windows pg_dump uses threads instead of forking, so it seems >>> possible that the pg_dump -j threads that then spawn zstd threads could >>> "leak threads" and break the main thread. I suspect there's no issue, >>> but we still ought to verify that before declaring it safe. >> >> OK. I don't have access to a Windows machine so I can't test that. Is it >> possible to disable the zstd threading, until we figure this out? > > I think that's what's best. I made it issue a warning if "workers" was > specified. It could also be an error, or just ignored. > > I considered disabling workers only for windows, but realized that I > haven't tested with threads myself - my local zstd package is compiled > without threading, and I remember having some issue recompiling it with > threading. Jacob's recipe for using meson wraps works well, but it > still seems better to leave it as a future feature. I used that recipe > to enabled zstd with threading on CI (except for linux/autoconf). > +1 to disable this if we're unsure it works correctly. 
I agree it's better to just error out if workers are requested - I rather dislike when a tool just ignores an explicit parameter. And AFAICS it's what zstd does too, when someone requests workers on incompatible build. FWIW I've been thinking about this a bit more and I don't quite see why the threading would cause issues (except for Windows). I forgot pg_basebackup already supports zstd, including the worker threading, so why would it work there and not in pg_dump? Sure, pg_basebackup is not parallel, but with separate pg_dump processes that shouldn't be an issue (although I'm not sure when zstd creates threads). The one thing I'm wondering about is at which point the worker threads are spawned - but presumably not before the pg_dump processes fork. I'll try building zstd with threading enabled, and do some tests over the weekend. >>>>> The function is first checking if it was passed a filename which already >>>>> has a suffix. And if not, it searches through a list of suffixes, >>>>> testing for an existing file with each suffix. The search with stat() >>>>> doesn't happen if it has a suffix. I'm having trouble seeing how the >>>>> hasSuffix() branch isn't dead code. Another opened question. >>>> >>>> AFAICS it's done this way because of this comment in pg_backup_directory >>>> >>>> * ... >>>> * ".gz" suffix is added to the filenames. The TOC files are never >>>> * compressed by pg_dump, however they are accepted with the .gz suffix >>>> * too, in case the user has manually compressed them with 'gzip'. >>>> >>>> I haven't tried, but I believe that if you manually compress the >>>> directory, it may hit this branch. >>> >>> That would make sense, but when I tried, it didn't work like that. >>> The filenames were all uncompressed names. Maybe it worked differently >>> in an older release. Or maybe it changed during development of the >>> parallel-directory-dump patch and it's actually dead code. >> >> Interesting. Would be good to find out. I wonder if a little bit of >> git-log digging could tell us more. Anyway, until we confirm it's dead >> code, we should probably do what .gz does and have the same check for >> .lz4 and .zst files. > > I found that hasSuffix() and cfopen() originated in the refactored patch > Heikki's sent here; there's no history beyond that. > > https://www.postgresql.org/message-id/4D3954C7.9060503%40enterprisedb.com > > The patch published there appends the .gz within cfopen(), and the > caller writes into the TOC the filename without ".gz". It seems like > maybe a few hours prior, Heikki may have been appending the ".gz" suffix > in the caller, and then writing the TOC with filename.gz. > > The only way I've been able to get a "filename.gz" passed to hasSuffix > is to write a directory-format dump, with LOs, and without compression, > and then compress the blobs with "gzip", and *also* edit the blobs.toc > file to say ".gz" (which isn't necessary since, if the original file > isn't found, the restore would search for files with compressed > suffixes). > > So .. it's not *technically* unreachable, but I can't see why it'd be > useful to support editing the *content* of the blob TOC (other than > compressing it). I might give some weight to the idea if it were also > possible to edit the non-blob TOC; but, it's a binary file, so no. > > For now, I made the change to make zstd and lz4 to behave the same here > as .gz, unless Heikki has a memory or a git reflog going back far enough > to further support the idea that the code path isn't useful. > Makes sense. 
Let's keep the same behavior for all compression methods, and if it's unreachable we shall remove it from all. It's a trivial amount of code, we can live with that. > I'm going to set the patch as RFC as a hint to anyone who would want to > make a final review. > OK. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
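To make the failure mode concrete: at the libzstd level, -Z zstd:workers=N maps onto a context parameter, and per my reading of zstd.h a library built without ZSTD_MULTITHREAD rejects any nonzero value, so an unsupported request is detectable at runtime. A minimal sketch (not the patch's actual code):

#include <stdio.h>
#include <zstd.h>

int
main(void)
{
    ZSTD_CCtx  *cctx = ZSTD_createCCtx();

    /* request 3 worker threads, as -Z zstd:workers=3 would */
    size_t      ret = ZSTD_CCtx_setParameter(cctx, ZSTD_c_nbWorkers, 3);

    /*
     * On a single-threaded build this fails, which would be the natural
     * place to raise pg_dump's warning (or error).
     */
    if (ZSTD_isError(ret))
        fprintf(stderr, "workers not supported: %s\n",
                ZSTD_getErrorName(ret));

    ZSTD_freeCCtx(cctx);
    return 0;
}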
On Sat, Apr 01, 2023 at 02:11:12AM +0200, Tomas Vondra wrote: > On 4/1/23 01:16, Justin Pryzby wrote: > > On Tue, Mar 28, 2023 at 06:23:26PM +0200, Tomas Vondra wrote: > >> On 3/27/23 19:28, Justin Pryzby wrote: > >>> On Fri, Mar 17, 2023 at 03:43:31AM +0100, Tomas Vondra wrote: > >>>> On 3/16/23 05:50, Justin Pryzby wrote: > >>>>> On Fri, Mar 10, 2023 at 12:48:13PM -0800, Jacob Champion wrote: > >>>>>> On Wed, Mar 8, 2023 at 10:59 AM Jacob Champion <jchampion@timescale.com> wrote: > >>>>>>> I did some smoke testing against zstd's GitHub release on Windows. To > >>>>>>> build against it, I had to construct an import library, and put that > >>>>>>> and the DLL into the `lib` folder expected by the MSVC scripts... > >>>>>>> which makes me wonder if I've chosen a harder way than necessary? > >>>>>> > >>>>>> It looks like pg_dump's meson.build is missing dependencies on zstd > >>>>>> (meson couldn't find the headers in the subproject without them). > >>>>> > >>>>> I saw that this was added for LZ4, but I hadn't added it for zstd since > >>>>> I didn't run into an issue without it. Could you check that what I've > >>>>> added works for your case ? > >>>>> > >>>>>>> Parallel zstd dumps seem to work as expected, in that the resulting > >>>>>>> pg_restore output is identical to uncompressed dumps and nothing > >>>>>>> explodes. I haven't inspected the threading implementation for safety > >>>>>>> yet, as you mentioned. > >>>>>> > >>>>>> Hm. Best I can tell, the CloneArchive() machinery is supposed to be > >>>>>> handling safety for this, by isolating each thread's state. I don't feel > >>>>>> comfortable pronouncing this new addition safe or not, because I'm not > >>>>>> sure I understand what the comments in the format-specific _Clone() > >>>>>> callbacks are saying yet. > >>>>> > >>>>> My line of reasoning for unix is that pg_dump forks before any calls to > >>>>> zstd. Nothing zstd does ought to affect the pg_dump layer. But that > >>>>> doesn't apply to pg_dump under windows. This is an opened question. If > >>>>> there's no solid answer, I could disable/ignore the option (maybe only > >>>>> under windows). > >>>> > >>>> I may be missing something, but why would the patch affect this? Why > >>>> would it even affect safety of the parallel dump? And I don't see any > >>>> changes to the clone stuff ... > >>> > >>> zstd supports using threads during compression, with -Z zstd:workers=N. > >>> When unix forks, the child processes can't do anything to mess up the > >>> state of the parent processes. > >>> > >>> But windows pg_dump uses threads instead of forking, so it seems > >>> possible that the pg_dump -j threads that then spawn zstd threads could > >>> "leak threads" and break the main thread. I suspect there's no issue, > >>> but we still ought to verify that before declaring it safe. > >> > >> OK. I don't have access to a Windows machine so I can't test that. Is it > >> possible to disable the zstd threading, until we figure this out? > > > > I think that's what's best. I made it issue a warning if "workers" was > > specified. It could also be an error, or just ignored. > > > > I considered disabling workers only for windows, but realized that I > > haven't tested with threads myself - my local zstd package is compiled > > without threading, and I remember having some issue recompiling it with > > threading. Jacob's recipe for using meson wraps works well, but it > > still seems better to leave it as a future feature. 
I used that recipe > > to enabled zstd with threading on CI (except for linux/autoconf). > > +1 to disable this if we're unsure it works correctly. I agree it's > better to just error out if workers are requested - I rather dislike > when a tool just ignores an explicit parameter. And AFAICS it's what > zstd does too, when someone requests workers on incompatible build. > > FWIW I've been thinking about this a bit more and I don't quite see why > would the threading cause issues (except for Windows). I forgot > pg_basebackup already supports zstd, including the worker threading, so > why would it work there and not in pg_dump? Sure, pg_basebackup is not > parallel, but with separate pg_dump processes that shouldn't be an issue > (although I'm not sure when zstd creates threads). There's no concern at all except under windows (because on windows pg_dump -j is implemented using threads rather than forking). Especially since zstd:workers is already allowed in the basebackup backend process. > I'll try building zstd with threading enabled, and do some tests over > the weekend. Feel free to wait until v17 :) I used "meson wraps" to get a local version with threading. Note that if you want to use a zstd subproject, you may have to specify -D zstd=enabled, or else meson may not enable the library at all. Also, in order to introspect its settings, I had to do it like this:
mkdir subprojects
meson wrap install zstd
meson subprojects download
mkdir build.meson
meson setup -C build.meson --force-fallback-for=zstd
-- Justin
On 4/1/23 02:28, Justin Pryzby wrote: > On Sat, Apr 01, 2023 at 02:11:12AM +0200, Tomas Vondra wrote: >> On 4/1/23 01:16, Justin Pryzby wrote: >>> On Tue, Mar 28, 2023 at 06:23:26PM +0200, Tomas Vondra wrote: >>>> On 3/27/23 19:28, Justin Pryzby wrote: >>>>> On Fri, Mar 17, 2023 at 03:43:31AM +0100, Tomas Vondra wrote: >>>>>> On 3/16/23 05:50, Justin Pryzby wrote: >>>>>>> On Fri, Mar 10, 2023 at 12:48:13PM -0800, Jacob Champion wrote: >>>>>>>> On Wed, Mar 8, 2023 at 10:59 AM Jacob Champion <jchampion@timescale.com> wrote: >>>>>>>>> I did some smoke testing against zstd's GitHub release on Windows. To >>>>>>>>> build against it, I had to construct an import library, and put that >>>>>>>>> and the DLL into the `lib` folder expected by the MSVC scripts... >>>>>>>>> which makes me wonder if I've chosen a harder way than necessary? >>>>>>>> >>>>>>>> It looks like pg_dump's meson.build is missing dependencies on zstd >>>>>>>> (meson couldn't find the headers in the subproject without them). >>>>>>> >>>>>>> I saw that this was added for LZ4, but I hadn't added it for zstd since >>>>>>> I didn't run into an issue without it. Could you check that what I've >>>>>>> added works for your case ? >>>>>>> >>>>>>>>> Parallel zstd dumps seem to work as expected, in that the resulting >>>>>>>>> pg_restore output is identical to uncompressed dumps and nothing >>>>>>>>> explodes. I haven't inspected the threading implementation for safety >>>>>>>>> yet, as you mentioned. >>>>>>>> >>>>>>>> Hm. Best I can tell, the CloneArchive() machinery is supposed to be >>>>>>>> handling safety for this, by isolating each thread's state. I don't feel >>>>>>>> comfortable pronouncing this new addition safe or not, because I'm not >>>>>>>> sure I understand what the comments in the format-specific _Clone() >>>>>>>> callbacks are saying yet. >>>>>>> >>>>>>> My line of reasoning for unix is that pg_dump forks before any calls to >>>>>>> zstd. Nothing zstd does ought to affect the pg_dump layer. But that >>>>>>> doesn't apply to pg_dump under windows. This is an opened question. If >>>>>>> there's no solid answer, I could disable/ignore the option (maybe only >>>>>>> under windows). >>>>>> >>>>>> I may be missing something, but why would the patch affect this? Why >>>>>> would it even affect safety of the parallel dump? And I don't see any >>>>>> changes to the clone stuff ... >>>>> >>>>> zstd supports using threads during compression, with -Z zstd:workers=N. >>>>> When unix forks, the child processes can't do anything to mess up the >>>>> state of the parent processes. >>>>> >>>>> But windows pg_dump uses threads instead of forking, so it seems >>>>> possible that the pg_dump -j threads that then spawn zstd threads could >>>>> "leak threads" and break the main thread. I suspect there's no issue, >>>>> but we still ought to verify that before declaring it safe. >>>> >>>> OK. I don't have access to a Windows machine so I can't test that. Is it >>>> possible to disable the zstd threading, until we figure this out? >>> >>> I think that's what's best. I made it issue a warning if "workers" was >>> specified. It could also be an error, or just ignored. >>> >>> I considered disabling workers only for windows, but realized that I >>> haven't tested with threads myself - my local zstd package is compiled >>> without threading, and I remember having some issue recompiling it with >>> threading. Jacob's recipe for using meson wraps works well, but it >>> still seems better to leave it as a future feature. 
I used that recipe >>> to enabled zstd with threading on CI (except for linux/autoconf). >> >> +1 to disable this if we're unsure it works correctly. I agree it's >> better to just error out if workers are requested - I rather dislike >> when a tool just ignores an explicit parameter. And AFAICS it's what >> zstd does too, when someone requests workers on incompatible build. >> >> FWIW I've been thinking about this a bit more and I don't quite see why >> would the threading cause issues (except for Windows). I forgot >> pg_basebackup already supports zstd, including the worker threading, so >> why would it work there and not in pg_dump? Sure, pg_basebackup is not >> parallel, but with separate pg_dump processes that shouldn't be an issue >> (although I'm not sure when zstd creates threads). > > There's no concern at all except under windows (because on windows > pg_dump -j is implemented using threads rather than forking). > Especially since zstd:workers is already allowed in the basebackup > backend process. > If there are no concerns, why disable it outside Windows? I don't have a good idea how beneficial the multi-threaded compression is, so I can't quite judge the risk/benefits tradeoff. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Sat, Apr 01, 2023 at 02:49:44PM +0200, Tomas Vondra wrote: > On 4/1/23 02:28, Justin Pryzby wrote: > > On Sat, Apr 01, 2023 at 02:11:12AM +0200, Tomas Vondra wrote: > >> On 4/1/23 01:16, Justin Pryzby wrote: > >>> On Tue, Mar 28, 2023 at 06:23:26PM +0200, Tomas Vondra wrote: > >>>> On 3/27/23 19:28, Justin Pryzby wrote: > >>>>> On Fri, Mar 17, 2023 at 03:43:31AM +0100, Tomas Vondra wrote: > >>>>>> On 3/16/23 05:50, Justin Pryzby wrote: > >>>>>>> On Fri, Mar 10, 2023 at 12:48:13PM -0800, Jacob Champion wrote: > >>>>>>>> On Wed, Mar 8, 2023 at 10:59 AM Jacob Champion <jchampion@timescale.com> wrote: > >>>>>>>>> I did some smoke testing against zstd's GitHub release on Windows. To > >>>>>>>>> build against it, I had to construct an import library, and put that > >>>>>>>>> and the DLL into the `lib` folder expected by the MSVC scripts... > >>>>>>>>> which makes me wonder if I've chosen a harder way than necessary? > >>>>>>>> > >>>>>>>> It looks like pg_dump's meson.build is missing dependencies on zstd > >>>>>>>> (meson couldn't find the headers in the subproject without them). > >>>>>>> > >>>>>>> I saw that this was added for LZ4, but I hadn't added it for zstd since > >>>>>>> I didn't run into an issue without it. Could you check that what I've > >>>>>>> added works for your case ? > >>>>>>> > >>>>>>>>> Parallel zstd dumps seem to work as expected, in that the resulting > >>>>>>>>> pg_restore output is identical to uncompressed dumps and nothing > >>>>>>>>> explodes. I haven't inspected the threading implementation for safety > >>>>>>>>> yet, as you mentioned. > >>>>>>>> > >>>>>>>> Hm. Best I can tell, the CloneArchive() machinery is supposed to be > >>>>>>>> handling safety for this, by isolating each thread's state. I don't feel > >>>>>>>> comfortable pronouncing this new addition safe or not, because I'm not > >>>>>>>> sure I understand what the comments in the format-specific _Clone() > >>>>>>>> callbacks are saying yet. > >>>>>>> > >>>>>>> My line of reasoning for unix is that pg_dump forks before any calls to > >>>>>>> zstd. Nothing zstd does ought to affect the pg_dump layer. But that > >>>>>>> doesn't apply to pg_dump under windows. This is an opened question. If > >>>>>>> there's no solid answer, I could disable/ignore the option (maybe only > >>>>>>> under windows). > >>>>>> > >>>>>> I may be missing something, but why would the patch affect this? Why > >>>>>> would it even affect safety of the parallel dump? And I don't see any > >>>>>> changes to the clone stuff ... > >>>>> > >>>>> zstd supports using threads during compression, with -Z zstd:workers=N. > >>>>> When unix forks, the child processes can't do anything to mess up the > >>>>> state of the parent processes. > >>>>> > >>>>> But windows pg_dump uses threads instead of forking, so it seems > >>>>> possible that the pg_dump -j threads that then spawn zstd threads could > >>>>> "leak threads" and break the main thread. I suspect there's no issue, > >>>>> but we still ought to verify that before declaring it safe. > >>>> > >>>> OK. I don't have access to a Windows machine so I can't test that. Is it > >>>> possible to disable the zstd threading, until we figure this out? > >>> > >>> I think that's what's best. I made it issue a warning if "workers" was > >>> specified. It could also be an error, or just ignored. 
> >>> > >>> I considered disabling workers only for windows, but realized that I > >>> haven't tested with threads myself - my local zstd package is compiled > >>> without threading, and I remember having some issue recompiling it with > >>> threading. Jacob's recipe for using meson wraps works well, but it > >>> still seems better to leave it as a future feature. I used that recipe > >>> to enabled zstd with threading on CI (except for linux/autoconf). > >> > >> +1 to disable this if we're unsure it works correctly. I agree it's > >> better to just error out if workers are requested - I rather dislike > >> when a tool just ignores an explicit parameter. And AFAICS it's what > >> zstd does too, when someone requests workers on incompatible build. > >> > >> FWIW I've been thinking about this a bit more and I don't quite see why > >> would the threading cause issues (except for Windows). I forgot > >> pg_basebackup already supports zstd, including the worker threading, so > >> why would it work there and not in pg_dump? Sure, pg_basebackup is not > >> parallel, but with separate pg_dump processes that shouldn't be an issue > >> (although I'm not sure when zstd creates threads). > > > > There's no concern at all except under windows (because on windows > > pg_dump -j is implemented using threads rather than forking). > > Especially since zstd:workers is already allowed in the basebackup > > backend process. > > If there are no concerns, why disable it outside Windows? I don't have a > good idea how beneficial the multi-threaded compression is, so I can't > quite judge the risk/benefits tradeoff. Because it's a minor/fringe feature, and it's annoying to have platform differences (would we plan on relaxing the restriction in v17, or is it more likely we'd forget ?). I realized how little I've tested with zstd workers myself. And I think on cirrusci, the macos and freebsd tasks have zstd libraries with threading support, but it wasn't being exercised (because using :workers would cause the patch to fail unless it's supported everywhere). So I updated the "for CI only" patch to 1) use meson wraps to compile the zstd library with threading on linux and windows; and 2) use zstd:workers=3 "opportunistically" (but avoid failing if threads are not supported, since the autoconf task still doesn't have access to a library with thread support). That's a great step, but it still seems bad that the thread stuff has been little exercised until now. (Also, the windows task failed; I think that's due to a transient network issue). Feel free to mess around with threads (but I'd much rather see the patch progress for zstd:long). -- Justin
On 4/1/23 15:36, Justin Pryzby wrote: > > ... > >> If there are no concerns, why disable it outside Windows? I don't have a >> good idea how beneficial the multi-threaded compression is, so I can't >> quite judge the risk/benefits tradeoff. > > Because it's a minor/fringe feature, and it's annoying to have platform > differences (would we plan on relaxing the restriction in v17, or is it > more likely we'd forget ?). > > I realized how little I've tested with zstd workers myself. And I think > on cirrusci, the macos and freebsd tasks have zstd libraries with > threading support, but it wasn't being exercised (because using :workers > would cause the patch to fail unless it's supported everywhere). So I > updated the "for CI only" patch to 1) use meson wraps to compile zstd > library with threading on linux and windows; and, 2) use zstd:workers=3 > "opportunistically" (but avoid failing if threads are not supported, > since the autoconf task still doesn't have access to a library with > thread support). That's a great step, but it still seems bad that the > thread stuff has been little exercised until now. (Also, the windows > task failed; I think that's due to a transient network issue). > Agreed, let's leave the threading for PG17, depending on how beneficial it turns out to be for pg_dump. > Feel free to mess around with threads (but I'd much rather see the patch > progress for zstd:long). OK, understood. The long mode patch is pretty simple. IIUC it does not change the format, i.e. in the worst case we could leave it for PG17 too. Correct? regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Sat, Apr 01, 2023 at 10:26:01PM +0200, Tomas Vondra wrote: > > Feel free to mess around with threads (but I'd much rather see the patch > > progress for zstd:long). > > OK, understood. The long mode patch is pretty simple. IIUC it does not > change the format, i.e. in the worst case we could leave it for PG17 > too. Correct? Right, libzstd only has one "format", which is the same as what's used by the commandline tool. zstd:long doesn't change the format of the output: the library just uses a larger memory buffer to allow better compression. There's no format change for zstd:workers, either. -- Justin
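For anyone following along, here's roughly what zstd:long amounts to at the library level, per my reading of zstd.h (a sketch, not the committed code; 27 is the window log the zstd CLI defaults to for --long):

#include <zstd.h>

static ZSTD_CCtx *
make_long_mode_cctx(void)           /* hypothetical helper */
{
    ZSTD_CCtx  *cctx = ZSTD_createCCtx();

    /* same frame format; the encoder just matches over a larger window */
    ZSTD_CCtx_setParameter(cctx, ZSTD_c_enableLongDistanceMatching, 1);
    ZSTD_CCtx_setParameter(cctx, ZSTD_c_windowLog, 27);    /* 2^27 = 128MB */
    return cctx;
}

static ZSTD_DCtx *
make_restore_dctx(void)             /* hypothetical helper */
{
    ZSTD_DCtx  *dctx = ZSTD_createDCtx();

    /*
     * Only needed when a frame's window exceeds the decoder's default
     * limit; smaller frames decompress with no special setup.
     */
    ZSTD_DCtx_setParameter(dctx, ZSTD_d_windowLogMax, 30);
    return dctx;
}

From the user's side this should just be pg_dump -Z zstd:long, per the docs hunk quoted earlier in the thread.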
On 4/3/23 21:17, Justin Pryzby wrote: > On Sat, Apr 01, 2023 at 10:26:01PM +0200, Tomas Vondra wrote: >>> Feel free to mess around with threads (but I'd much rather see the patch >>> progress for zstd:long). >> >> OK, understood. The long mode patch is pretty simple. IIUC it does not >> change the format, i.e. in the worst case we could leave it for PG17 >> too. Correct? > > Right, libzstd only has one "format", which is the same as what's used > by the commandline tool. zstd:long doesn't change the format of the > output: the library just uses a larger memory buffer to allow better > compression. There's no format change for zstd:workers, either. > OK. I plan to do a bit more review/testing on this, and get it committed over the next day or two, likely including the long mode. One thing I noticed today is that maybe long_distance should be a bool, not int. Yes, ZSTD_c_enableLongDistanceMatching() accepts int, but it'd be cleaner to cast the value during a call and keep it bool otherwise. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Mon, Apr 03, 2023 at 11:26:09PM +0200, Tomas Vondra wrote: > On 4/3/23 21:17, Justin Pryzby wrote: > > On Sat, Apr 01, 2023 at 10:26:01PM +0200, Tomas Vondra wrote: > >>> Feel free to mess around with threads (but I'd much rather see the patch > >>> progress for zstd:long). > >> > >> OK, understood. The long mode patch is pretty simple. IIUC it does not > >> change the format, i.e. in the worst case we could leave it for PG17 > >> too. Correct? > > > > Right, libzstd only has one "format", which is the same as what's used > > by the commandline tool. zstd:long doesn't change the format of the > > output: the library just uses a larger memory buffer to allow better > > compression. There's no format change for zstd:workers, either. > > OK. I plan to do a bit more review/testing on this, and get it committed > over the next day or two, likely including the long mode. One thing I > noticed today is that maybe long_distance should be a bool, not int. > Yes, ZSTD_c_enableLongDistanceMatching() accepts int, but it'd be > cleaner to cast the value during a call and keep it bool otherwise. Thanks for noticing. Evidently I wrote it using "int" to get the feature working, and then later wrote the bool parsing bits but never changed the data structure. This also updates a few comments, indentation, removes a useless assertion, and updates the warning about zstd:workers. -- Justin
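The shape of that change, roughly (struct and field names here are illustrative, not the patch's actual ones):

#include <stdbool.h>
#include <zstd.h>

typedef struct ZstdCompressOpts     /* hypothetical name */
{
    bool        long_distance;      /* parsed from the "long" keyword */
} ZstdCompressOpts;

static void
apply_zstd_options(ZSTD_CCtx *cctx, const ZstdCompressOpts *opts)
{
    /* keep the field a bool; cast only where libzstd wants an int */
    ZSTD_CCtx_setParameter(cctx, ZSTD_c_enableLongDistanceMatching,
                           (int) opts->long_distance);
}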
Attachment
On 4/4/23 05:04, Justin Pryzby wrote: > On Mon, Apr 03, 2023 at 11:26:09PM +0200, Tomas Vondra wrote: >> On 4/3/23 21:17, Justin Pryzby wrote: >>> On Sat, Apr 01, 2023 at 10:26:01PM +0200, Tomas Vondra wrote: >>>>> Feel free to mess around with threads (but I'd much rather see the patch >>>>> progress for zstd:long). >>>> >>>> OK, understood. The long mode patch is pretty simple. IIUC it does not >>>> change the format, i.e. in the worst case we could leave it for PG17 >>>> too. Correct? >>> >>> Right, libzstd only has one "format", which is the same as what's used >>> by the commandline tool. zstd:long doesn't change the format of the >>> output: the library just uses a larger memory buffer to allow better >>> compression. There's no format change for zstd:workers, either. >> >> OK. I plan to do a bit more review/testing on this, and get it committed >> over the next day or two, likely including the long mode. One thing I >> noticed today is that maybe long_distance should be a bool, not int. >> Yes, ZSTD_c_enableLongDistanceMatching() accepts int, but it'd be >> cleaner to cast the value during a call and keep it bool otherwise. > > Thanks for noticing. Evidently I wrote it using "int" to get the > feature working, and then later wrote the bool parsing bits but never > changed the data structure. > > This also updates a few comments, indentation, removes a useless > assertion, and updates the warning about zstd:workers. > Thanks. I've cleaned up the 0001 a little bit (a couple comment improvements), updated the commit message and pushed it. I plan to take care of the 0002 (long distance mode) tomorrow, and that'll be it for PG16 I think. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 4/5/23 21:42, Tomas Vondra wrote: > On 4/4/23 05:04, Justin Pryzby wrote: >> On Mon, Apr 03, 2023 at 11:26:09PM +0200, Tomas Vondra wrote: >>> On 4/3/23 21:17, Justin Pryzby wrote: >>>> On Sat, Apr 01, 2023 at 10:26:01PM +0200, Tomas Vondra wrote: >>>>>> Feel free to mess around with threads (but I'd much rather see the patch >>>>>> progress for zstd:long). >>>>> >>>>> OK, understood. The long mode patch is pretty simple. IIUC it does not >>>>> change the format, i.e. in the worst case we could leave it for PG17 >>>>> too. Correct? >>>> >>>> Right, libzstd only has one "format", which is the same as what's used >>>> by the commandline tool. zstd:long doesn't change the format of the >>>> output: the library just uses a larger memory buffer to allow better >>>> compression. There's no format change for zstd:workers, either. >>> >>> OK. I plan to do a bit more review/testing on this, and get it committed >>> over the next day or two, likely including the long mode. One thing I >>> noticed today is that maybe long_distance should be a bool, not int. >>> Yes, ZSTD_c_enableLongDistanceMatching() accepts int, but it'd be >>> cleaner to cast the value during a call and keep it bool otherwise. >> >> Thanks for noticing. Evidently I wrote it using "int" to get the >> feature working, and then later wrote the bool parsing bits but never >> changed the data structure. >> >> This also updates a few comments, indentation, removes a useless >> assertion, and updates the warning about zstd:workers. >> > > Thanks. I've cleaned up the 0001 a little bit (a couple comment > improvements), updated the commit message and pushed it. I plan to take > care of the 0002 (long distance mode) tomorrow, and that'll be it for > PG16 I think. I looked at the long mode patch again, updated the commit message and pushed it. I was wondering if long_mode should really be bool - logically it is, but ZSTD_CCtx_setParameter() expects int. But I think that's fine. I think that's all for PG16 in this patch series. If there's more we want to do, it'll have to wait for PG17 - Justin, can you update and submit the patches that you think are relevant for the next CF? regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Thu, Apr 06, 2023 at 05:34:30PM +0200, Tomas Vondra wrote: > I looked at the long mode patch again, updated the commit message and > pushed it. I was wondering if long_mode should really be bool - > logically it is, but ZSTD_CCtx_setParameter() expects int. But I think > that's fine. Thanks! > I think that's all for PG16 in this patch series. If there's more we want to > do, it'll have to wait for PG17 - Yes > Justin, can you update and submit the patches that you think are relevant for > the next CF? Yeah. It sounds like a shiny new feature, but it's not totally clear if it's safe here or even how useful it is. (It might be like my patch for wal_compression=zstd:level, and Michael's for toast_compression=zstd, neither of which saw any support). Last year's basebackup thread had some interesting comments about safety of threads, although pg_dump's considerations may be different. The patch itself is trivial, so it'd be fine to wait until PG16 is released to get some experience. If someone else wanted to do that, it'd be fine with me. -- Justin
Justin Pryzby <pryzby@telsasoft.com> writes: > On Thu, Apr 06, 2023 at 05:34:30PM +0200, Tomas Vondra wrote: >> I think that's all for PG16 in this patch series. If there's more we want to >> do, it'll have to wait for PG17 - > Yes Shouldn't the CF entry be closed as committed? It's certainly making the cfbot unhappy because the patch-of-record doesn't apply. regards, tom lane