Re: [PING] [PATCH v2] parallel pg_restore: avoid disk seeks when jumping short distance forward - Mailing list pgsql-hackers

From:           Tom Lane
Subject:        Re: [PING] [PATCH v2] parallel pg_restore: avoid disk seeks when jumping short distance forward
Msg-id:         1306492.1760556110@sss.pgh.pa.us
In response to: Re: [PING] [PATCH v2] parallel pg_restore: avoid disk seeks when jumping short distance forward (Tom Lane <tgl@sss.pgh.pa.us>)
List:           pgsql-hackers
I wrote:
> 0004 increases the row width in the existing test case that says
> it's trying to push more than DEFAULT_IO_BUFFER_SIZE through
> the compressors.  While I agree with the premise, this solution
> is hugely expensive: it adds about 12% to the already-long runtime
> of 002_pg_dump.pl.  I'd like to find a better way, but ran out of
> energy for today.  (I think the reason this costs so much is that
> it's effectively iterated hundreds of times because of
> 002_pg_dump.pl's more or less cross-product approach to testing
> everything.  Maybe we should pull it out of that structure?)

The attached patchset accomplishes that by splitting 002_pg_dump.pl
into two scripts: one concerned only with the compression test cases,
and one that does everything else.  This might not be the prettiest
solution, since it duplicates a fair amount of Perl code.  I thought
about refactoring 002_pg_dump.pl so that it could handle two separate
sets of runs-plus-tests, but decided it was overly complicated already.

Anyway, 0001 attached is the same as in v4, 0002 performs the test
split without intending to change coverage, and then 0003 adds the
new test cases I wanted.  For me, this ends up with just about the
same runtime as before, or maybe a smidge less.  I'd hoped for more
savings than that, but I'm content with it being a wash.

I think this is more or less committable, and then we could get back
to the original question of whether it's worth tweaking pg_restore's
seek-vs-scan behavior.

			regards, tom lane

From cf923236ab86f8fedc6bc865a8754bb9fa26f252 Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Wed, 15 Oct 2025 14:48:42 -0400
Subject: [PATCH v5 1/3] Align the data block sizes of pg_dump's various
 compression modes.

After commit fe8192a95, compress_zstd.c tends to produce data block
sizes around 128K, and we don't really have any control over that
unless we want to overrule ZSTD_CStreamOutSize(), which seems like a
bad idea.  So let's try to align the other compression modes to
produce block sizes roughly comparable to that, so that pg_restore's
skip-data performance isn't enormously different across modes.

gzip compression can be brought in line simply by setting
DEFAULT_IO_BUFFER_SIZE = 128K, which this patch does.  That increases
some unrelated buffer sizes, but none of them seem problematic for
modern platforms.

lz4's idea of an appropriate block size is highly nonlinear: if we
just increase DEFAULT_IO_BUFFER_SIZE then the output blocks end up
around 200K.  I found that adjusting the slop factor in
LZ4State_compression_init was a not-too-ugly way of bringing that
number roughly into line.

With compress = none you get data blocks the same size as the table
rows, which seems potentially problematic for narrow tables.
Introduce a layer of buffering to make that case match the others.

Comments in compress_io.h and 002_pg_dump.pl suggest that if we
increase DEFAULT_IO_BUFFER_SIZE then we need to increase the amount
of data fed through the tests in order to preserve coverage.  I've
not done that here, leaving it for a separate patch.
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/3515357.1760128017@sss.pgh.pa.us
---
 src/bin/pg_dump/compress_io.h    |  4 +-
 src/bin/pg_dump/compress_lz4.c   |  9 ++++-
 src/bin/pg_dump/compress_none.c  | 64 +++++++++++++++++++++++++++++++-
 src/tools/pgindent/typedefs.list |  1 +
 4 files changed, 72 insertions(+), 6 deletions(-)

diff --git a/src/bin/pg_dump/compress_io.h b/src/bin/pg_dump/compress_io.h
index 25a7bf0904d..ae008585c89 100644
--- a/src/bin/pg_dump/compress_io.h
+++ b/src/bin/pg_dump/compress_io.h
@@ -22,9 +22,9 @@
  *
  * When changing this value, it's necessary to check the relevant test cases
  * still exercise all the branches.  This applies especially if the value is
- * increased, in which case the overflow buffer may not be needed.
+ * increased, in which case some loops may not get iterated.
  */
-#define DEFAULT_IO_BUFFER_SIZE 4096
+#define DEFAULT_IO_BUFFER_SIZE (128 * 1024)
 
 extern char *supports_compression(const pg_compress_specification compression_spec);
 
diff --git a/src/bin/pg_dump/compress_lz4.c b/src/bin/pg_dump/compress_lz4.c
index b817a083d38..450afd4e2be 100644
--- a/src/bin/pg_dump/compress_lz4.c
+++ b/src/bin/pg_dump/compress_lz4.c
@@ -100,9 +100,14 @@ LZ4State_compression_init(LZ4State *state)
 	state->buflen = LZ4F_compressBound(DEFAULT_IO_BUFFER_SIZE, &state->prefs);
 
 	/*
-	 * Then double it, to ensure we're not forced to flush every time.
+	 * Add some slop to ensure we're not forced to flush every time.
+	 *
+	 * The present slop factor of 50% is chosen so that the typical output
+	 * block size is about 128K when DEFAULT_IO_BUFFER_SIZE = 128K.  We might
+	 * need a different slop factor to maintain that equivalence if
+	 * DEFAULT_IO_BUFFER_SIZE is changed dramatically.
 	 */
-	state->buflen *= 2;
+	state->buflen += state->buflen / 2;
 
 	/*
 	 * LZ4F_compressBegin requires a buffer that is greater or equal to
diff --git a/src/bin/pg_dump/compress_none.c b/src/bin/pg_dump/compress_none.c
index 4abb2e95abc..94c155a572d 100644
--- a/src/bin/pg_dump/compress_none.c
+++ b/src/bin/pg_dump/compress_none.c
@@ -22,6 +22,18 @@
  *----------------------
  */
 
+/*
+ * We buffer outgoing data, just to ensure that data blocks written to the
+ * archive file are of reasonable size.  The read side could use this struct,
+ * but there's no need because it does not retain data across calls.
+ */
+typedef struct NoneCompressorState
+{
+    char       *buffer;         /* buffer for unwritten data */
+    size_t      buflen;         /* allocated size of buffer */
+    size_t      bufdata;        /* amount of valid data currently in buffer */
+} NoneCompressorState;
+
 /*
  * Private routines
  */
@@ -49,13 +61,45 @@ static void
 WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
                        const void *data, size_t dLen)
 {
-    cs->writeF(AH, data, dLen);
+    NoneCompressorState *nonecs = (NoneCompressorState *) cs->private_data;
+    size_t      remaining = dLen;
+
+    while (remaining > 0)
+    {
+        size_t      chunk;
+
+        /* Dump buffer if full */
+        if (nonecs->bufdata >= nonecs->buflen)
+        {
+            cs->writeF(AH, nonecs->buffer, nonecs->bufdata);
+            nonecs->bufdata = 0;
+        }
+        /* And fill it */
+        chunk = nonecs->buflen - nonecs->bufdata;
+        if (chunk > remaining)
+            chunk = remaining;
+        memcpy(nonecs->buffer + nonecs->bufdata, data, chunk);
+        nonecs->bufdata += chunk;
+        data = ((const char *) data) + chunk;
+        remaining -= chunk;
+    }
 }
 
 static void
 EndCompressorNone(ArchiveHandle *AH, CompressorState *cs)
 {
-    /* no op */
+    NoneCompressorState *nonecs = (NoneCompressorState *) cs->private_data;
+
+    if (nonecs)
+    {
+        /* Dump buffer if nonempty */
+        if (nonecs->bufdata > 0)
+            cs->writeF(AH, nonecs->buffer, nonecs->bufdata);
+        /* Free working state */
+        pg_free(nonecs->buffer);
+        pg_free(nonecs);
+        cs->private_data = NULL;
+    }
 }
 
 /*
@@ -71,6 +115,22 @@ InitCompressorNone(CompressorState *cs,
     cs->end = EndCompressorNone;
 
     cs->compression_spec = compression_spec;
+
+    /*
+     * If the caller has defined a write function, prepare the necessary
+     * buffer.
+     */
+    if (cs->writeF)
+    {
+        NoneCompressorState *nonecs;
+
+        nonecs = (NoneCompressorState *) pg_malloc(sizeof(NoneCompressorState));
+        nonecs->buflen = DEFAULT_IO_BUFFER_SIZE;
+        nonecs->buffer = pg_malloc(nonecs->buflen);
+        nonecs->bufdata = 0;
+
+        cs->private_data = nonecs;
+    }
 }
 
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 5290b91e83e..63f9387044b 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1758,6 +1758,7 @@ NextValueExpr
 Node
 NodeTag
 NonEmptyRange
+NoneCompressorState
 Notification
 NotificationList
 NotifyStmt
-- 
2.43.7
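As a quick cross-check on the test-data sizing that compress_io.h's
comment asks for (a standalone Perl sketch, not part of the patchset):
the existing COPY test row concatenates the integers 1..4096, which
works out to 15277 bytes -- far below the new 128K
DEFAULT_IO_BUFFER_SIZE, which is why the 0003 patch below has to widen
that row.

    #!/usr/bin/perl
    # Standalone check: length of the row built by
    #   SELECT string_agg(a::text, '') FROM generate_series(1, n) a
    use strict;
    use warnings;

    sub row_len
    {
        my ($n) = @_;
        my $len = 0;
        $len += length($_) for (1 .. $n);
        return $len;
    }

    printf "n=4096:  %d bytes\n", row_len(4096);     # 15277, matches \d{15277}
    printf "n=65536: %d bytes\n", row_len(65536);    # 316574 = 10*31657 + 4

The second figure is what 0003's regex encodes as
(?:\d\d\d\d\d\d\d\d\d\d){31657}\d\d\d\d, since Perl limits regex repeat
counts to under 32K.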
From e45a7653c47ee11a12c036738ddfdd88ef6ad6cd Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Wed, 15 Oct 2025 15:02:04 -0400
Subject: [PATCH v5 2/3] Split 002_pg_dump.pl into two test files.

Add a new test script, 006_pg_dump_compress.pl, containing just the
pg_dump tests specifically concerned with compression, and remove
those tests from 002_pg_dump.pl.  We can also drop some infrastructure
in 002_pg_dump.pl that was used only for these tests.

The point of this is to avoid the cost of running these test cases
over and over in all the scenarios (runs) that 002_pg_dump.pl
exercises.  We don't learn anything more about the behavior of the
compression code that way, and we expend significant amounts of time,
since one of these test cases is quite large and due to get larger.

The intent of this specific patch is to provide exactly the same
coverage as before, except that I went back to using --no-sync in all
the test runs moved over to 006_pg_dump_compress.pl.  I think avoiding
that option had basically been cargo-culted into these test cases as a
result of modeling them on the defaults_custom_format test case;
again, doing that over and over isn't going to teach us anything new.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/3515357.1760128017@sss.pgh.pa.us
---
 src/bin/pg_dump/t/meson.build             |   1 +
 src/bin/pg_dump/t/002_pg_dump.pl          | 408 ---------------
 src/bin/pg_dump/t/006_pg_dump_compress.pl | 611 ++++++++++++++++++++++
 3 files changed, 612 insertions(+), 408 deletions(-)
 create mode 100644 src/bin/pg_dump/t/006_pg_dump_compress.pl

diff --git a/src/bin/pg_dump/meson.build b/src/bin/pg_dump/meson.build
index a2233b0a1b4..f3c669f484e 100644
--- a/src/bin/pg_dump/meson.build
+++ b/src/bin/pg_dump/meson.build
@@ -102,6 +102,7 @@ tests += {
       't/003_pg_dump_with_server.pl',
       't/004_pg_dump_parallel.pl',
       't/005_pg_dump_filterfile.pl',
+      't/006_pg_dump_compress.pl',
       't/010_dump_connstr.pl',
     ],
   },
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index 6be6888b977..b789cd2e863 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -20,22 +20,12 @@ my $tempdir = PostgreSQL::Test::Utils::tempdir;
 # test_key indicates that a given run should simply use the same
 # set of like/unlike tests as another run, and which run that is.
 #
-# compile_option indicates if the commands run depend on a compilation
-# option, if any.  This can be used to control if tests should be
-# skipped when a build dependency is not satisfied.
-#
 # dump_cmd is the pg_dump command to run, which is an array of
 # the full command and arguments to run.  Note that this is run
 # using $node->command_ok(), so the port does not need to be
 # specified and is pulled from $PGPORT, which is set by the
 # PostgreSQL::Test::Cluster system.
 #
-# compress_cmd is the utility command for (de)compression, if any.
-# Note that this should generally be used on pg_dump's output
-# either to generate a text file to run the through the tests, or
-# to test pg_restore's ability to parse manually compressed files
-# that otherwise pg_dump does not compress on its own (e.g. *.toc).
-#
 # glob_patterns is an optional array consisting of strings compilable
 # with glob() to check the files generated after a dump.
 #
@@ -55,8 +45,6 @@ my $tempdir = PostgreSQL::Test::Utils::tempdir;
 my $supports_icu = ($ENV{with_icu} eq 'yes');
 
 my $supports_gzip = check_pg_config("#define HAVE_LIBZ 1");
-my $supports_lz4 = check_pg_config("#define USE_LZ4 1");
-my $supports_zstd = check_pg_config("#define USE_ZSTD 1");
 
 my %pgdump_runs = (
     binary_upgrade => {
@@ -81,256 +69,6 @@ my %pgdump_runs = (
         ],
     },
 
-    # Do not use --no-sync to give test coverage for data sync.
-    compression_gzip_custom => {
-        test_key => 'compression',
-        compile_option => 'gzip',
-        dump_cmd => [
-            'pg_dump',
-            '--format' => 'custom',
-            '--compress' => '1',
-            '--file' => "$tempdir/compression_gzip_custom.dump",
-            '--statistics',
-            'postgres',
-        ],
-        restore_cmd => [
-            'pg_restore',
-            '--file' => "$tempdir/compression_gzip_custom.sql",
-            '--statistics',
-            "$tempdir/compression_gzip_custom.dump",
-        ],
-        command_like => {
-            command => [
-                'pg_restore', '--list',
-                "$tempdir/compression_gzip_custom.dump",
-            ],
-            expected => qr/Compression: gzip/,
-            name => 'data content is gzip-compressed'
-        },
-    },
-
-    # Do not use --no-sync to give test coverage for data sync.
-    compression_gzip_dir => {
-        test_key => 'compression',
-        compile_option => 'gzip',
-        dump_cmd => [
-            'pg_dump',
-            '--jobs' => '2',
-            '--format' => 'directory',
-            '--compress' => 'gzip:1',
-            '--file' => "$tempdir/compression_gzip_dir",
-            '--statistics',
-            'postgres',
-        ],
-        # Give coverage for manually compressed blobs.toc files during
-        # restore.
-        compress_cmd => {
-            program => $ENV{'GZIP_PROGRAM'},
-            args => [ '-f', "$tempdir/compression_gzip_dir/blobs_*.toc", ],
-        },
-        # Verify that only data files were compressed
-        glob_patterns => [
-            "$tempdir/compression_gzip_dir/toc.dat",
-            "$tempdir/compression_gzip_dir/*.dat.gz",
-        ],
-        restore_cmd => [
-            'pg_restore',
-            '--jobs' => '2',
-            '--file' => "$tempdir/compression_gzip_dir.sql",
-            '--statistics',
-            "$tempdir/compression_gzip_dir",
-        ],
-    },
-
-    compression_gzip_plain => {
-        test_key => 'compression',
-        compile_option => 'gzip',
-        dump_cmd => [
-            'pg_dump',
-            '--format' => 'plain',
-            '--compress' => '1',
-            '--file' => "$tempdir/compression_gzip_plain.sql.gz",
-            '--statistics',
-            'postgres',
-        ],
-        # Decompress the generated file to run through the tests.
-        compress_cmd => {
-            program => $ENV{'GZIP_PROGRAM'},
-            args => [ '-d', "$tempdir/compression_gzip_plain.sql.gz", ],
-        },
-    },
-
-    # Do not use --no-sync to give test coverage for data sync.
-    compression_lz4_custom => {
-        test_key => 'compression',
-        compile_option => 'lz4',
-        dump_cmd => [
-            'pg_dump',
-            '--format' => 'custom',
-            '--compress' => 'lz4',
-            '--file' => "$tempdir/compression_lz4_custom.dump",
-            '--statistics',
-            'postgres',
-        ],
-        restore_cmd => [
-            'pg_restore',
-            '--file' => "$tempdir/compression_lz4_custom.sql",
-            '--statistics',
-            "$tempdir/compression_lz4_custom.dump",
-        ],
-        command_like => {
-            command => [
-                'pg_restore', '--list',
-                "$tempdir/compression_lz4_custom.dump",
-            ],
-            expected => qr/Compression: lz4/,
-            name => 'data content is lz4 compressed'
-        },
-    },
-
-    # Do not use --no-sync to give test coverage for data sync.
-    compression_lz4_dir => {
-        test_key => 'compression',
-        compile_option => 'lz4',
-        dump_cmd => [
-            'pg_dump',
-            '--jobs' => '2',
-            '--format' => 'directory',
-            '--compress' => 'lz4:1',
-            '--file' => "$tempdir/compression_lz4_dir",
-            '--statistics',
-            'postgres',
-        ],
-        # Give coverage for manually compressed blobs.toc files during
-        # restore.
-        compress_cmd => {
-            program => $ENV{'LZ4'},
-            args => [
-                '-z', '-f', '-m', '--rm',
-                "$tempdir/compression_lz4_dir/blobs_*.toc",
-            ],
-        },
-        # Verify that data files were compressed
-        glob_patterns => [
-            "$tempdir/compression_lz4_dir/toc.dat",
-            "$tempdir/compression_lz4_dir/*.dat.lz4",
-        ],
-        restore_cmd => [
-            'pg_restore',
-            '--jobs' => '2',
-            '--file' => "$tempdir/compression_lz4_dir.sql",
-            '--statistics',
-            "$tempdir/compression_lz4_dir",
-        ],
-    },
-
-    compression_lz4_plain => {
-        test_key => 'compression',
-        compile_option => 'lz4',
-        dump_cmd => [
-            'pg_dump',
-            '--format' => 'plain',
-            '--compress' => 'lz4',
-            '--file' => "$tempdir/compression_lz4_plain.sql.lz4",
-            '--statistics',
-            'postgres',
-        ],
-        # Decompress the generated file to run through the tests.
-        compress_cmd => {
-            program => $ENV{'LZ4'},
-            args => [
-                '-d', '-f',
-                "$tempdir/compression_lz4_plain.sql.lz4",
-                "$tempdir/compression_lz4_plain.sql",
-            ],
-        },
-    },
-
-    compression_zstd_custom => {
-        test_key => 'compression',
-        compile_option => 'zstd',
-        dump_cmd => [
-            'pg_dump',
-            '--format' => 'custom',
-            '--compress' => 'zstd',
-            '--file' => "$tempdir/compression_zstd_custom.dump",
-            '--statistics',
-            'postgres',
-        ],
-        restore_cmd => [
-            'pg_restore',
-            '--file' => "$tempdir/compression_zstd_custom.sql",
-            '--statistics',
-            "$tempdir/compression_zstd_custom.dump",
-        ],
-        command_like => {
-            command => [
-                'pg_restore', '--list',
-                "$tempdir/compression_zstd_custom.dump",
-            ],
-            expected => qr/Compression: zstd/,
-            name => 'data content is zstd compressed'
-        },
-    },
-
-    compression_zstd_dir => {
-        test_key => 'compression',
-        compile_option => 'zstd',
-        dump_cmd => [
-            'pg_dump',
-            '--jobs' => '2',
-            '--format' => 'directory',
-            '--compress' => 'zstd:1',
-            '--file' => "$tempdir/compression_zstd_dir",
-            '--statistics',
-            'postgres',
-        ],
-        # Give coverage for manually compressed blobs.toc files during
-        # restore.
-        compress_cmd => {
-            program => $ENV{'ZSTD'},
-            args => [
-                '-z', '-f',
-                '--rm', "$tempdir/compression_zstd_dir/blobs_*.toc",
-            ],
-        },
-        # Verify that data files were compressed
-        glob_patterns => [
-            "$tempdir/compression_zstd_dir/toc.dat",
-            "$tempdir/compression_zstd_dir/*.dat.zst",
-        ],
-        restore_cmd => [
-            'pg_restore',
-            '--jobs' => '2',
-            '--file' => "$tempdir/compression_zstd_dir.sql",
-            '--statistics',
-            "$tempdir/compression_zstd_dir",
-        ],
-    },
-
-    # Exercise long mode for test coverage
-    compression_zstd_plain => {
-        test_key => 'compression',
-        compile_option => 'zstd',
-        dump_cmd => [
-            'pg_dump',
-            '--format' => 'plain',
-            '--compress' => 'zstd:long',
-            '--file' => "$tempdir/compression_zstd_plain.sql.zst",
-            '--statistics',
-            'postgres',
-        ],
-        # Decompress the generated file to run through the tests.
-        compress_cmd => {
-            program => $ENV{'ZSTD'},
-            args => [
-                '-d', '-f',
-                "$tempdir/compression_zstd_plain.sql.zst", "-o",
-                "$tempdir/compression_zstd_plain.sql",
-            ],
-        },
-    },
-
     clean => {
         dump_cmd => [
             'pg_dump', '--no-sync',
@@ -891,10 +629,6 @@ my %pgdump_runs = (
 # of the pg_dump runs happening.  This is what "seeds" the
 # system with objects to be dumped out.
 #
-# There can be a flag called 'lz4', which can be set if the test
-# case depends on LZ4.  Tests marked with this flag are skipped if
-# the build used does not support LZ4.
-#
 # Building of this hash takes a bit of time as all of the regexps
 # included in it are compiled.  This greatly improves performance
 # as the regexps are used for each run the test applies to.
@@ -911,7 +645,6 @@ my %full_runs = (
     binary_upgrade => 1,
     clean => 1,
     clean_if_exists => 1,
-    compression => 1,
     createdb => 1,
     defaults => 1,
     exclude_dump_test_schema => 1,
@@ -3210,31 +2943,6 @@ my %tests = (
         },
     },
 
-    'CREATE MATERIALIZED VIEW matview_compression' => {
-        create_order => 20,
-        create_sql => 'CREATE MATERIALIZED VIEW
-                           dump_test.matview_compression (col2) AS
-                           SELECT col2 FROM dump_test.test_table;
-                       ALTER MATERIALIZED VIEW dump_test.matview_compression
-                           ALTER COLUMN col2 SET COMPRESSION lz4;',
-        regexp => qr/^
-            \QCREATE MATERIALIZED VIEW dump_test.matview_compression AS\E
-            \n\s+\QSELECT col2\E
-            \n\s+\QFROM dump_test.test_table\E
-            \n\s+\QWITH NO DATA;\E
-            .*
-            \QALTER TABLE ONLY dump_test.matview_compression ALTER COLUMN col2 SET COMPRESSION lz4;\E\n
-            /xms,
-        lz4 => 1,
-        like =>
-          { %full_runs, %dump_test_schema_runs, section_pre_data => 1, },
-        unlike => {
-            exclude_dump_test_schema => 1,
-            no_toast_compression => 1,
-            only_dump_measurement => 1,
-        },
-    },
-
     'Check ordering of a matview that depends on a primary key' => {
         create_order => 42,
         create_sql => '
@@ -3691,51 +3399,6 @@ my %tests = (
         },
     },
 
-    'CREATE TABLE test_compression_method' => {
-        create_order => 110,
-        create_sql => 'CREATE TABLE dump_test.test_compression_method (
-                           col1 text
-                       );',
-        regexp => qr/^
-            \QCREATE TABLE dump_test.test_compression_method (\E\n
-            \s+\Qcol1 text\E\n
-            \Q);\E
-            /xm,
-        like => {
-            %full_runs, %dump_test_schema_runs, section_pre_data => 1,
-        },
-        unlike => {
-            exclude_dump_test_schema => 1,
-            only_dump_measurement => 1,
-        },
-    },
-
-    # Insert enough data to surpass DEFAULT_IO_BUFFER_SIZE during
-    # (de)compression operations
-    'COPY test_compression_method' => {
-        create_order => 111,
-        create_sql => 'INSERT INTO dump_test.test_compression_method (col1) '
-          . 'SELECT string_agg(a::text, \'\') FROM generate_series(1,4096) a;',
-        regexp => qr/^
-            \QCOPY dump_test.test_compression_method (col1) FROM stdin;\E
-            \n(?:\d{15277}\n){1}\\\.\n
-            /xm,
-        like => {
-            %full_runs,
-            data_only => 1,
-            no_schema => 1,
-            section_data => 1,
-            only_dump_test_schema => 1,
-            test_schema_plus_large_objects => 1,
-        },
-        unlike => {
-            binary_upgrade => 1,
-            exclude_dump_test_schema => 1,
-            schema_only => 1,
-            schema_only_with_statistics => 1,
-        },
-    },
-
     'CREATE TABLE fk_reference_test_table' => {
         create_order => 21,
         create_sql => 'CREATE TABLE dump_test.fk_reference_test_table (
@@ -3774,30 +3437,6 @@ my %tests = (
         },
     },
 
-    'CREATE TABLE test_compression' => {
-        create_order => 3,
-        create_sql => 'CREATE TABLE dump_test.test_compression (
-                           col1 int,
-                           col2 text COMPRESSION lz4
-                       );',
-        regexp => qr/^
-            \QCREATE TABLE dump_test.test_compression (\E\n
-            \s+\Qcol1 integer,\E\n
-            \s+\Qcol2 text\E\n
-            \);\n
-            .*
-            \QALTER TABLE ONLY dump_test.test_compression ALTER COLUMN col2 SET COMPRESSION lz4;\E\n
-            /xms,
-        lz4 => 1,
-        like =>
-          { %full_runs, %dump_test_schema_runs, section_pre_data => 1, },
-        unlike => {
-            exclude_dump_test_schema => 1,
-            no_toast_compression => 1,
-            only_dump_measurement => 1,
-        },
-    },
-
     'CREATE TABLE measurement PARTITIONED BY' => {
         create_order => 90,
         create_sql => 'CREATE TABLE dump_test.measurement (
@@ -5280,13 +4919,6 @@ foreach my $test (
         next;
     }
 
-    # Skip tests specific to LZ4 if this build does not support
-    # this option.
-    if (!$supports_lz4 && defined($tests{$test}->{lz4}))
-    {
-        next;
-    }
-
     # Normalize command ending: strip all line endings, add
     # semicolon if missing, add two newlines.
     my $create_sql = $tests{$test}->{create_sql};
@@ -5463,42 +5095,9 @@ foreach my $run (sort keys %pgdump_runs)
     my $test_key = $run;
     my $run_db = 'postgres';
 
-    # Skip command-level tests for gzip/lz4/zstd if the tool is not supported
-    if ($pgdump_runs{$run}->{compile_option}
-        && (($pgdump_runs{$run}->{compile_option} eq 'gzip'
-                && !$supports_gzip)
-            || ($pgdump_runs{$run}->{compile_option} eq 'lz4'
-                && !$supports_lz4)
-            || ($pgdump_runs{$run}->{compile_option} eq 'zstd'
-                && !$supports_zstd)))
-    {
-        note
-          "$run: skipped due to no $pgdump_runs{$run}->{compile_option} support";
-        next;
-    }
-
     $node->command_ok(\@{ $pgdump_runs{$run}->{dump_cmd} },
         "$run: pg_dump runs");
 
-    if ($pgdump_runs{$run}->{compress_cmd})
-    {
-        my ($compress_cmd) = $pgdump_runs{$run}->{compress_cmd};
-        my $compress_program = $compress_cmd->{program};
-
-        # Skip the rest of the test if the compression program is
-        # not defined.
-        next if (!defined($compress_program) || $compress_program eq '');
-
-        # Arguments may require globbing.
-        my @full_compress_cmd = ($compress_program);
-        foreach my $arg (@{ $compress_cmd->{args} })
-        {
-            push @full_compress_cmd, glob($arg);
-        }
-
-        command_ok(\@full_compress_cmd, "$run: compression commands");
-    }
-
     if ($pgdump_runs{$run}->{glob_patterns})
     {
         my $glob_patterns = $pgdump_runs{$run}->{glob_patterns};
@@ -5579,13 +5178,6 @@ foreach my $run (sort keys %pgdump_runs)
         next;
     }
 
-    # Skip tests specific to LZ4 if this build does not support
-    # this option.
-    if (!$supports_lz4 && defined($tests{$test}->{lz4}))
-    {
-        next;
-    }
-
     if ($run_db ne $test_db)
     {
         next;
diff --git a/src/bin/pg_dump/t/006_pg_dump_compress.pl b/src/bin/pg_dump/t/006_pg_dump_compress.pl
new file mode 100644
index 00000000000..3737132645b
--- /dev/null
+++ b/src/bin/pg_dump/t/006_pg_dump_compress.pl
@@ -0,0 +1,611 @@
+
+# Copyright (c) 2021-2025, PostgreSQL Global Development Group
+
+###############################################################
+# This test script uses essentially the same structure as
+# 002_pg_dump.pl, but is specialized to deal with compression
+# concerns.  As such, some of the test cases here are large
+# and would contribute undue amounts of runtime if they were
+# included in 002_pg_dump.pl.
+###############################################################
+
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+my $tempdir = PostgreSQL::Test::Utils::tempdir;
+
+###############################################################
+# Definition of the pg_dump runs to make.
+#
+# In addition to the facilities explained in 002_pg_dump.pl,
+# these entries can include:
+#
+# compile_option indicates if the test depends on a compilation
+# option, if any.  This can be used to control if tests should be
+# skipped when a build dependency is not satisfied.
+#
+# compress_cmd is the utility command for (de)compression, if any.
+# Note that this should generally be used on pg_dump's output
+# either to generate a text file to run through the tests, or
+# to test pg_restore's ability to parse manually compressed files
+# that otherwise pg_dump does not compress on its own (e.g. *.toc).
+
+my $supports_gzip = check_pg_config("#define HAVE_LIBZ 1");
+my $supports_lz4 = check_pg_config("#define USE_LZ4 1");
+my $supports_zstd = check_pg_config("#define USE_ZSTD 1");
+
+my %pgdump_runs = (
+    compression_gzip_custom => {
+        test_key => 'compression',
+        compile_option => 'gzip',
+        dump_cmd => [
+            'pg_dump', '--no-sync',
+            '--format' => 'custom',
+            '--compress' => '1',
+            '--file' => "$tempdir/compression_gzip_custom.dump",
+            '--statistics',
+            'postgres',
+        ],
+        restore_cmd => [
+            'pg_restore',
+            '--file' => "$tempdir/compression_gzip_custom.sql",
+            '--statistics',
+            "$tempdir/compression_gzip_custom.dump",
+        ],
+        command_like => {
+            command => [
+                'pg_restore', '--list',
+                "$tempdir/compression_gzip_custom.dump",
+            ],
+            expected => qr/Compression: gzip/,
+            name => 'data content is gzip-compressed'
+        },
+    },
+
+    compression_gzip_dir => {
+        test_key => 'compression',
+        compile_option => 'gzip',
+        dump_cmd => [
+            'pg_dump', '--no-sync',
+            '--jobs' => '2',
+            '--format' => 'directory',
+            '--compress' => 'gzip:1',
+            '--file' => "$tempdir/compression_gzip_dir",
+            '--statistics',
+            'postgres',
+        ],
+        # Give coverage for manually compressed blobs.toc files during
+        # restore.
+        compress_cmd => {
+            program => $ENV{'GZIP_PROGRAM'},
+            args => [ '-f', "$tempdir/compression_gzip_dir/blobs_*.toc", ],
+        },
+        # Verify that only data files were compressed
+        glob_patterns => [
+            "$tempdir/compression_gzip_dir/toc.dat",
+            "$tempdir/compression_gzip_dir/*.dat.gz",
+        ],
+        restore_cmd => [
+            'pg_restore',
+            '--jobs' => '2',
+            '--file' => "$tempdir/compression_gzip_dir.sql",
+            '--statistics',
+            "$tempdir/compression_gzip_dir",
+        ],
+    },
+
+    compression_gzip_plain => {
+        test_key => 'compression',
+        compile_option => 'gzip',
+        dump_cmd => [
+            'pg_dump', '--no-sync',
+            '--format' => 'plain',
+            '--compress' => '1',
+            '--file' => "$tempdir/compression_gzip_plain.sql.gz",
+            '--statistics',
+            'postgres',
+        ],
+        # Decompress the generated file to run through the tests.
+        compress_cmd => {
+            program => $ENV{'GZIP_PROGRAM'},
+            args => [ '-d', "$tempdir/compression_gzip_plain.sql.gz", ],
+        },
+    },
+
+    compression_lz4_custom => {
+        test_key => 'compression',
+        compile_option => 'lz4',
+        dump_cmd => [
+            'pg_dump', '--no-sync',
+            '--format' => 'custom',
+            '--compress' => 'lz4',
+            '--file' => "$tempdir/compression_lz4_custom.dump",
+            '--statistics',
+            'postgres',
+        ],
+        restore_cmd => [
+            'pg_restore',
+            '--file' => "$tempdir/compression_lz4_custom.sql",
+            '--statistics',
+            "$tempdir/compression_lz4_custom.dump",
+        ],
+        command_like => {
+            command => [
+                'pg_restore', '--list',
+                "$tempdir/compression_lz4_custom.dump",
+            ],
+            expected => qr/Compression: lz4/,
+            name => 'data content is lz4 compressed'
+        },
+    },
+
+    compression_lz4_dir => {
+        test_key => 'compression',
+        compile_option => 'lz4',
+        dump_cmd => [
+            'pg_dump', '--no-sync',
+            '--jobs' => '2',
+            '--format' => 'directory',
+            '--compress' => 'lz4:1',
+            '--file' => "$tempdir/compression_lz4_dir",
+            '--statistics',
+            'postgres',
+        ],
+        # Give coverage for manually compressed blobs.toc files during
+        # restore.
+        compress_cmd => {
+            program => $ENV{'LZ4'},
+            args => [
+                '-z', '-f', '-m', '--rm',
+                "$tempdir/compression_lz4_dir/blobs_*.toc",
+            ],
+        },
+        # Verify that data files were compressed
+        glob_patterns => [
+            "$tempdir/compression_lz4_dir/toc.dat",
+            "$tempdir/compression_lz4_dir/*.dat.lz4",
+        ],
+        restore_cmd => [
+            'pg_restore',
+            '--jobs' => '2',
+            '--file' => "$tempdir/compression_lz4_dir.sql",
+            '--statistics',
+            "$tempdir/compression_lz4_dir",
+        ],
+    },
+
+    compression_lz4_plain => {
+        test_key => 'compression',
+        compile_option => 'lz4',
+        dump_cmd => [
+            'pg_dump', '--no-sync',
+            '--format' => 'plain',
+            '--compress' => 'lz4',
+            '--file' => "$tempdir/compression_lz4_plain.sql.lz4",
+            '--statistics',
+            'postgres',
+        ],
+        # Decompress the generated file to run through the tests.
+        compress_cmd => {
+            program => $ENV{'LZ4'},
+            args => [
+                '-d', '-f',
+                "$tempdir/compression_lz4_plain.sql.lz4",
+                "$tempdir/compression_lz4_plain.sql",
+            ],
+        },
+    },
+
+    compression_zstd_custom => {
+        test_key => 'compression',
+        compile_option => 'zstd',
+        dump_cmd => [
+            'pg_dump', '--no-sync',
+            '--format' => 'custom',
+            '--compress' => 'zstd',
+            '--file' => "$tempdir/compression_zstd_custom.dump",
+            '--statistics',
+            'postgres',
+        ],
+        restore_cmd => [
+            'pg_restore',
+            '--file' => "$tempdir/compression_zstd_custom.sql",
+            '--statistics',
+            "$tempdir/compression_zstd_custom.dump",
+        ],
+        command_like => {
+            command => [
+                'pg_restore', '--list',
+                "$tempdir/compression_zstd_custom.dump",
+            ],
+            expected => qr/Compression: zstd/,
+            name => 'data content is zstd compressed'
+        },
+    },
+
+    compression_zstd_dir => {
+        test_key => 'compression',
+        compile_option => 'zstd',
+        dump_cmd => [
+            'pg_dump', '--no-sync',
+            '--jobs' => '2',
+            '--format' => 'directory',
+            '--compress' => 'zstd:1',
+            '--file' => "$tempdir/compression_zstd_dir",
+            '--statistics',
+            'postgres',
+        ],
+        # Give coverage for manually compressed blobs.toc files during
+        # restore.
+        compress_cmd => {
+            program => $ENV{'ZSTD'},
+            args => [
+                '-z', '-f',
+                '--rm', "$tempdir/compression_zstd_dir/blobs_*.toc",
+            ],
+        },
+        # Verify that data files were compressed
+        glob_patterns => [
+            "$tempdir/compression_zstd_dir/toc.dat",
+            "$tempdir/compression_zstd_dir/*.dat.zst",
+        ],
+        restore_cmd => [
+            'pg_restore',
+            '--jobs' => '2',
+            '--file' => "$tempdir/compression_zstd_dir.sql",
+            '--statistics',
+            "$tempdir/compression_zstd_dir",
+        ],
+    },
+
+    # Exercise long mode for test coverage
+    compression_zstd_plain => {
+        test_key => 'compression',
+        compile_option => 'zstd',
+        dump_cmd => [
+            'pg_dump', '--no-sync',
+            '--format' => 'plain',
+            '--compress' => 'zstd:long',
+            '--file' => "$tempdir/compression_zstd_plain.sql.zst",
+            '--statistics',
+            'postgres',
+        ],
+        # Decompress the generated file to run through the tests.
+        compress_cmd => {
+            program => $ENV{'ZSTD'},
+            args => [
+                '-d', '-f',
+                "$tempdir/compression_zstd_plain.sql.zst", "-o",
+                "$tempdir/compression_zstd_plain.sql",
+            ],
+        },
+    },);
+
+###############################################################
+# Definition of the tests to run.
+#
+# In addition to the facilities explained in 002_pg_dump.pl,
+# these entries can include:
+#
+# compile_option indicates if the test depends on a compilation
+# option, if any.  This can be used to control if tests should be
+# skipped when a build dependency is not satisfied.
+
+# Tests which are considered 'full' dumps by pg_dump, but there
+# are flags used to exclude specific items (ACLs, LOs, etc).
+my %full_runs = (compression => 1,);
+
+# This is where the actual tests are defined.
+my %tests = (
+    'CREATE MATERIALIZED VIEW matview_compression_lz4' => {
+        create_order => 20,
+        create_sql => 'CREATE MATERIALIZED VIEW
+                           matview_compression_lz4 (col2) AS
+                           SELECT repeat(\'xyzzy\', 10000);
+                       ALTER MATERIALIZED VIEW matview_compression_lz4
+                           ALTER COLUMN col2 SET COMPRESSION lz4;',
+        regexp => qr/^
+            \QCREATE MATERIALIZED VIEW public.matview_compression_lz4 AS\E
+            \n\s+\QSELECT repeat('xyzzy'::text, 10000) AS col2\E
+            \n\s+\QWITH NO DATA;\E
+            .*
+            \QALTER TABLE ONLY public.matview_compression_lz4 ALTER COLUMN col2 SET COMPRESSION lz4;\E\n
+            /xms,
+        compile_option => 'lz4',
+        like => {%full_runs},
+    },
+
+    'CREATE TABLE test_compression_method' => {
+        create_order => 110,
+        create_sql => 'CREATE TABLE test_compression_method (
+                           col1 text
+                       );',
+        regexp => qr/^
+            \QCREATE TABLE public.test_compression_method (\E\n
+            \s+\Qcol1 text\E\n
+            \Q);\E
+            /xm,
+        like => { %full_runs, },
+    },
+
+    # Insert enough data to surpass DEFAULT_IO_BUFFER_SIZE during
+    # (de)compression operations
+    'COPY test_compression_method' => {
+        create_order => 111,
+        create_sql => 'INSERT INTO test_compression_method (col1) '
+          . 'SELECT string_agg(a::text, \'\') FROM generate_series(1,4096) a;',
+        regexp => qr/^
+            \QCOPY public.test_compression_method (col1) FROM stdin;\E
+            \n(?:\d{15277}\n){1}\\\.\n
+            /xm,
+        like => { %full_runs, },
+    },
+
+    'CREATE TABLE test_compression' => {
+        create_order => 3,
+        create_sql => 'CREATE TABLE test_compression (
+                           col1 int,
+                           col2 text COMPRESSION lz4
+                       );',
+        regexp => qr/^
+            \QCREATE TABLE public.test_compression (\E\n
+            \s+\Qcol1 integer,\E\n
+            \s+\Qcol2 text\E\n
+            \);\n
+            .*
+            \QALTER TABLE ONLY public.test_compression ALTER COLUMN col2 SET COMPRESSION lz4;\E\n
+            /xms,
+        compile_option => 'lz4',
+        like => {%full_runs},
+    },
+
+    # Create a large object so we can test compression of blobs.toc
+    'LO create (using lo_from_bytea)' => {
+        create_order => 50,
+        create_sql =>
+          'SELECT pg_catalog.lo_from_bytea(0, \'\\x310a320a330a340a350a360a370a380a390a\');',
+        regexp => qr/^SELECT pg_catalog\.lo_create\('\d+'\);/m,
+        like => { %full_runs, },
+    },
+
+    'LO load (using lo_from_bytea)' => {
+        regexp => qr/^
+            \QSELECT pg_catalog.lo_open\E \('\d+',\ \d+\);\n
+            \QSELECT pg_catalog.lowrite(0, \E
+            \Q'\x310a320a330a340a350a360a370a380a390a');\E\n
+            \QSELECT pg_catalog.lo_close(0);\E
+            /xm,
+        like => { %full_runs, },
+    },);
+
+#########################################
+# Create a PG instance to test actually dumping from
+
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init;
+$node->start;
+
+my $port = $node->port;
+
+#########################################
+# Set up schemas, tables, etc, to be dumped.
+
+# Build up the create statements
+my %create_sql = ();
+
+foreach my $test (
+    sort {
+        if ($tests{$a}->{create_order} and $tests{$b}->{create_order})
+        {
+            $tests{$a}->{create_order} <=> $tests{$b}->{create_order};
+        }
+        elsif ($tests{$a}->{create_order})
+        {
+            -1;
+        }
+        elsif ($tests{$b}->{create_order})
+        {
+            1;
+        }
+        else
+        {
+            0;
+        }
+    } keys %tests)
+{
+    my $test_db = 'postgres';
+
+    if (defined($tests{$test}->{database}))
+    {
+        $test_db = $tests{$test}->{database};
+    }
+
+    # Skip tests that require an unsupported compile option
+    if ($tests{$test}->{compile_option}
+        && (($tests{$test}->{compile_option} eq 'gzip' && !$supports_gzip)
+            || ($tests{$test}->{compile_option} eq 'lz4'
+                && !$supports_lz4)
+            || ($tests{$test}->{compile_option} eq 'zstd'
+                && !$supports_zstd)))
+    {
+        next;
+    }
+
+    if ($tests{$test}->{create_sql})
+    {
+        # Normalize command ending: strip all line endings, add
+        # semicolon if missing, add two newlines.
+        my $create_sql = $tests{$test}->{create_sql};
+        chomp $create_sql;
+        $create_sql .= ';' unless substr($create_sql, -1) eq ';';
+        $create_sql{$test_db} .= $create_sql . "\n\n";
+    }
+}
+
+# Send the combined set of commands to psql
+foreach my $db (sort keys %create_sql)
+{
+    $node->safe_psql($db, $create_sql{$db});
+}
+
+#########################################
+# Run all runs
+
+foreach my $run (sort keys %pgdump_runs)
+{
+    my $test_key = $run;
+    my $run_db = 'postgres';
+
+    # Skip runs that require an unsupported compile option
+    if ($pgdump_runs{$run}->{compile_option}
+        && (($pgdump_runs{$run}->{compile_option} eq 'gzip'
+                && !$supports_gzip)
+            || ($pgdump_runs{$run}->{compile_option} eq 'lz4'
+                && !$supports_lz4)
+            || ($pgdump_runs{$run}->{compile_option} eq 'zstd'
+                && !$supports_zstd)))
+    {
+        note
+          "$run: skipped due to no $pgdump_runs{$run}->{compile_option} support";
+        next;
+    }
+
+    $node->command_ok(\@{ $pgdump_runs{$run}->{dump_cmd} },
+        "$run: pg_dump runs");
+
+    if ($pgdump_runs{$run}->{compress_cmd})
+    {
+        my ($compress_cmd) = $pgdump_runs{$run}->{compress_cmd};
+        my $compress_program = $compress_cmd->{program};
+
+        # Skip the rest of the test if the compression program is
+        # not defined.
+        next if (!defined($compress_program) || $compress_program eq '');
+
+        # Arguments may require globbing.
+        my @full_compress_cmd = ($compress_program);
+        foreach my $arg (@{ $compress_cmd->{args} })
+        {
+            push @full_compress_cmd, glob($arg);
+        }
+
+        command_ok(\@full_compress_cmd, "$run: compression commands");
+    }
+
+    if ($pgdump_runs{$run}->{glob_patterns})
+    {
+        my $glob_patterns = $pgdump_runs{$run}->{glob_patterns};
+        foreach my $glob_pattern (@{$glob_patterns})
+        {
+            my @glob_output = glob($glob_pattern);
+            is(scalar(@glob_output) > 0,
+                1, "$run: glob check for $glob_pattern");
+        }
+    }
+
+    if ($pgdump_runs{$run}->{command_like})
+    {
+        my $cmd_like = $pgdump_runs{$run}->{command_like};
+        $node->command_like(
+            \@{ $cmd_like->{command} },
+            $cmd_like->{expected},
+            "$run: " . $cmd_like->{name});
+    }
+
+    if ($pgdump_runs{$run}->{restore_cmd})
+    {
+        $node->command_ok(\@{ $pgdump_runs{$run}->{restore_cmd} },
+            "$run: pg_restore runs");
+    }
+
+    if ($pgdump_runs{$run}->{test_key})
+    {
+        $test_key = $pgdump_runs{$run}->{test_key};
+    }
+
+    my $output_file = slurp_file("$tempdir/${run}.sql");
+
+    #########################################
+    # Run all tests where this run is included
+    # as either a 'like' or 'unlike' test.
+
+    foreach my $test (sort keys %tests)
+    {
+        my $test_db = 'postgres';
+
+        if (defined($pgdump_runs{$run}->{database}))
+        {
+            $run_db = $pgdump_runs{$run}->{database};
+        }
+
+        if (defined($tests{$test}->{database}))
+        {
+            $test_db = $tests{$test}->{database};
+        }
+
+        # Check for proper test definitions
+        #
+        # Either "all_runs" should be set or there should be a "like" list,
+        # even if it is empty.  (This makes the test more self-documenting.)
+        if (!defined($tests{$test}->{all_runs})
+            && !defined($tests{$test}->{like}))
+        {
+            die "missing \"like\" in test \"$test\"";
+        }
+        # Check for useless entries in "unlike" list.  Runs that are
+        # not listed in "like" don't need to be excluded in "unlike".
+        if ($tests{$test}->{unlike}->{$test_key}
+            && !defined($tests{$test}->{like}->{$test_key}))
+        {
+            die "useless \"unlike\" entry \"$test_key\" in test \"$test\"";
+        }
+
+        # Skip tests that require an unsupported compile option
+        if ($tests{$test}->{compile_option}
+            && (($tests{$test}->{compile_option} eq 'gzip' && !$supports_gzip)
+                || ($tests{$test}->{compile_option} eq 'lz4'
+                    && !$supports_lz4)
+                || ($tests{$test}->{compile_option} eq 'zstd'
+                    && !$supports_zstd)))
+        {
+            next;
+        }
+
+        if ($run_db ne $test_db)
+        {
+            next;
+        }
+
+        # Run the test if all_runs is set or if listed as a like, unless it is
+        # specifically noted as an unlike (generally due to an explicit
+        # exclusion or similar).
+        if (($tests{$test}->{like}->{$test_key} || $tests{$test}->{all_runs})
+            && !defined($tests{$test}->{unlike}->{$test_key}))
+        {
+            if (!ok($output_file =~ $tests{$test}->{regexp},
+                    "$run: should dump $test"))
+            {
+                diag("Review $run results in $tempdir");
+            }
+        }
+        else
+        {
+            if (!ok($output_file !~ $tests{$test}->{regexp},
+                    "$run: should not dump $test"))
+            {
+                diag("Review $run results in $tempdir");
+            }
+        }
+    }
+}
+
+#########################################
+# Stop the database instance, which will be removed at the end of the tests.
+
+$node->stop('fast');
+
+done_testing();
-- 
2.43.7
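(If you want to time the effect of the split locally, the new file can
be run on its own; assuming a configured source tree, something like

    make -C src/bin/pg_dump check PROVE_TESTS='t/006_pg_dump_compress.pl'

should exercise just the compression scenarios, per the usual
PROVE_TESTS convention.)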
From 42b2728fece03fc4cc5a14fa7d9081e04ec54e18 Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Wed, 15 Oct 2025 15:12:22 -0400
Subject: [PATCH v5 3/3] Add more TAP test coverage for pg_dump.

Add a test case to cover pg_dump with --compress=none.  This brings
the coverage of compress_none.c up from about 64% to 90%, in
particular covering the new code added in a previous patch.

Include compression of toc.dat in manually-compressed test cases.
We would have found the bug fixed in commit a239c4a0c much sooner
if we'd done this.  As far as I can tell, this doesn't reduce test
coverage at all, since there are other tests of directory format
that still use an uncompressed toc.dat.

Widen the wide row used to verify correct (de)compression.
Commit 1a05c1d25 advises us (not without reason) to ensure that this
test case fully fills DEFAULT_IO_BUFFER_SIZE, so that loops within
the compression logic will iterate completely.  To follow that advice
with the proposed DEFAULT_IO_BUFFER_SIZE of 128K, we need something
close to this.  This does indeed increase the reported code coverage
by a few lines.

While here, fix a glitch that I noticed in testing: the
$glob_patterns tests were incapable of failing, because glob() will
return 'foo' as 'foo' whether there is a matching file or not.
(Indeed, the stanza just above that one relies on that.)

In my testing, this patch adds approximately as much runtime as was
saved by the previous patch, so that it's about a wash compared to
the old code.  However, we get better test coverage.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/3515357.1760128017@sss.pgh.pa.us
---
 src/bin/pg_dump/t/002_pg_dump.pl          |  8 ++-
 src/bin/pg_dump/t/006_pg_dump_compress.pl | 66 ++++++++++++++++-------
 2 files changed, 52 insertions(+), 22 deletions(-)

diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index b789cd2e863..0e94d8151b7 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -5104,8 +5104,12 @@ foreach my $run (sort keys %pgdump_runs)
         foreach my $glob_pattern (@{$glob_patterns})
         {
             my @glob_output = glob($glob_pattern);
-            is(scalar(@glob_output) > 0,
-                1, "$run: glob check for $glob_pattern");
+            my $ok = 0;
+            # certainly found some files if glob() returned multiple matches
+            $ok = 1 if (scalar(@glob_output) > 1);
+            # if just one match, we need to check if it's real
+            $ok = 1 if (scalar(@glob_output) == 1 && -f $glob_output[0]);
+            is($ok, 1, "$run: glob check for $glob_pattern");
         }
     }
 
diff --git a/src/bin/pg_dump/t/006_pg_dump_compress.pl b/src/bin/pg_dump/t/006_pg_dump_compress.pl
index 3737132645b..b429df19a4d 100644
--- a/src/bin/pg_dump/t/006_pg_dump_compress.pl
+++ b/src/bin/pg_dump/t/006_pg_dump_compress.pl
@@ -39,6 +39,24 @@ my $supports_lz4 = check_pg_config("#define USE_LZ4 1");
 my $supports_zstd = check_pg_config("#define USE_ZSTD 1");
 
 my %pgdump_runs = (
+    compression_none_custom => {
+        test_key => 'compression',
+        dump_cmd => [
+            'pg_dump', '--no-sync',
+            '--format' => 'custom',
+            '--compress' => 'none',
+            '--file' => "$tempdir/compression_none_custom.dump",
+            '--statistics',
+            'postgres',
+        ],
+        restore_cmd => [
+            'pg_restore',
+            '--file' => "$tempdir/compression_none_custom.sql",
+            '--statistics',
+            "$tempdir/compression_none_custom.dump",
+        ],
+    },
+
     compression_gzip_custom => {
         test_key => 'compression',
         compile_option => 'gzip',
@@ -78,15 +96,18 @@ my %pgdump_runs = (
             '--statistics',
             'postgres',
         ],
-        # Give coverage for manually compressed blobs.toc files during
-        # restore.
+        # Give coverage for manually-compressed TOC files during restore.
         compress_cmd => {
             program => $ENV{'GZIP_PROGRAM'},
-            args => [ '-f', "$tempdir/compression_gzip_dir/blobs_*.toc", ],
+            args => [
+                '-f',
+                "$tempdir/compression_gzip_dir/toc.dat",
+                "$tempdir/compression_gzip_dir/blobs_*.toc",
+            ],
         },
-        # Verify that only data files were compressed
+        # Verify that TOC and data files were compressed
         glob_patterns => [
-            "$tempdir/compression_gzip_dir/toc.dat",
+            "$tempdir/compression_gzip_dir/toc.dat.gz",
             "$tempdir/compression_gzip_dir/*.dat.gz",
         ],
         restore_cmd => [
@@ -155,18 +176,18 @@ my %pgdump_runs = (
             '--statistics',
             'postgres',
         ],
-        # Give coverage for manually compressed blobs.toc files during
-        # restore.
+        # Give coverage for manually-compressed TOC files during restore.
         compress_cmd => {
             program => $ENV{'LZ4'},
             args => [
                 '-z', '-f', '-m', '--rm',
+                "$tempdir/compression_lz4_dir/toc.dat",
                 "$tempdir/compression_lz4_dir/blobs_*.toc",
             ],
         },
-        # Verify that data files were compressed
+        # Verify that TOC and data files were compressed
         glob_patterns => [
-            "$tempdir/compression_lz4_dir/toc.dat",
+            "$tempdir/compression_lz4_dir/toc.dat.lz4",
             "$tempdir/compression_lz4_dir/*.dat.lz4",
         ],
         restore_cmd => [
@@ -239,18 +260,18 @@ my %pgdump_runs = (
             '--statistics',
             'postgres',
         ],
-        # Give coverage for manually compressed blobs.toc files during
-        # restore.
+        # Give coverage for manually-compressed TOC files during restore.
         compress_cmd => {
             program => $ENV{'ZSTD'},
             args => [
-                '-z', '-f',
-                '--rm', "$tempdir/compression_zstd_dir/blobs_*.toc",
+                '-z', '-f', '--rm',
+                "$tempdir/compression_zstd_dir/toc.dat",
+                "$tempdir/compression_zstd_dir/blobs_*.toc",
             ],
         },
-        # Verify that data files were compressed
+        # Verify that TOC and data files were compressed
         glob_patterns => [
-            "$tempdir/compression_zstd_dir/toc.dat",
+            "$tempdir/compression_zstd_dir/toc.dat.zst",
             "$tempdir/compression_zstd_dir/*.dat.zst",
         ],
         restore_cmd => [
@@ -333,14 +354,15 @@ my %tests = (
     },
 
     # Insert enough data to surpass DEFAULT_IO_BUFFER_SIZE during
-    # (de)compression operations
+    # (de)compression operations.  The weird regex is because Perl
+    # restricts us to repeat counts of less than 32K.
     'COPY test_compression_method' => {
         create_order => 111,
         create_sql => 'INSERT INTO test_compression_method (col1) '
-          . 'SELECT string_agg(a::text, \'\') FROM generate_series(1,4096) a;',
+          . 'SELECT string_agg(a::text, \'\') FROM generate_series(1,65536) a;',
         regexp => qr/^
             \QCOPY public.test_compression_method (col1) FROM stdin;\E
-            \n(?:\d{15277}\n){1}\\\.\n
+            \n(?:(?:\d\d\d\d\d\d\d\d\d\d){31657}\d\d\d\d\n){1}\\\.\n
             /xm,
         like => { %full_runs, },
     },
@@ -502,8 +524,12 @@ foreach my $run (sort keys %pgdump_runs)
         foreach my $glob_pattern (@{$glob_patterns})
         {
             my @glob_output = glob($glob_pattern);
-            is(scalar(@glob_output) > 0,
-                1, "$run: glob check for $glob_pattern");
+            my $ok = 0;
+            # certainly found some files if glob() returned multiple matches
+            $ok = 1 if (scalar(@glob_output) > 1);
+            # if just one match, we need to check if it's real
+            $ok = 1 if (scalar(@glob_output) == 1 && -f $glob_output[0]);
+            is($ok, 1, "$run: glob check for $glob_pattern");
         }
     }
 
-- 
2.43.7
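To see the glob() pitfall this last hunk guards against: when a
pattern contains no wildcard characters, Perl's glob() hands the
pattern back whether or not a matching file exists, so a bare
match-count test can never fail.  A minimal standalone demonstration
(not part of the patchset):

    #!/usr/bin/perl
    use strict;
    use warnings;

    # With no metacharacters, glob() echoes the pattern back even if
    # no such file exists, so the count alone proves nothing.
    my @hits = glob("/no/such/dir/toc.dat");
    printf "count=%d first=%s\n", scalar(@hits), $hits[0] // '(none)';
    # prints: count=1 first=/no/such/dir/toc.dat

    # A robust check stats a lone result, as the patched test now does:
    my $ok = @hits > 1 || (@hits == 1 && -f $hits[0]);
    print $ok ? "found\n" : "not found\n";    # prints: not found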