Re: [PING] [PATCH v2] parallel pg_restore: avoid disk seeks when jumping short distance forward - Mailing list pgsql-hackers

From Dimitrios Apostolou
Subject Re: [PING] [PATCH v2] parallel pg_restore: avoid disk seeks when jumping short distance forward
Msg-id 9qn05q5r-s53q-8o3s-0313-p1s94r64oq9r@tzk.arg
In response to Re: [PING] [PATCH v2] parallel pg_restore: avoid disk seeks when jumping short distance forward  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Wednesday 2025-10-15 21:21, Tom Lane wrote:

>> 0004 increases the row width in the existing test case that says
>> it's trying to push more than DEFAULT_IO_BUFFER_SIZE through
>> the compressors.  While I agree with the premise, this solution
>> is hugely expensive: it adds about 12% to the already-long runtime
>> of 002_pg_dump.pl.  I'd like to find a better way, but ran out of
>> energy for today.  (I think the reason this costs so much is that
>> it's effectively iterated hundreds of times because of
>> 002_pg_dump.pl's more or less cross-product approach to testing
>> everything.  Maybe we should pull it out of that structure?)
>
> The attached patchset accomplishes that by splitting 002_pg_dump.pl
> into two scripts, one that is just concerned with the compression
> test cases and one that does everything else.  This might not be
> the prettiest solution, since it duplicates a lot of perl code.
> I thought about refactoring 002_pg_dump.pl so that it could handle
> two separate sets of runs-plus-tests, but decided it was overly
> complicated already.
>
> Anyway, 0001 attached is the same as in v4, 0002 performs the
> test split without intending to change coverage, and then 0003
> adds the new test cases I wanted.  For me, this ends up with
> just about the same runtime as before, or maybe a smidge less.
> I'd hoped for possibly more savings than that, but I'm content
> with it being a wash.
>
> I think this is more or less committable, and then we could get
> back to the original question of whether it's worth tweaking
> pg_restore's seek-vs-scan behavior.


Hi Tom, since you are dealing with pg_restore testing, you might want to
have a look at the 2nd patch from here:

https://www.postgresql.org/message-id/413c1cd8-1d6d-90ba-ac7b-b226a4dad5ed%40gmx.net

Direct link to the patch is:

https://www.postgresql.org/message-id/attachment/177661/v3-0002-Add-new-test-file-with-pg_restore-test-cases.patch


It's a much shorter test, focused on pg_restore.

1. It generates two custom-format dumps (with-TOC and TOC-less).

2. Restores each dump to an empty database using pg_restore with
    a couple of switch combinations
    (one combination, --clean --data-only, will not work without a patch
     of mine, so we might want to remove that and enrich with others).

3. Tests pg_restore over a pre-existing database.

4. Tests pg_restore reading the dump file from stdin.
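
To make the shape of step 2 concrete, here is a rough sketch (not the code
from the patch, which is a Perl TAP test) of how the restore runs could be
enumerated as one pg_restore command line per dump and switch combination.
The file names, the target database name and the switch sets below are
made up for illustration; only the pg_restore options themselves
(--dbname, --clean, --if-exists, --data-only) are real. The sketch only
builds the command lines and does not run anything.

```python
def restore_matrix(dumps, switch_sets, target_db):
    """Build one pg_restore argv per (dump file, switch combination) pair."""
    return [
        ["pg_restore", f"--dbname={target_db}", *switches, dump]
        for dump in dumps
        for switches in switch_sets
    ]


# Hypothetical names, chosen only for this example.
DUMPS = ["with_toc.dump", "toc_less.dump"]
# Illustrative switch sets, not the list used in the patch.
SWITCH_SETS = [[], ["--clean", "--if-exists"], ["--data-only"]]

if __name__ == "__main__":
    for argv in restore_matrix(DUMPS, SWITCH_SETS, "restore_target"):
        print(" ".join(argv))
```

For the stdin case in step 4, the file argument would simply be left out
and the dump fed to pg_restore on standard input, since pg_restore reads
from standard input when no file name is given.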


Regards,
Dimitris



