Re: optimize file transfer in pg_upgrade - Mailing list pgsql-hackers

From: Nathan Bossart
Subject: Re: optimize file transfer in pg_upgrade
Msg-id: Z8Ihz3by-r5t1m30@nathan
In response to: Re: optimize file transfer in pg_upgrade (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers

On Fri, Feb 28, 2025 at 03:37:49PM -0500, Robert Haas wrote:
> On Fri, Feb 28, 2025 at 3:01 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
>> That's exactly where I landed (see v3-0002).  I haven't measured whether
>> transferring relfilenodes or dumping the sequence data is faster for the
>> existing modes, but for now I've left those alone, i.e., they still dump
>> sequence data.  The new "swap" mode just uses the old cluster's sequence
>> files, and I've disallowed using swap mode for upgrades from <v10 to avoid
>> the sequence tuple format change (along with other incompatible changes).
> 
> Ah. Perhaps I should have read the thread more carefully before
> commenting. Sounds good, at any rate.

On the contrary, I'm glad you independently came to the same conclusion.

>> I'll admit I'm a bit concerned that this will cause problems if and when
>> someone wants to change the sequence tuple format again.  But that hasn't
>> happened for a while, AFAIK nobody's planning to change it, and even if it
>> does happen, we just need to have my proposed new mode transfer the
>> sequence files like it transfers the catalog files.  That will make this
>> mode slower, especially if you have a ton of sequences, but maybe it'll
>> still be a win in most cases.  Of course, we probably will need to have
>> pg_upgrade handle other kinds of format changes, too, but IMHO it's still
>> worth trying to speed up pg_upgrade despite the potential future
>> complexities.
> 
> I think it's fine. If somebody comes along and says "hey, when v23
> came out Nathan's feature only sped up pg_upgrade by 2x instead of 3x
> like it did for v22, so Nathan is a bad person," I think we can fairly
> reply "thanks for sharing your opinion, feel free not to use the
> feature and run at 1x speed". There's no rule saying that every
> optimization must always produce the maximum possible benefit in every
> scenario. We're just concerned about regressions, and "only delivers
> some of the speedup if the sequence format has changed on disk" is not
> a regression.

Cool.  I appreciate the design feedback.

-- 
nathan


