Thread: Re: optimize file transfer in pg_upgrade

Re: optimize file transfer in pg_upgrade

From
Robert Haas
Date:
On Wed, Nov 6, 2024 at 5:07 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
> these user relation files will have the same name.  Therefore, it can be
> much faster to instead move the entire data directory from the old cluster
> to the new cluster and to then swap the catalog relation files.

This is a cool idea.

> Another interesting problem is that pg_upgrade currently doesn't transfer
> the sequence data files.  Since v10, we've restored these via pg_restore.
> I believe this was originally done for the introduction of the pg_sequence
> catalog, which changed the format of sequence tuples.  In the new
> catalog-swap mode I am proposing, this means we need to transfer all the
> pg_restore-generated sequence data files.  If there are many sequences, it
> can be difficult to determine which transfer mode and synchronization
> method will be faster.  Since sequence tuple modifications are very rare, I
> think the new catalog-swap mode should just use the sequence data files
> from the old cluster whenever possible.

Maybe we should rethink the decision not to transfer relfilenodes for
sequences. Or have more than one way to do it. pg_upgrade
--binary-upgrade --binary-upgrade-even-for-sequences, or whatever.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: optimize file transfer in pg_upgrade

From
Robert Haas
Date:
On Fri, Feb 28, 2025 at 2:40 PM Robert Haas <robertmhaas@gmail.com> wrote:
> Maybe we should rethink the decision not to transfer relfilenodes for
> sequences. Or have more than one way to do it. pg_upgrade
> --binary-upgrade --binary-upgrade-even-for-sequences, or whatever.

Sorry, I meant: pg_dump --binary-upgrade --binary-upgrade-even-for-sequences

i.e. pg_upgrade could decide which way to ask pg_dump to do it,
depending on versions and flags.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: optimize file transfer in pg_upgrade

From
Nathan Bossart
Date:
On Fri, Feb 28, 2025 at 02:41:22PM -0500, Robert Haas wrote:
> On Fri, Feb 28, 2025 at 2:40 PM Robert Haas <robertmhaas@gmail.com> wrote:
>> Maybe we should rethink the decision not to transfer relfilenodes for
>> sequences. Or have more than one way to do it. pg_upgrade
>> --binary-upgrade --binary-upgrade-even-for-sequences, or whatever.
> 
> Sorry, I meant: pg_dump --binary-upgrade --binary-upgrade-even-for-sequences
> 
> i.e. pg_upgrade could decide which way to ask pg_dump to do it,
> depending on versions and flags.

That's exactly where I landed (see v3-0002).  I haven't measured whether
transferring relfilenodes or dumping the sequence data is faster for the
existing modes, but for now I've left those alone, i.e., they still dump
sequence data.  The new "swap" mode just uses the old cluster's sequence
files, and I've disallowed using swap mode for upgrades from <v10 to avoid
the sequence tuple format change (along with other incompatible changes).

I'll admit I'm a bit concerned that this will cause problems if and when
someone wants to change the sequence tuple format again.  But that hasn't
happened for a while, AFAIK nobody's planning to change it, and even if it
does happen, we just need to have my proposed new mode transfer the
sequence files like it transfers the catalog files.  That will make this
mode slower, especially if you have a ton of sequences, but maybe it'll
still be a win in most cases.  Of course, we probably will need to have
pg_upgrade handle other kinds of format changes, too, but IMHO it's still
worth trying to speed up pg_upgrade despite the potential future
complexities.

-- 
nathan



Re: optimize file transfer in pg_upgrade

From
Robert Haas
Date:
On Fri, Feb 28, 2025 at 3:01 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
> That's exactly where I landed (see v3-0002).  I haven't measured whether
> transferring relfilenodes or dumping the sequence data is faster for the
> existing modes, but for now I've left those alone, i.e., they still dump
> sequence data.  The new "swap" mode just uses the old cluster's sequence
> files, and I've disallowed using swap mode for upgrades from <v10 to avoid
> the sequence tuple format change (along with other incompatible changes).

Ah. Perhaps I should have read the thread more carefully before
commenting. Sounds good, at any rate.

> I'll admit I'm a bit concerned that this will cause problems if and when
> someone wants to change the sequence tuple format again.  But that hasn't
> happened for a while, AFAIK nobody's planning to change it, and even if it
> does happen, we just need to have my proposed new mode transfer the
> sequence files like it transfers the catalog files.  That will make this
> mode slower, especially if you have a ton of sequences, but maybe it'll
> still be a win in most cases.  Of course, we probably will need to have
> pg_upgrade handle other kinds of format changes, too, but IMHO it's still
> worth trying to speed up pg_upgrade despite the potential future
> complexities.

I think it's fine. If somebody comes along and says "hey, when v23
came out Nathan's feature only sped up pg_upgrade by 2x instead of 3x
like it did for v22, so Nathan is a bad person," I think we can fairly
reply "thanks for sharing your opinion, feel free not to use the
feature and run at 1x speed". There's no rule saying that every
optimization must always produce the maximum possible benefit in every
scenario. We're just concerned about regressions, and "only delivers
some of the speedup if the sequence format has changed on disk" is not
a regression.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: optimize file transfer in pg_upgrade

From
Nathan Bossart
Date:
On Fri, Feb 28, 2025 at 03:37:49PM -0500, Robert Haas wrote:
> On Fri, Feb 28, 2025 at 3:01 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
>> That's exactly where I landed (see v3-0002).  I haven't measured whether
>> transferring relfilenodes or dumping the sequence data is faster for the
>> existing modes, but for now I've left those alone, i.e., they still dump
>> sequence data.  The new "swap" mode just uses the old cluster's sequence
>> files, and I've disallowed using swap mode for upgrades from <v10 to avoid
>> the sequence tuple format change (along with other incompatible changes).
> 
> Ah. Perhaps I should have read the thread more carefully before
> commenting. Sounds good, at any rate.

On the contrary, I'm glad you independently came to the same conclusion.

>> I'll admit I'm a bit concerned that this will cause problems if and when
>> someone wants to change the sequence tuple format again.  But that hasn't
>> happened for a while, AFAIK nobody's planning to change it, and even if it
>> does happen, we just need to have my proposed new mode transfer the
>> sequence files like it transfers the catalog files.  That will make this
>> mode slower, especially if you have a ton of sequences, but maybe it'll
>> still be a win in most cases.  Of course, we probably will need to have
>> pg_upgrade handle other kinds of format changes, too, but IMHO it's still
>> worth trying to speed up pg_upgrade despite the potential future
>> complexities.
> 
> I think it's fine. If somebody comes along and says "hey, when v23
> came out Nathan's feature only sped up pg_upgrade by 2x instead of 3x
> like it did for v22, so Nathan is a bad person," I think we can fairly
> reply "thanks for sharing your opinion, feel free not to use the
> feature and run at 1x speed". There's no rule saying that every
> optimization must always produce the maximum possible benefit in every
> scenario. We're just concerned about regressions, and "only delivers
> some of the speedup if the sequence format has changed on disk" is not
> a regression.

Cool.  I appreciate the design feedback.

-- 
nathan



Re: optimize file transfer in pg_upgrade

From
Nathan Bossart
Date:
On Fri, Feb 28, 2025 at 02:51:27PM -0600, Nathan Bossart wrote:
> Cool.  I appreciate the design feedback.

One other design point I wanted to bring up is whether we should bother
generating a rollback script for the new "swap" mode.  In short, I'm
wondering if it would be unreasonable to say that, just for this mode, once
pg_upgrade enters the file transfer step, reverting to the old cluster
requires restoring a backup.  I believe that's worth considering for the
following reasons:

* Anecdotally, I'm not sure I've ever actually seen pg_upgrade fail during
  or after file transfer, and I'm hoping to get some real data about that
  in the near future.  Has anyone else dealt with such a failure?  I
  suspect that failures during file transfer are typically due to OS
  crashes, power losses, etc., and hopefully those are rare.

* I've spent quite some time trying to generate a portable script, but it's
  quite complicated and difficult to reason about its correctness.  And I
  haven't even started on the Windows version.  Leaving this part out would
  simplify the patch set quite a bit.

* If we give up the idea of reverting to the old cluster, we also can avoid
  a bunch of intermediate fsync() calls which I only included to help
  reason about the state of the files in case you failed halfway through.
  This might not add up to much, but it's at least another area of
  simplification.

Of course, rollback would still be possible, but you'd really need to
understand what "swap" mode does behind the scenes to do so safely.  In any
case, I'm growing skeptical that a probably-not-super-well-tested script
that extremely few people will need and fewer will use is worth the
complexity.

-- 
nathan



Re: optimize file transfer in pg_upgrade

From
Robert Haas
Date:
On Wed, Mar 5, 2025 at 2:42 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
> Of course, rollback would still be possible, but you'd really need to
> understand what "swap" mode does behind the scenes to do so safely.  In any
> case, I'm growing skeptical that a probably-not-super-well-tested script
> that extremely few people will need and fewer will use is worth the
> complexity.

I don't have a super-strong view on what the right thing to do is
here, but I'm definitely in favor of not doing something half-baked.
If you think the revert script is going to suck, then let's not have
it at all and just be clear about what the requirements for using this
mode are.

One serious danger that you didn't mention here is that, if pg_upgrade
does fail, you may well try several times. And if you forget the
revert script at some point, or run it more than once, or run the
wrong version, you will be in trouble. I feel like this is something
someone could very easily mess up even if in theory it works
perfectly. Upgrades are often stressful times.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: optimize file transfer in pg_upgrade

From
Greg Sabino Mullane
Date:
On Wed, Mar 5, 2025 at 2:43 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
One other design point I wanted to bring up is whether we should bother generating a rollback script for the new "swap" mode.  In short, I'm wondering if it would be unreasonable to say that, just for this mode, once pg_upgrade enters the file transfer step, reverting to the old cluster requires restoring a backup.

I think that's a fair requirement. And like Robert, revert scripts make me nervous.

* Anecdotally, I'm not sure I've ever actually seen pg_upgrade fail during or after file transfer, and I'm hoping to get some real data about that in the near future.  Has anyone else dealt with such a failure?

I've seen various failures, but they always get caught quite early. Certainly early enough to easily abort, fix perms/mounts/etc., then retry. I think your instinct is correct that this reversion is more trouble than its worth. I don't think the pg_upgrade docs mention taking a backup, but that's always step 0 in my playbook, and that's the rollback plan in the unlikely event of failure.

Cheers,
Greg

--
Enterprise Postgres Software Products & Tech Support

Re: optimize file transfer in pg_upgrade

From
Nathan Bossart
Date:
On Wed, Mar 05, 2025 at 03:40:52PM -0500, Greg Sabino Mullane wrote:
> I've seen various failures, but they always get caught quite early.
> Certainly early enough to easily abort, fix perms/mounts/etc., then retry.
> I think your instinct is correct that this reversion is more trouble than
> its worth. I don't think the pg_upgrade docs mention taking a backup, but
> that's always step 0 in my playbook, and that's the rollback plan in the
> unlikely event of failure.

Thank you, Greg and Robert, for sharing your thoughts.  With that, here's
what I'm considering to be a reasonably complete patch set for this
feature.  This leaves about a month for rigorous testing and editing, so
I'm hopeful it'll be ready v18.

-- 
nathan

Attachment

Re: optimize file transfer in pg_upgrade

From
Bruce Momjian
Date:
On Wed, Mar  5, 2025 at 03:40:52PM -0500, Greg Sabino Mullane wrote:
> On Wed, Mar 5, 2025 at 2:43 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
> 
>     One other design point I wanted to bring up is whether we should bother
>     generating a rollback script for the new "swap" mode.  In short, I'm
>     wondering if it would be unreasonable to say that, just for this mode, once
>     pg_upgrade enters the file transfer step, reverting to the old cluster
>     requires restoring a backup.
> 
> 
> I think that's a fair requirement. And like Robert, revert scripts make me
> nervous.
> 
> 
>     * Anecdotally, I'm not sure I've ever actually seen pg_upgrade fail
>     during or after file transfer, and I'm hoping to get some real data about
>     that in the near future.  Has anyone else dealt with such a failure?
> 
> 
> I've seen various failures, but they always get caught quite early. Certainly
> early enough to easily abort, fix perms/mounts/etc., then retry. I think your
> instinct is correct that this reversion is more trouble than its worth. I don't
> think the pg_upgrade docs mention taking a backup, but that's always step 0 in
> my playbook, and that's the rollback plan in the unlikely event of failure.

I avoided many optimizations in pg_upgrade in the fear they would lead
to hard-to-detect bugs, or breakage from major release changes. 
pg_upgrade is probably old enough now (15 years) that we can risk these
optimizations.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Do not let urgent matters crowd out time for investment in the future.



Re: optimize file transfer in pg_upgrade

From
Nathan Bossart
Date:
On Wed, Mar 05, 2025 at 08:34:37PM -0600, Nathan Bossart wrote:
> Thank you, Greg and Robert, for sharing your thoughts.  With that, here's
> what I'm considering to be a reasonably complete patch set for this
> feature.  This leaves about a month for rigorous testing and editing, so
> I'm hopeful it'll be ready v18.

Here are my notes after a round of self-review.

0001:
* The documentation does not adequately describe the interaction between
  --no-sync-data-files and --sync-method=syncfs.
* I really don't like the exclude_dir hack for skipping the main tablespace
  directory, but I haven't thought of anything that seems better.
* I should verify that there's no path separator issues on Windows for the
  exclude_dir hack.  From some quick code analysis, I think it should work
  fine, but I probably ought to test it out to be sure.
* The documentation needs to mention that the tablespace directories
  themselves are not synchronized.

0002:
* The documentation changes are subject to update based on ongoing stats
  import/export work.
* Does --statistics-only --sequence-data make any sense?  It seems like it
  ought to function as expected, but it's hard to see a use-case.

0003:
* Once committed, I should update one of my buildfarm animals to use
  PG_TEST_PG_UPGRADE_MODE=--swap.
* For check_hard_link() and disable_old_cluster(), move the Assert() to an
  "else" block with a pg_fatal() call for sturdiness.
* I need to do a thorough pass-through on all comments.  Many are not
  sufficiently detailed.
* The "." and ".." checks in the catalog swap code are redundant and can be
  removed.
* The directory for "moved-aside" stuff should be placed within the old
  cluster's corresponding tablespace directory so that no changes need to
  be made to delete_old_cluster.{sh,bat}.
* Manual testing with non-default tablespaces!

Updated patches based on these notes are attached.

-- 
nathan

Attachment

Re: optimize file transfer in pg_upgrade

From
Robert Haas
Date:
On Mon, Mar 17, 2025 at 3:34 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
> * Once committed, I should update one of my buildfarm animals to use
>   PG_TEST_PG_UPGRADE_MODE=--swap.

It would be better if we didn't need a separate buildfarm animal to
test this, because that means you won't get notified by local testing
OR by CI if you break this. Can we instead have one test that checks
this which is part of the normal test run?

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: optimize file transfer in pg_upgrade

From
Nathan Bossart
Date:
On Mon, Mar 17, 2025 at 04:04:45PM -0400, Robert Haas wrote:
> On Mon, Mar 17, 2025 at 3:34 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
>> * Once committed, I should update one of my buildfarm animals to use
>>   PG_TEST_PG_UPGRADE_MODE=--swap.
> 
> It would be better if we didn't need a separate buildfarm animal to
> test this, because that means you won't get notified by local testing
> OR by CI if you break this. Can we instead have one test that checks
> this which is part of the normal test run?

That's what I set out to do before I discovered PG_TEST_PG_UPGRADE_MODE.
The commit message for b059a24 seemed to indicate that we don't want to
automatically test all supported modes, but I agree that it would be nice
to have some basic coverage for --swap on CI/buildfarm regardless of
PG_TEST_PG_UPGRADE_MODE.  How about we add a simple TAP test (attached),
and I still plan on switching a buildfarm animal to --swap for more
in-depth testing?

-- 
nathan

Attachment

Re: optimize file transfer in pg_upgrade

From
Robert Haas
Date:
On Mon, Mar 17, 2025 at 4:30 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
> On Mon, Mar 17, 2025 at 04:04:45PM -0400, Robert Haas wrote:
> > On Mon, Mar 17, 2025 at 3:34 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
> >> * Once committed, I should update one of my buildfarm animals to use
> >>   PG_TEST_PG_UPGRADE_MODE=--swap.
> >
> > It would be better if we didn't need a separate buildfarm animal to
> > test this, because that means you won't get notified by local testing
> > OR by CI if you break this. Can we instead have one test that checks
> > this which is part of the normal test run?
>
> That's what I set out to do before I discovered PG_TEST_PG_UPGRADE_MODE.
> The commit message for b059a24 seemed to indicate that we don't want to
> automatically test all supported modes, but I agree that it would be nice
> to have some basic coverage for --swap on CI/buildfarm regardless of
> PG_TEST_PG_UPGRADE_MODE.  How about we add a simple TAP test (attached),
> and I still plan on switching a buildfarm animal to --swap for more
> in-depth testing?

The background here is that I'm kind of on the warpath against weird
configurations that we only test on certain buildfarm animals at the
moment, because the result of that is that CI is clean and then the
buildfarm turns red when you commit. That's an unenjoyable experience
for the committer and for everyone who looks at the buildfarm results.
The way to fix it is to stop relying on "rerun all the tests with this
weird mode flag" and rely more on tests that are designed to test that
specific flag and, ideally, that get run by in local testing or at
least by CI.

I'm not quite sure what the best thing is to do is for the pg_upgrade
tests in particular, and it may well be best to do as you propose for
now and figure that out later. But I question whether just rerunning
all of those tests with several different mode flags is the right
thing to do. Why for example does 005_char_signedness.pl need to be
checked under both --link and --clone? I would guess that there are
one or maybe two tests in src/bin/pg_upgrade/t that needs to test
--link and --clone and they should grow internal loops to do that
(when supported by the local platform) and PG_UPGRADE_TEST_MODE should
go in the garbage.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: optimize file transfer in pg_upgrade

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> I'm not quite sure what the best thing is to do is for the pg_upgrade
> tests in particular, and it may well be best to do as you propose for
> now and figure that out later. But I question whether just rerunning
> all of those tests with several different mode flags is the right
> thing to do. Why for example does 005_char_signedness.pl need to be
> checked under both --link and --clone? I would guess that there are
> one or maybe two tests in src/bin/pg_upgrade/t that needs to test
> --link and --clone and they should grow internal loops to do that
> (when supported by the local platform) and PG_UPGRADE_TEST_MODE should
> go in the garbage.

+1

I'd be particularly allergic to running 002_pg_upgrade.pl multiple
times, as that's one of our most expensive tests, and I flat out
don't believe that expending that many cycles could be justified.
Surely we can test these modes sufficiently in some much cheaper and
more targeted way.

            regards, tom lane



Re: optimize file transfer in pg_upgrade

From
Andres Freund
Date:
Hi,

On 2025-03-18 10:04:41 -0400, Tom Lane wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> > I'm not quite sure what the best thing is to do is for the pg_upgrade
> > tests in particular, and it may well be best to do as you propose for
> > now and figure that out later. But I question whether just rerunning
> > all of those tests with several different mode flags is the right
> > thing to do. Why for example does 005_char_signedness.pl need to be
> > checked under both --link and --clone? I would guess that there are
> > one or maybe two tests in src/bin/pg_upgrade/t that needs to test
> > --link and --clone and they should grow internal loops to do that
> > (when supported by the local platform) and PG_UPGRADE_TEST_MODE should
> > go in the garbage.
>
> +1
>
> I'd be particularly allergic to running 002_pg_upgrade.pl multiple
> times, as that's one of our most expensive tests, and I flat out
> don't believe that expending that many cycles could be justified.
> Surely we can test these modes sufficiently in some much cheaper and
> more targeted way.

+1

It's useful to have coverage of as many object types as possible in pg_upgrade
- hence 002_pg_upgrade.pl. It helps us to find problems in new code that
didn't think about pg_upgrade.

But that doesn't mean that it's a good idea to run all other pg_upgrade tests
the same way, to the contrary - the cost is too high.

Even leaving runtime aside, I have a hard time believing that --link, --clone,
--swap benefit from running the same way as 002_pg_upgrade.pl - the
implementation of those flags is on a lower level and works the same across
e.g. different index AMs.

I'd go so far as to say that 002_pg_upgrade.pl style testing actually makes it
*harder* to diagnose problems related to things like --link, because there are
no targeted tests, but just a huge set of things that maybe allow to infer
some bug if you spend a lot of time.

Greetings,

Andres Freund



Re: optimize file transfer in pg_upgrade

From
Álvaro Herrera
Date:
On 2025-Mar-18, Robert Haas wrote:

> The background here is that I'm kind of on the warpath against weird
> configurations that we only test on certain buildfarm animals at the
> moment, because the result of that is that CI is clean and then the
> buildfarm turns red when you commit. That's an unenjoyable experience
> for the committer and for everyone who looks at the buildfarm results.
> The way to fix it is to stop relying on "rerun all the tests with this
> weird mode flag" and rely more on tests that are designed to test that
> specific flag and, ideally, that get run by in local testing or at
> least by CI.

FWIW this is exactly the rationale that got me writing an email on the
Ashutosh's thread for a new pg_dump/restore test under
002_pg_upgrade.pl, whereby I was saying that we should not hide it
behind PG_TEST_EXTRA which almost nobody would remember to use.  But I
discarded that draft, because that had actually been Ashutosh's idea at
some point in the thread and had been discarded because of the runtime
increase it'd cause.  But, somehow, I still don't believe the theory
that it's such a bad idea to add a few seconds so that we have such a
comprehensive pg_dump test, with much less programmer overhead than
pg_dump's own weird enormous test script.

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"Cada quien es cada cual y baja las escaleras como quiere" (JMSerrat)



Re: optimize file transfer in pg_upgrade

From
Nathan Bossart
Date:
On Tue, Mar 18, 2025 at 10:12:51AM -0400, Andres Freund wrote:
> On 2025-03-18 10:04:41 -0400, Tom Lane wrote:
>> Robert Haas <robertmhaas@gmail.com> writes:
>> > I'm not quite sure what the best thing is to do is for the pg_upgrade
>> > tests in particular, and it may well be best to do as you propose for
>> > now and figure that out later. But I question whether just rerunning
>> > all of those tests with several different mode flags is the right
>> > thing to do. Why for example does 005_char_signedness.pl need to be
>> > checked under both --link and --clone? I would guess that there are
>> > one or maybe two tests in src/bin/pg_upgrade/t that needs to test
>> > --link and --clone and they should grow internal loops to do that
>> > (when supported by the local platform) and PG_UPGRADE_TEST_MODE should
>> > go in the garbage.
>>
>> +1
>>
>> I'd be particularly allergic to running 002_pg_upgrade.pl multiple
>> times, as that's one of our most expensive tests, and I flat out
>> don't believe that expending that many cycles could be justified.
>> Surely we can test these modes sufficiently in some much cheaper and
>> more targeted way.
> 
> +1

Here is a first sketch at a test that cycles through all the transfer modes
and makes sure they succeed or fail with an error along the lines of "not
supported on this platform."  Each test verifies that some very simple
objects make it to the new version, which we could of course expand on.
Would something like this suffice?

-- 
nathan

Attachment

Re: optimize file transfer in pg_upgrade

From
Andres Freund
Date:
Hi,

On 2025-03-18 12:29:02 -0500, Nathan Bossart wrote:
> Here is a first sketch at a test that cycles through all the transfer modes
> and makes sure they succeed or fail with an error along the lines of "not
> supported on this platform."  Each test verifies that some very simple
> objects make it to the new version, which we could of course expand on.
> Would something like this suffice?

I'd add a few more complications:

- Create and test a relation that was rewritten, to ensure we test the
  relfilenode != oid case and one that isn't rewritten.

- Perhaps create a tablespace?

- Do we need a new old cluster for each of the modes? That seems like wasted
  time?  I guess it's required for --link...


Greetings,

Andres Freund



Re: optimize file transfer in pg_upgrade

From
Nathan Bossart
Date:
On Tue, Mar 18, 2025 at 01:37:02PM -0400, Andres Freund wrote:
> I'd add a few more complications:
> 
> - Create and test a relation that was rewritten, to ensure we test the
>   relfilenode != oid case and one that isn't rewritten.

+1

> - Perhaps create a tablespace?

+1, I don't think we have much, if any, coverage of pg_upgrade with
non-default tablespaces.

> - Do we need a new old cluster for each of the modes? That seems like wasted
>   time?  I guess it's required for --link...

It'll also be needed for --swap.  We could optionally save the old cluster
for a couple of modes if we really wanted to.  *shrug*

I'll work on the first two...

-- 
nathan



Re: optimize file transfer in pg_upgrade

From
Andres Freund
Date:
On 2025-03-18 12:47:01 -0500, Nathan Bossart wrote:
> On Tue, Mar 18, 2025 at 01:37:02PM -0400, Andres Freund wrote:
> > - Do we need a new old cluster for each of the modes? That seems like wasted
> >   time?  I guess it's required for --link...
> 
> It'll also be needed for --swap.  We could optionally save the old cluster
> for a couple of modes if we really wanted to.  *shrug*

Don't worry about it, I think the template initdb stuff should make it cheap enough...



Re: optimize file transfer in pg_upgrade

From
Nathan Bossart
Date:
On Tue, Mar 18, 2025 at 01:50:10PM -0400, Andres Freund wrote:
> On 2025-03-18 12:47:01 -0500, Nathan Bossart wrote:
>> On Tue, Mar 18, 2025 at 01:37:02PM -0400, Andres Freund wrote:
>> > - Do we need a new old cluster for each of the modes? That seems like wasted
>> >   time?  I guess it's required for --link...
>> 
>> It'll also be needed for --swap.  We could optionally save the old cluster
>> for a couple of modes if we really wanted to.  *shrug*
> 
> Don't worry about it, I think the template initdb stuff should make it cheap enough...

Cool.  I realize now why there's poor coverage for pg_upgrade with
tablespaces: you can't upgrade between the same version with tablespaces
(presumably due to the version-specific subdirectory conflict).  I don't
know if the regression tests leave around any tablespaces for the
cross-version pg_upgrade tests, but that's probably the best we can do at
the moment.

For now, here's a new version of the test with a rewritten table.  I also
tried to fix the expected error regex to handle some of the other error
messages for unsupported modes (as revealed by cfbot).

-- 
nathan

Attachment

Re: optimize file transfer in pg_upgrade

From
Nathan Bossart
Date:
On Tue, Mar 18, 2025 at 02:08:42PM -0500, Nathan Bossart wrote:
> For now, here's a new version of the test with a rewritten table.  I also
> tried to fix the expected error regex to handle some of the other error
> messages for unsupported modes (as revealed by cfbot).

And here is a new version of the full patch set.

-- 
nathan

Attachment

Re: optimize file transfer in pg_upgrade

From
Nathan Bossart
Date:
On Tue, Mar 18, 2025 at 09:14:22PM -0500, Nathan Bossart wrote:
> And here is a new version of the full patch set.

I'm currently planning to commit this sometime early-ish next week.  One
notable loose end is the lack of a pg_upgrade test with a non-default
tablespace, but that is an existing problem that IMHO is best handled
separately (since we can only test it in cross-version upgrades).

-- 
nathan



Re: optimize file transfer in pg_upgrade

From
Andres Freund
Date:
Hi,

On 2025-03-18 21:14:22 -0500, Nathan Bossart wrote:
> From 8b6a5e0148c2f7a663f5003f12ae9461d2b06a5c Mon Sep 17 00:00:00 2001
> From: Nathan Bossart <nathan@postgresql.org>
> Date: Tue, 18 Mar 2025 20:58:07 -0500
> Subject: [PATCH v7 1/4] Add test for pg_upgrade file transfer modes.
> 
> This new test checks all of pg_upgrade's file transfer modes.  For
> each mode, we verify that pg_upgrade either succeeds (and some test
> objects successfully reach the new version) or fails with an error
> that indicates the mode is not supported on the current platform.

LGTM.  I'm sure we could do more than the test does today, but I think it's a
good improvement.

Greetings,

Andres Freund



Re: optimize file transfer in pg_upgrade

From
Tom Lane
Date:
Nathan Bossart <nathandbossart@gmail.com> writes:
> I'm currently planning to commit this sometime early-ish next week.  One
> notable loose end is the lack of a pg_upgrade test with a non-default
> tablespace, but that is an existing problem that IMHO is best handled
> separately (since we can only test it in cross-version upgrades).

Agreed that that shouldn't block this, but we need some kind of
plan for testing it better.

            regards, tom lane



Re: optimize file transfer in pg_upgrade

From
Andres Freund
Date:
Hi,

On 2025-03-19 12:28:33 -0400, Tom Lane wrote:
> Nathan Bossart <nathandbossart@gmail.com> writes:
> > I'm currently planning to commit this sometime early-ish next week.  One
> > notable loose end is the lack of a pg_upgrade test with a non-default
> > tablespace, but that is an existing problem that IMHO is best handled
> > separately (since we can only test it in cross-version upgrades).
>
> Agreed that that shouldn't block this, but we need some kind of
> plan for testing it better.

Yea, this is really suboptimal.

Shouldn't allow_in_place_tablespaces be sufficient to deal with that scenario?
Or at least it should make it reasonably easy to cope if it doesn't already
suffice?

Greetings,

Andres Freund



Re: optimize file transfer in pg_upgrade

From
Nathan Bossart
Date:
On Wed, Mar 19, 2025 at 12:44:38PM -0400, Andres Freund wrote:
> On 2025-03-19 12:28:33 -0400, Tom Lane wrote:
>> Nathan Bossart <nathandbossart@gmail.com> writes:
>> > I'm currently planning to commit this sometime early-ish next week.  One
>> > notable loose end is the lack of a pg_upgrade test with a non-default
>> > tablespace, but that is an existing problem that IMHO is best handled
>> > separately (since we can only test it in cross-version upgrades).
>>
>> Agreed that that shouldn't block this, but we need some kind of
>> plan for testing it better.
> 
> Yea, this is really suboptimal.
> 
> Shouldn't allow_in_place_tablespaces be sufficient to deal with that scenario?
> Or at least it should make it reasonably easy to cope if it doesn't already
> suffice?

Unfortunately, pg_upgrade can't yet handle in-place tablespaces.  One
reason is that pg_tablespace_location() returns a relative path for those
(e.g., "pg_tblspc/123456").  We'd also need to adjust init_tablespaces() to
not fail if all the tablespaces are in-place.  There may be other reasons,
too.  I'm confident we could get it working, but I'm not too excited about
trying to sneak this into v18.

In addition to testing with in-place tablespaces, we might also want to
teach the transfer modes test to do cross-version testing when possible.
In that case, we can test normal (non-in-place) tablespaces.  However, that
would be limited to the buildfarm.

Does this seem like a reasonable plan for v19?

-- 
nathan



Re: optimize file transfer in pg_upgrade

From
Nathan Bossart
Date:
On Wed, Mar 19, 2025 at 02:32:01PM -0500, Nathan Bossart wrote:
> In addition to testing with in-place tablespaces, we might also want to
> teach the transfer modes test to do cross-version testing when possible.
> In that case, we can test normal (non-in-place) tablespaces.  However, that
> would be limited to the buildfarm.

Actually, this one was pretty easy to do.

-- 
nathan

Attachment

Re: optimize file transfer in pg_upgrade

From
Nathan Bossart
Date:
On Wed, Mar 19, 2025 at 04:28:23PM -0500, Nathan Bossart wrote:
> On Wed, Mar 19, 2025 at 02:32:01PM -0500, Nathan Bossart wrote:
>> In addition to testing with in-place tablespaces, we might also want to
>> teach the transfer modes test to do cross-version testing when possible.
>> In that case, we can test normal (non-in-place) tablespaces.  However, that
>> would be limited to the buildfarm.
> 
> Actually, this one was pretty easy to do.

And here is yet another new version of the full patch set.  I'm planning to
commit 0001 (the new pg_upgrade transfer mode test) tomorrow so that I can
deal with any buildfarm indigestion before committing swap mode.  I did run
the test locally for upgrades from v9.6, v13, and v17, but who knows what
unique configurations I've failed to anticipate...

-- 
nathan

Attachment

Re: optimize file transfer in pg_upgrade

From
Nathan Bossart
Date:
On Wed, Mar 19, 2025 at 09:02:42PM -0500, Nathan Bossart wrote:
> On Wed, Mar 19, 2025 at 04:28:23PM -0500, Nathan Bossart wrote:
>> On Wed, Mar 19, 2025 at 02:32:01PM -0500, Nathan Bossart wrote:
>>> In addition to testing with in-place tablespaces, we might also want to
>>> teach the transfer modes test to do cross-version testing when possible.
>>> In that case, we can test normal (non-in-place) tablespaces.  However, that
>>> would be limited to the buildfarm.
>> 
>> Actually, this one was pretty easy to do.
> 
> And here is yet another new version of the full patch set.  I'm planning to
> commit 0001 (the new pg_upgrade transfer mode test) tomorrow so that I can
> deal with any buildfarm indigestion before committing swap mode.  I did run
> the test locally for upgrades from v9.6, v13, and v17, but who knows what
> unique configurations I've failed to anticipate...

As promised, I've committed just 0001 for now.  I'll watch closely for any
issues in the buildfarm.

-- 
nathan



Re: optimize file transfer in pg_upgrade

From
Nathan Bossart
Date:
On Thu, Mar 20, 2025 at 11:11:46AM -0500, Nathan Bossart wrote:
> As promised, I've committed just 0001 for now.  I'll watch closely for any
> issues in the buildfarm.

Seeing none, here's is a rebased patch set without 0001.  The only changes
are some fleshed-out comments and commit messages.  I'm still aiming to
commit this sometime early next week.

-- 
nathan

Attachment

Re: optimize file transfer in pg_upgrade

From
Nathan Bossart
Date:
On Thu, Mar 20, 2025 at 03:23:13PM -0500, Nathan Bossart wrote:
> I'm still aiming to commit this sometime early next week.

Committed.  Thanks to everyone who chimed in on this thread.

While writing the attributions, I noticed that nobody seems to have
commented specifically on 0001.  The closest thing to a review I see is
Greg's note upthread [0].  This patch is a little bigger than what I'd
ordinarily feel comfortable with committing unilaterally, but it's been
posted in its current form since February 28th, this thread has gotten a
decent amount of traffic since then, and it's not a huge change ("9 files
changed, 96 insertions(+), 37 deletions(-)").  I'm happy to address any
post-commit feedback that folks have.  As noted earlier [1], I'm not wild
about how it's implemented, but this is the nicest approach I've thought of
thus far.

I also wanted to draw attention to this note in 0003:

        /*
         * XXX: The below line is a hack to deal with the fact that we
          * presently don't have an easy way to find the corresponding new
          * tablespace's path.  This will need to be fixed if/when we add
          * pg_upgrade support for in-place tablespaces.
          */
         new_tablespace = old_tablespace;

I intend to address this in v19, primarily to enable same-version
pg_upgrade testing with non-default tablespaces.  My current thinking is
that we should have pg_upgrade also gather the new cluster tablespace
information and map them to the corresponding tablespaces on the old
cluster.  This might require some refactoring in pg_upgrade.  In any case,
I didn't feel this should block the feature for v18.

[0] https://postgr.es/m/CAKAnmm%2Bi3Q1pZ05N_b8%3DS3B%3DrztQDn--HoW8BRKVtCg53r8NiQ%40mail.gmail.com
[1] https://postgr.es/m/Z9h5Spp76EBygyEL%40nathan

-- 
nathan