Re: optimize file transfer in pg_upgrade - Mailing list pgsql-hackers

From Greg Sabino Mullane
Subject Re: optimize file transfer in pg_upgrade
Date
Msg-id CAKAnmm+i3Q1pZ05N_b8=S3B=rztQDn--HoW8BRKVtCg53r8NiQ@mail.gmail.com
Whole thread Raw
Responses Re: optimize file transfer in pg_upgrade
List pgsql-hackers
On Wed, Nov 6, 2024 at 5:07 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
Therefore, it can be much faster to instead move the entire data directory from the old cluster
to the new cluster and to then swap the catalog relation files.

Thank you for breaking this up so clearly into separate commits. I think it is a very interesting idea, and anything to speed up pg_upgrade is always welcome. Some minor thoughts:

> [PATCH v1 3/8] Introduce catalog-swap mode for pg_upgrade.
> .. we don't really expect there to be directories within database directories,
> so perhaps it would be better to either unconditionally rename or to fail.
Failure seems the best option here, so we can cleanly handle any future cases in which we decide to put dirs in this directory.

> if (RelFileNumberIsValid(rfn))
> {
>   FileNameMap key;
>
>   key.relfilenumber = (RelFileNumber) rfn;
>   if (bsearch(&key, context->maps, context->size,
>      sizeof(FileNameMap), FileNameMapCmp))
>    return 0;
>  }
>
>  snprintf(dst, sizeof(dst), "%s/%s", context->target, filename);
>  if (rename(fname, dst) != 0)

I'm not quite clear what we are doing here with falling through for InvalidOid entries, could you explain?

> .. vm_must_add_frozenbit isn't handled yet.  We could either disallow
> using catalog-swap mode if the upgrade involves versions older than v9.6

Yes, this. No need for more code to handle super old versions when other options exist.

with this problem is to introduce a special mode for "initdb --sync-only"
that calls fsync() for everything _except_ the actual data files.  If we
fsync() the new catalog files as we move them into place, and if we assume
that the old catalog files will have been properly synchronized before
upgrading, there's no reason to synchronize them again at the end.

 Very cool approach!

Cheers,
Greg

pgsql-hackers by date:

Previous
From: Alexander Lakhin
Date:
Subject: deadlock-soft isolation test is failing on newest Fedora
Next
From: Andy Fan
Date:
Subject: Re: Extract numeric filed in JSONB more effectively