Re: pg_upgrade: Support for upgrading to checksums enabled - Mailing list pgsql-hackers

From Ilya Kosmodemiansky
Subject Re: pg_upgrade: Support for upgrading to checksums enabled
Date
Msg-id CAG95seV-sR7fsxRDE++SL2fot3Wuok05QhazrggQfaCF4Sd9CQ@mail.gmail.com
Whole thread Raw
List pgsql-hackers
Hi Peter,

I've applied and tested your patch, it works at least on MacOS with
meson build. A couple of thoughts about this patch inline below.

On Mon, Aug 26, 2024 at 8:23 AM Peter Eisentraut <peter@eisentraut.org> wrote:
> The purpose of this patch is to allow using pg_upgrade between clusters
> that have different checksum settings.  When upgrading between instances
> with different checksum settings, the --copy (default) mode
> automatically sets (or unsets) the checksum on the fly.

I think the entire idea of this patch is great because it allows us to
remove an additional step in upgrade procedure, i.e. enabling
checksums before upgrade. A part about which I am not quite sure, is
"automatically". It is sufficient in most cases, but maybe also to
have an explicit flag would be a nice option as well.

in the patch itself:

> -        * We might eventually allow upgrades from checksum to no-checksum
> -        * clusters.
> -        */
> -       if (oldctrl->data_checksum_version == 0 &&
> -               newctrl->data_checksum_version != 0)
> -               pg_fatal("old cluster does not use data checksums but the new one does");
> -       else if (oldctrl->data_checksum_version != 0 &&
> -                        newctrl->data_checksum_version == 0)
> -               pg_fatal("old cluster uses data checksums but the new one does not");
> -       else if (oldctrl->data_checksum_version != newctrl->data_checksum_version)
> -               pg_fatal("old and new cluster pg_controldata checksum versions do not match");
> +       if (oldctrl->data_checksum_version != newctrl->data_checksum_version)
> +       {
> +               if (user_opts.transfer_mode != TRANSFER_MODE_COPY)
> +                       pg_fatal("when upgrading between clusters with different data checksum settings, transfer
mode--copy must be used"); 
> +       }

I've tried to recall when I see the previous error message "old and
new cluster pg_controldata checksum versions do not match" at most. It
was almost always pg_upgrade with --link option, because we mostly use
pg_upgrade when copy is barely an option. Previous error message gave
a clear statement: go one step behind and enable checksums. The new
one gives imo a wrong message: "your only option now is to use copy".

Would it be better to polish wording in direction "when upgrading
between clusters with different data checksum settings, transfer mode
--copy must be used to enable checksum automatically" or "when
upgrading between clusters with different data checksum settings,
transfer mode --copy must be used or data checksum settings of the old
cluster must be changed manually before upgrade"?


> Some discussion points:
>
> - We have added a bunch of different transfer modes to pg_upgrade in
> order to give the user control over the expected file transfer
> performance.  Here, I have added this checksum rewriting to the existing
> --copy mode, and I have measured about a 5% overhead.  An alternative
> would be to make this an explicit mode (--copy-slow,
> --copy-and-make-adjustments).

Maybe a separate -k flag to enable this behavior explicitly?

best regards,
Ilya


--
Ilya Kosmodemiansky
CEO, Founder

Data Egret GmbH
Your remote PostgreSQL DBA team
T.: +49 6821 919 3297
ik@dataegret.com



pgsql-hackers by date:

Previous
From: Daniel Gustafsson
Date:
Subject: Re: how to log into commitfest.postgresql.org and begin review patch
Next
From: Amit Kapila
Date:
Subject: Re: Conflict Detection and Resolution