Re: pg_upgrade: Support for upgrading to checksums enabled - Mailing list pgsql-hackers

From: Nathan Bossart
Subject: Re: pg_upgrade: Support for upgrading to checksums enabled
Msg-id: Zs5LpYAVNks8B6oZ@nathan
List: pgsql-hackers

On Mon, Aug 26, 2024 at 08:23:44AM +0200, Peter Eisentraut wrote:
> The purpose of this patch is to allow using pg_upgrade between clusters that
> have different checksum settings.  When upgrading between instances with
> different checksum settings, the --copy (default) mode automatically sets
> (or unsets) the checksum on the fly.
> 
> This would be particularly useful if we switched to enabling checksums by
> default, as [0] proposes, but it's also useful without that.

Given enabling checksums can be rather expensive, I think it makes sense to
add a way to do it during pg_upgrade versus asking folks to run
pg_checksums separately.  I'd anticipate arguments against enabling
checksums automatically, but as you noted, we can move it to a separate
option (e.g., --copy --enable-checksums).  Disabling checksums with
pg_checksums is fast because it just updates pg_control, so I don't see any
need for --disable-checksums in pg_upgrade.
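
To make the "on the fly" part of --copy mode concrete, here is a rough
sketch of a page-by-page copy that sets or clears the checksum field.  It
is not taken from the patch.  The function names are hypothetical, the
checksum is a CRC-based stand-in rather than pg_checksum_page(), the
little-endian field layout and the choice to write 0 when checksums are
off are assumptions, and any trailing partial block is simply ignored.

```python
import struct
import zlib

BLCKSZ = 8192            # default PostgreSQL block size
PD_CHECKSUM_OFFSET = 8   # pd_checksum follows the 8-byte pd_lsn in the page header


def standin_checksum(page: bytes, blkno: int) -> int:
    """Hypothetical stand-in checksum; NOT PostgreSQL's pg_checksum_page()."""
    # Zero the checksum field first so the result does not depend on the old value.
    zeroed = page[:PD_CHECKSUM_OFFSET] + b"\x00\x00" + page[PD_CHECKSUM_OFFSET + 2:]
    # Mix in the block number so identical pages at different offsets differ.
    return (zlib.crc32(zeroed) ^ blkno) & 0xFFFF


def copy_with_checksums(src: bytes, enable: bool) -> bytes:
    """Copy a relation file image block by block, rewriting pd_checksum."""
    out = bytearray()
    for blkno in range(len(src) // BLCKSZ):
        page = bytearray(src[blkno * BLCKSZ:(blkno + 1) * BLCKSZ])
        if any(page):  # all-zero (new) pages are left untouched in this sketch
            # Assumption: write 0 into the field when checksums are being turned off.
            value = standin_checksum(bytes(page), blkno) if enable else 0
            # Assumption: little-endian 16-bit field.
            struct.pack_into("<H", page, PD_CHECKSUM_OFFSET, value)
        out += page
    return bytes(out)
```

The point of the sketch is only that every block has to be read and
rewritten when enabling, which is why folding that work into the copy
pg_upgrade already does is attractive.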

> - Windows has a separate code path in the --copy mode.  I don't know the
> reasons or advantages of that.  So it's not clear how the checksum rewriting
> mode should be handled in that case.  We could switch to the non-Windows
> code path in that case, but then the performance difference between the
> normal path and the checksum-rewriting path is even more unclear.

AFAICT the separate Windows path dates back to before pg_upgrade was first
added to the Postgres tree, and unfortunately I couldn't find any
discussion about it.

-- 
nathan


