Thread: Re: pg_upgrade: Support for upgrading to checksums enabled
On Mon, Aug 26, 2024 at 08:23:44AM +0200, Peter Eisentraut wrote: > The purpose of this patch is to allow using pg_upgrade between clusters that > have different checksum settings. When upgrading between instances with > different checksum settings, the --copy (default) mode automatically sets > (or unsets) the checksum on the fly. > > This would be particularly useful if we switched to enabling checksums by > default, as [0] proposes, but it's also useful without that. Given enabling checksums can be rather expensive, I think it makes sense to add a way to do it during pg_upgrade versus asking folks to run pg_checksums separately. I'd anticipate arguments against enabling checksums automatically, but as you noted, we can move it to a separate option (e.g., --copy --enable-checksums). Disabling checksums with pg_checksums is fast because it just updates pg_control, so I don't see any need for --disable-checkums in pg_upgrade. > - Windows has a separate code path in the --copy mode. I don't know the > reasons or advantages of that. So it's not clear how the checksum rewriting > mode should be handled in that case. We could switch to the non-Windows > code path in that case, but then the performance difference between the > normal path and the checksum-rewriting path is even more unclear. AFAICT the separate Windows path dates back to before pg_upgrade was first added to the Postgres tree, and unfortunately I couldn't find any discussion about it. -- nathan
On Tue, Aug 27, 2024 at 5:57 PM Nathan Bossart <nathandbossart@gmail.com> wrote: > > On Mon, Aug 26, 2024 at 08:23:44AM +0200, Peter Eisentraut wrote: > > The purpose of this patch is to allow using pg_upgrade between clusters that > > have different checksum settings. When upgrading between instances with > > different checksum settings, the --copy (default) mode automatically sets > > (or unsets) the checksum on the fly. > > > > This would be particularly useful if we switched to enabling checksums by > > default, as [0] proposes, but it's also useful without that. > > Given enabling checksums can be rather expensive, I think it makes sense to > add a way to do it during pg_upgrade versus asking folks to run > pg_checksums separately. I'd anticipate arguments against enabling > checksums automatically, but as you noted, we can move it to a separate > option (e.g., --copy --enable-checksums). Disabling checksums with > pg_checksums is fast because it just updates pg_control, so I don't see any > need for --disable-checkums in pg_upgrade. > The repercussions of either enabling or disabling checksums on accident is quite high (not for pg_upgrade, but for $futureDBA), so ISTM an explicit flag for BOTH is the right way to go. In that scenario, pg_upgrade would check to make sure the clusters match and then make the appropriate suggestion. In the case someone did something like --enable-checksums and --link, again, we'd toss an error that --copy mode is required to --enable-checksums. Robert Treat https://xzilla.net
On 21.02.25 00:41, Robert Treat wrote: > On Tue, Aug 27, 2024 at 5:57 PM Nathan Bossart <nathandbossart@gmail.com> wrote: >> >> On Mon, Aug 26, 2024 at 08:23:44AM +0200, Peter Eisentraut wrote: >>> The purpose of this patch is to allow using pg_upgrade between clusters that >>> have different checksum settings. When upgrading between instances with >>> different checksum settings, the --copy (default) mode automatically sets >>> (or unsets) the checksum on the fly. >>> >>> This would be particularly useful if we switched to enabling checksums by >>> default, as [0] proposes, but it's also useful without that. >> >> Given enabling checksums can be rather expensive, I think it makes sense to >> add a way to do it during pg_upgrade versus asking folks to run >> pg_checksums separately. I'd anticipate arguments against enabling >> checksums automatically, but as you noted, we can move it to a separate >> option (e.g., --copy --enable-checksums). Disabling checksums with >> pg_checksums is fast because it just updates pg_control, so I don't see any >> need for --disable-checkums in pg_upgrade. >> > > The repercussions of either enabling or disabling checksums on > accident is quite high (not for pg_upgrade, but for $futureDBA), so > ISTM an explicit flag for BOTH is the right way to go. In that > scenario, pg_upgrade would check to make sure the clusters match and > then make the appropriate suggestion. In the case someone did > something like --enable-checksums and --link, again, we'd toss an > error that --copy mode is required to --enable-checksums. Here is an updated patch that works more along those lines. It adds a pg_upgrade option --update-checksums, which activates the code to rewrite the checksums. You must specify this option if the source and target clusters have different checksum settings. Note that this also works to hypothetically upgrade between future different checksum versions (hence "--update-*", not "--enable-*"). Also, as the patch is currently written, it is also required to specify this option to downgrade from checksums to no-checksums. (It will then write a zero into the checksum place, as it would look if you had never used checksums.) Also, you can optionally specify this option even if the checksum settings are the same, then it will recalculate the checksums. Probably not all of this is useful, but perhaps a subset of it. Thoughts? Also, I still don't know what to do about the Windows code path in copyFile(). We could just not support this feature on Windows? Or maybe the notionally correct thing to do would be to move that code into copyFileByRange(). But then we'd need a different default on Windows and it would require more documentation. I don't know what to do here and I don't have enough context to make a suggestion. But if we don't answer this, I don't think we can move ahead with this feature.
Attachment
On 11.03.25 11:42, Peter Eisentraut wrote: > Here is an updated patch that works more along those lines. It adds a > pg_upgrade option --update-checksums, which activates the code to > rewrite the checksums. You must specify this option if the source and > target clusters have different checksum settings. > > Note that this also works to hypothetically upgrade between future > different checksum versions (hence "--update-*", not "--enable-*"). > Also, as the patch is currently written, it is also required to specify > this option to downgrade from checksums to no-checksums. (It will then > write a zero into the checksum place, as it would look if you had never > used checksums.) Also, you can optionally specify this option even if > the checksum settings are the same, then it will recalculate the > checksums. Probably not all of this is useful, but perhaps a subset of > it. Thoughts? > > Also, I still don't know what to do about the Windows code path in > copyFile(). We could just not support this feature on Windows? Or > maybe the notionally correct thing to do would be to move that code into > copyFileByRange(). But then we'd need a different default on Windows > and it would require more documentation. I don't know what to do here > and I don't have enough context to make a suggestion. But if we don't > answer this, I don't think we can move ahead with this feature. I'm not sensing much enthusiasm for this feature or for working out the remaining problems, so I'm closing this commitfest entry.
On Thu, Apr 3, 2025 at 3:25 AM Peter Eisentraut <peter@eisentraut.org> wrote: > > On 11.03.25 11:42, Peter Eisentraut wrote: > > Here is an updated patch that works more along those lines. It adds a > > pg_upgrade option --update-checksums, which activates the code to > > rewrite the checksums. You must specify this option if the source and > > target clusters have different checksum settings. > > > > Note that this also works to hypothetically upgrade between future > > different checksum versions (hence "--update-*", not "--enable-*"). > > Also, as the patch is currently written, it is also required to specify > > this option to downgrade from checksums to no-checksums. (It will then > > write a zero into the checksum place, as it would look if you had never > > used checksums.) Also, you can optionally specify this option even if > > the checksum settings are the same, then it will recalculate the > > checksums. Probably not all of this is useful, but perhaps a subset of > > it. Thoughts? > > > > Also, I still don't know what to do about the Windows code path in > > copyFile(). We could just not support this feature on Windows? Or > > maybe the notionally correct thing to do would be to move that code into > > copyFileByRange(). But then we'd need a different default on Windows > > and it would require more documentation. I don't know what to do here > > and I don't have enough context to make a suggestion. But if we don't > > answer this, I don't think we can move ahead with this feature. > > I'm not sensing much enthusiasm for this feature or for working out the > remaining problems, so I'm closing this commitfest entry. > That's unfortunate; I think there is enthusiasm for the feature, just not enough to overcome the questions around Windows support. Robert Treat https://xzilla.net