Re: pgsql: Allow on-line enabling and disabling of data checksums - Mailing list pgsql-committers

From Magnus Hagander
Subject Re: pgsql: Allow on-line enabling and disabling of data checksums
Date
Msg-id CABUevEz=tGFWuBnW9bdRnXdfRUnuRMkjqq16fpv2imvi3ejcSQ@mail.gmail.com
In response to Re: pgsql: Allow on-line enabling and disabling of data checksums  (Andrew Dunstan <andrew.dunstan@2ndquadrant.com>)
List pgsql-committers


On Fri, Apr 6, 2018 at 12:41 PM, Andrew Dunstan <andrew.dunstan@2ndquadrant.com> wrote:
On Fri, Apr 6, 2018 at 7:07 PM, Magnus Hagander <magnus@hagander.net> wrote:
>
>
> On Fri, Apr 6, 2018 at 2:03 AM, Andrew Dunstan
> <andrew.dunstan@2ndquadrant.com> wrote:
>>
>> On Fri, Apr 6, 2018 at 5:35 AM, Magnus Hagander <magnus@hagander.net>
>> wrote:
>> > Allow on-line enabling and disabling of data checksums
>> >
>> > This makes it possible to turn checksums on in a live cluster, without
>> > the previous need for dump/reload or logical replication (and to turn it
>> > off).
>> >
>> > Enabling checksums starts a background process in the form of a
>> > launcher/worker combination that goes through the entire database and
>> > recalculates checksums on each and every page. Only when all pages have
>> > been checksummed are they fully enabled in the cluster. Any failure of
>> > the process will revert to checksums off and the process has to be
>> > started.
>> >
>> > This adds a new WAL record that indicates the state of checksums, so
>> > the process works across replicated clusters.
>> >
>>
>>
>> This has broken the buildfarm's cross-version upgrade testing (yes, we
>> do it for same-version upgrade as well as previous version upgrade).
>>
>> For now I have fixed crake by adding code to disable checksums in the
>> saved cluster. That at least will send crake green. Not sure if it's
>> the fix we want, though. Maybe we should test if checksums are enabled
>> on the upgraded cluster and if so enable them on the new cluster via
>> initdb. When we decide on the best fix I will put out a new release.
>
>
> I'm unsure of why it actually leaves the cluster with checksums on. Which
> step leaves it with checksums on? The last step of the checksum-specific
> tests actually turns them *off* again. At which point in the series does it
> actually get the cluster to upgrade?


At the time the "old" datadir is copied to be upgraded by this module,
the following test sets have been run against it on crake:

                                          'InstallCheck-C',
                                          'RedisFDW-installcheck-C',
                                          'FileTextArrayFDW-installcheck-C',
                                          'IsolationCheck',
                                          'PLCheck-C',
                                          'ContribCheck-C',
                                          'TestModulesCheck-C',


TestModulesCheck-C runs "make check" in src/test, right?

Can I actually see the output from that somehow? The buildfarm link seems to only show TestModulesInstallCheck-C. And that one doesn't seem to run the checksum checks at all. From the logs I can't even figure out where they run at all, except that the *isolation checker* runs them -- that seems wrong.
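
For reference, the alternative fix suggested in the quoted text (check whether the old cluster has checksums on and, if so, enable them on the new cluster via initdb) could look roughly like the sketch below. This is only an illustration, not the buildfarm client's actual code: the function names are made up, and treating any nonzero "Data page checksum version" reported by pg_controldata as "on" is an assumption. The --data-checksums option is the existing initdb switch.

```python
import re


def checksums_enabled(controldata_output):
    """Return True if pg_controldata output reports checksums as on.

    Assumption: any nonzero "Data page checksum version" counts as on.
    A missing line (very old clusters) is treated as off.
    """
    match = re.search(
        r"^Data page checksum version:\s*(\d+)\s*$",
        controldata_output,
        re.MULTILINE,
    )
    return match is not None and int(match.group(1)) != 0


def initdb_command(new_datadir, old_controldata_output):
    """Build an initdb command line matching the old cluster's checksum state.

    Hypothetical helper: it only builds the argument list, it does not run it.
    """
    cmd = ["initdb", "-D", new_datadir]
    if checksums_enabled(old_controldata_output):
        cmd.append("--data-checksums")
    return cmd
```

The text passed in would be whatever pg_controldata prints for the saved old datadir; the resulting list could then be handed to something like subprocess.run.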
