Re: Online checksums patch - once again - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Online checksums patch - once again
Date
Msg-id 8e1cda9b-1cb9-a634-d383-f757bf67b820@iki.fi
Whole thread Raw
In response to Re: Online checksums patch - once again  (Daniel Gustafsson <daniel@yesql.se>)
Responses Re: Online checksums patch - once again  (Daniel Gustafsson <daniel@yesql.se>)
List pgsql-hackers
On 25/11/2020 15:20, Daniel Gustafsson wrote:
>> On 23 Nov 2020, at 18:36, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>> What happens is if you crash between UpdateControlFile() and XlogChecksum()?
> 
> Good point, that would not get the cluster to a consistent state.  The
> XlogChecksum should be performed before controlfile is udpated.
> 
> +void
> +SetDataChecksumsOff(void)
> +{
> +    LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
> +
> +    if (ControlFile->data_checksum_version == 0)
> +    {
> +        LWLockRelease(ControlFileLock);
> +        return;
> +    }
> +
> +    if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_VERSION)
> +    {
> +        ControlFile->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION;
> +        UpdateControlFile();
> +        LWLockRelease(ControlFileLock);
> +
> +        /*
> +         * Update local state in all backends to ensure that any backend in
> +         * "on" state is changed to "inprogress-off".
> +         */
> +        XlogChecksums(PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
> +        WaitForProcSignalBarrier(EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF));
> +
> +        /*
> +         * At this point we know that no backends are verifying data checksums
> +         * during reading. Next, we can safely move to state "off" to also
> +         * stop writing checksums.
> +         */
> +    }
> +
> +    XlogChecksums(0);
> +
> +    LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
> +    ControlFile->data_checksum_version = 0;
> +    UpdateControlFile();
> +    LWLockRelease(ControlFileLock);
> +
> +    WaitForProcSignalBarrier(EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF));
> +}

The lwlocking doesn't look right here. If 
ControlFile->data_checksum_version != PG_DATA_CHECKSUM_VERSION, 
LWLockAcquire is called twice without a LWLockRelease in between.

What if a checkpoint, and a crash, happens just after the WAL record has 
been written, but before the control file is updated? That's a 
ridiculously tight window for a whole checkpoint cycle to happen, but in 
principle I think that would spell trouble. I think you could set 
delayChkpt to prevent the checkpoint from happening in that window, 
similar to how we avoid this problem with the clog updates at commit. 
Also, I think this should be in a critical section; we don't want the 
process to error out in between for any reason, and if it does happen, 
it's panic time.

- Heikki



pgsql-hackers by date:

Previous
From: Ashutosh Bapat
Date:
Subject: Re: [PATCH] LWLock self-deadlock detection
Next
From: Greg Nancarrow
Date:
Subject: Re: Parallel plans and "union all" subquery