On 25/11/2020 15:20, Daniel Gustafsson wrote:
>> On 23 Nov 2020, at 18:36, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>> What happens is if you crash between UpdateControlFile() and XlogChecksum()?
>
> Good point, that would not get the cluster to a consistent state. The
> XlogChecksum should be performed before controlfile is udpated.
>
> +void
> +SetDataChecksumsOff(void)
> +{
> + LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
> +
> + if (ControlFile->data_checksum_version == 0)
> + {
> + LWLockRelease(ControlFileLock);
> + return;
> + }
> +
> + if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_VERSION)
> + {
> + ControlFile->data_checksum_version = PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION;
> + UpdateControlFile();
> + LWLockRelease(ControlFileLock);
> +
> + /*
> + * Update local state in all backends to ensure that any backend in
> + * "on" state is changed to "inprogress-off".
> + */
> + XlogChecksums(PG_DATA_CHECKSUM_INPROGRESS_OFF_VERSION);
> + WaitForProcSignalBarrier(EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_INPROGRESS_OFF));
> +
> + /*
> + * At this point we know that no backends are verifying data checksums
> + * during reading. Next, we can safely move to state "off" to also
> + * stop writing checksums.
> + */
> + }
> +
> + XlogChecksums(0);
> +
> + LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
> + ControlFile->data_checksum_version = 0;
> + UpdateControlFile();
> + LWLockRelease(ControlFileLock);
> +
> + WaitForProcSignalBarrier(EmitProcSignalBarrier(PROCSIGNAL_BARRIER_CHECKSUM_OFF));
> +}
The lwlocking doesn't look right here. If
ControlFile->data_checksum_version != PG_DATA_CHECKSUM_VERSION,
LWLockAcquire is called twice without a LWLockRelease in between.
What if a checkpoint, and a crash, happens just after the WAL record has
been written, but before the control file is updated? That's a
ridiculously tight window for a whole checkpoint cycle to happen, but in
principle I think that would spell trouble. I think you could set
delayChkpt to prevent the checkpoint from happening in that window,
similar to how we avoid this problem with the clog updates at commit.
Also, I think this should be in a critical section; we don't want the
process to error out in between for any reason, and if it does happen,
it's panic time.
- Heikki