Re: Offline enabling/disabling of data checksums - Mailing list pgsql-hackers

From Michael Banck
Subject Re: Offline enabling/disabling of data checksums
Msg-id 1553011697.9697.55.camel@credativ.de
In response to Re: Offline enabling/disabling of data checksums  (Andres Freund <andres@anarazel.de>)
Responses Re: Offline enabling/disabling of data checksums
List pgsql-hackers
Hi,

On Tuesday, 19.03.2019 at 09:00 -0700, Andres Freund wrote:
> On 2019-03-19 16:55:12 +0100, Michael Banck wrote:
> > On Tuesday, 19.03.2019 at 08:36 -0700, Andres Freund wrote:
> > > On 2019-03-18 17:13:01 +0900, Michael Paquier wrote:
> > > > +/*
> > > > + * Locations of persistent and temporary control files.  The control
> > > > + * file gets renamed into a temporary location when enabling checksums
> > > > + * to prevent a parallel startup of Postgres.
> > > > + */
> > > > +#define CONTROL_FILE_PATH        "global/pg_control"
> > > > +#define CONTROL_FILE_PATH_TEMP    CONTROL_FILE_PATH ".pg_checksums_in_progress"
> > > 
> > > I think this should be outright rejected. Again, you're making the
> > > control file into something it isn't. And there's no buyin for this as
> > > far as I can tell outside of Fabien and you. For crying out loud, if the
> > > server crashes during this YOU'VE CORRUPTED THE CLUSTER.
> > 
> > The cluster is supposed to be offline during this.  This is just an
> > additional precaution so that nobody starts it during the operation -
> > similar to how pg_upgrade disables the old data directory.
> 
> I don't see how that matters. Afterwards the cluster needs low level
> surgery to be recovered. That's a) undocumented b) likely to be done
> wrongly.  This is completely unacceptable *AND UNNECESSARY*.

Can you explain why low-level surgery is needed and what that would
look like?

If pg_checksums successfully enables checksums, it moves the control
file back and updates the checksum version. The cluster is then ready
to be started again, unless I am missing something?

If pg_checksums is interrupted by the admin, it moves the control file
back and the cluster is ready to be started again as well.

If pg_checksums aborts with a failure, the admin will have to move the
control file back before starting up the instance again, but I don't
think that counts as low-level surgery?
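
The rename-and-restore sequence described in the cases above can be
sketched roughly as follows. This is an illustration under stated
assumptions, not code from the patch: only the two path macros come from
the quoted hunk, while `build_path`, `control_file_present`,
`guard_control_file`, `restore_control_file` and `run_guarded` are
hypothetical names. Signal handling for the interrupted case and any
fsync of the directory are left out.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>

/* Paths as defined in the quoted patch hunk. */
#define CONTROL_FILE_PATH       "global/pg_control"
#define CONTROL_FILE_PATH_TEMP  CONTROL_FILE_PATH ".pg_checksums_in_progress"

/* Hypothetical helper: write "<datadir>/<relpath>" into buf; 0 on success. */
static int
build_path(char *buf, size_t len, const char *datadir, const char *relpath)
{
    int n = snprintf(buf, len, "%s/%s", datadir, relpath);

    return (n < 0 || (size_t) n >= len) ? -1 : 0;
}

/* Hypothetical helper: 1 if pg_control is in its normal place, 0 if not. */
static int
control_file_present(const char *datadir)
{
    char        path[1024];
    struct stat st;

    assert(datadir != NULL);
    if (build_path(path, sizeof(path), datadir, CONTROL_FILE_PATH) != 0)
        return 0;
    return stat(path, &st) == 0 ? 1 : 0;
}

/* Move pg_control aside so that the server cannot be started meanwhile. */
static int
guard_control_file(const char *datadir)
{
    char from[1024];
    char to[1024];

    assert(datadir != NULL);
    if (build_path(from, sizeof(from), datadir, CONTROL_FILE_PATH) != 0 ||
        build_path(to, sizeof(to), datadir, CONTROL_FILE_PATH_TEMP) != 0)
        return -1;
    return rename(from, to);
}

/* Move pg_control back; this is also what the admin would do by hand. */
static int
restore_control_file(const char *datadir)
{
    char from[1024];
    char to[1024];

    assert(datadir != NULL);
    if (build_path(from, sizeof(from), datadir, CONTROL_FILE_PATH_TEMP) != 0 ||
        build_path(to, sizeof(to), datadir, CONTROL_FILE_PATH) != 0)
        return -1;
    return rename(from, to);
}

/* Hypothetical callback type standing in for the actual block rewriting. */
typedef int (*checksum_work_fn) (const char *datadir);

/*
 * Success: the control file is moved back.  Failure of the work step:
 * the control file stays renamed and has to be restored manually.
 */
static int
run_guarded(const char *datadir, checksum_work_fn work)
{
    if (guard_control_file(datadir) != 0)
        return -1;
    if (work(datadir) != 0)
        return -1;
    return restore_control_file(datadir);
}
```

In this sketch the manual step after an aborted run is a single rename
of the temporary name back to `global/pg_control`, which is the same
thing `restore_control_file` does.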

If pg_checksums crashes due to I/O failures or other causes, I can see
how the block it was writing at that moment might need low-level
surgery, but in that case we are already in the domain of forensics, I
guess, and that still does not pertain to the control file?

Sorry for being obtuse, I don't get it.


Michael

-- 
Michael Banck
Project Manager / Senior Consultant
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.banck@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
VAT ID number: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Management: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Our handling of personal data is subject to the
following provisions: https://www.credativ.de/datenschutz

