Re: Online enabling of checksums - Mailing list pgsql-hackers
From | Heikki Linnakangas |
---|---|
Subject | Re: Online enabling of checksums |
Date | |
Msg-id | 6f8675d8-c4db-1fae-6de4-4a0857421c9b@iki.fi Whole thread Raw |
In response to | Re: Online enabling of checksums (Daniel Gustafsson <daniel@yesql.se>) |
Responses |
Re: Online enabling of checksums
Re: Online enabling of checksums |
List | pgsql-hackers |
Hi, The patch looks good to me at a high level. Some notes below. I didn't read through the whole thread, so sorry if some of these have been discussed already. > +void > +ShutdownChecksumHelperIfRunning(void) > +{ > + if (pg_atomic_unlocked_test_flag(&ChecksumHelperShmem->launcher_started)) > + { > + /* Launcher not started, so nothing to shut down */ > + return; > + } > + > + ereport(ERROR, > + (errmsg("checksum helper is currently running, cannot disable checksums"), > + errhint("Restart the cluster or wait for the worker to finish."))); > +} Is there no way to stop the checksum helper once it's started? That seems rather user-unfriendly. I can imagine it being a pretty common mistake to call pg_enable_data_checksums() on a 10 TB cluster, only to realize that you forgot to set the cost limit, and that it's hurting queries too much. At that point, you want to abort. > + * This is intended to create the worklist for the workers to go through, and > + * as we are only concerned with already existing databases we need to ever > + * rebuild this list, which simplifies the coding. I can't parse this sentence. > +extern bool DataChecksumsEnabledOrInProgress(void); > +extern bool DataChecksumsInProgress(void); > +extern bool DataChecksumsDisabled(void); I find the name of the DataChecksumsEnabledOrInProgress() function a bit long. And doing this in PageIsVerified looks a bit weird: > if (DataChecksumsEnabledOrInProgress() && !DataChecksumsInProgress()) I think I'd prefer functions like: /* checksums should be computed on write? */ bool DataChecksumNeedWrite() /* checksum should be verified on read? */ bool DataChecksumNeedVerify() > + <literal>template0</literal> is by default not accepting connections, to > + enable checksums you'll need to temporarily make it accept connections. This was already discussed, and I agree with the other comments that it's bad. > +CREATE OR REPLACE FUNCTION pg_enable_data_checksums ( > + cost_delay int DEFAULT 0, cost_limit int DEFAULT 100) > + RETURNS boolean STRICT VOLATILE LANGUAGE internal AS 'enable_data_checksums' > + PARALLEL RESTRICTED; pg_[enable|disable]_checksums() functions return a bool. It's not clear to me when they would return what. I'd suggest marking them as 'void' instead. > --- a/src/include/storage/bufpage.h > +++ b/src/include/storage/bufpage.h > @@ -194,6 +194,7 @@ typedef PageHeaderData *PageHeader; > */ > #define PG_PAGE_LAYOUT_VERSION 4 > #define PG_DATA_CHECKSUM_VERSION 1 > +#define PG_DATA_CHECKSUM_INPROGRESS_VERSION 2 > This seems like a weird place for these PG_DATA_CHECKSUM_* constants. They're not actually stored in the page header, as you might think. I think the idea was to keep PG_DATA_CHECKSUM_VERSION close to PG_PAGE_LAYOUT_VERSION, because the checksums affected the on-disk format. I think it was a bit weird even before this patch, but now it's worse. At least a better comment would be in order, or maybe move these elsewhere. Feature request: it'd be nice to update the 'ps status' with set_ps_display, to show a rudimentary progress indicator. Like the name and block number of the relation being processed. It won't tell you how much is left to go, but at least it will give a warm fuzzy feeling to the DBA that something is happening. I didn't review the offline tool, pg_verify_checksums. - Heikki
pgsql-hackers by date: