Re: Offline enabling/disabling of data checksums - Mailing list pgsql-hackers
From | Fabien COELHO |
---|---|
Subject | Re: Offline enabling/disabling of data checksums |
Date | |
Msg-id | alpine.DEB.2.21.1903130805330.4059@lancre Whole thread Raw |
In response to | Re: Offline enabling/disabling of data checksums (Michael Paquier <michael@paquier.xyz>) |
Responses |
Re: Offline enabling/disabling of data checksums
|
List | pgsql-hackers |
Michaël-san, > Now the set of patches is: > - 0001, add --enable and --disable. I have tweaked a bit the patch so > as "action" is replaced by "mode" which is more consistent with other > tools like pg_ctl. pg_indent was also complaining about one of the > new enum structures. Patch applies cleanly, compiles, various make check ok, doc build ok. I'm still at odds with the exit(1) behavior when there is nothing to do. If this behavior is kept, I think that the documentation needs to be improved because "failed" does not describe a no-op-was-needed to me. """ If enabling or disabling checksums, the exit status is nonzero if the operation failed. """ Maybe: "... if the operation failed or the requested setting is already active." would at least describe clearly the implemented behavior. + printf(_(" -c, --check check data checksums\n")); + printf(_(" This is the default mode if nothing is specified.\n")); I'm not sure of the punctuation logic on the help line: the first sentence does not end with a ".". I could not find an instance of this style in other help on pg commands. I'd suggest "check data checksums (default)" would work around and be more in line with other commands help. I see a significant locking issue, which I discussed on other threads without convincing anyone. I could do the following things: I slowed down pg_checksums by adding a 0.1s sleep when scanning a new file, then started a "pg_checksums --enable" on a stopped cluster, then started the cluster while the enabling was in progress, then connected and updated data. Hmmm. Then I stopped while the slow enabling was still in progress. Then I could also run a fast pg_checksums --enable in parallel, overtaking the first one... then when the fast one finished, I started the cluster again. When the slow one finished, it overwrote the control file, I had a running cluster with a control file which did not say so, so I could disable the checksum. Hmmm again. Ok, I could not generate a inconsistent state because on stopping the cluster the cluster file is overwritten with the initial state from the point of view of postmater, but it does not look good. I do not think it is a good thing that two commands can write to the data directory at the same time, really. About fsync-ing: ISTM that it is possible that the control file is written to disk while data are still not written, so a failure in between would leave the cluster with an inconsistent state. I think that it should fsync the data *then* update the control file and fsync again on that one. > - 0002, add --no-sync. Patch applies cleanly, compiles, various make checks are ok, doc build ok. Fine with me. -- Fabien.
pgsql-hackers by date: