Re: Offline enabling/disabling of data checksums - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: Offline enabling/disabling of data checksums
Date
Msg-id 20190109013756.GE21835@paquier.xyz
Whole thread Raw
In response to Re: Offline enabling/disabling of data checksums  (Michael Banck <michael.banck@credativ.de>)
List pgsql-hackers
On Tue, Jan 08, 2019 at 01:03:25PM +0100, Michael Banck wrote:
> I changed that to the switches -c/--verify (-c for check as -v is taken,
> should it be --check as well? I personally like verify better), 
> -d/--disable and -e/--enable.

Indeed we could use --check, pg_checksums --check looks repetitive
still that makes the short option more consistent with the rest.

+   printf(_("  -A, --action   action to take on the cluster, can be set as\n"));
+   printf(_("                 \"verify\", \"enable\" and \"disable\"\n"));
Not reflected yet in the --help portion.

>> Also, the full page is rewritten... would it make sense to only overwrite
>> the checksum part itself?
>
> So just writing the page header? I find that a bit scary and don't
> expect much speedup as the OS would write the whole block anyway I
> guess? I haven't touched that yet.

The OS would write blocks of 4kB out of the 8kB as that's the usual
page size, no?  So this could save a lot of I/O.

> I have mostly taken the pg_rewind code here; if there was a function
> that allowed for safe offline changes of the control file, I'd be happy
> to use it but I don't think it should be this patch to invent that.
>
> In any case, I have removed the unlink() now (not sure where that came
> from), and changed it to open(O_WRONLY) same as in Michael's code and
> pg_rewind.

My own stuff in pg_checksums.c does not have an unlink(), anyway...  I
think that there is room for improvement for both pg_rewind and
pg_checksums here.  What about refactoring updateControlFile() and
move it to controldata_utils.c()?  This centralizes the CRC check,
static assertions, file open and writes into a single place.  The
backend has a similar flavor with UpdateControlFile.  By combining
both we need some extra "ifdef FRONTEND" for BasicOpenFile and the
wait events which generates some noise, still both share a lot.  The
backend also includes a fsync() for the control file which happens
when the file is written, but for pg_checksums and pg_rewind we just
do it in one go at the end, so we would need an extra flag to decide
if fsync should happen or not.  pg_rewind has partially the right
interface by passing ControlFileData contents as an argument.

> V2 attached.

+/* Filename components */
+#define PG_TEMP_FILES_DIR "pgsql_tmp"
+#define PG_TEMP_FILE_PREFIX "pgsql_tmp"
This may look strange, but these are needed because pg_checksums
calls some of the sync-related routines which are defined in fd.c.
Amen.

+   if (fsync(fd) != 0)
+   {
+       fprintf(stderr, _("%s: fsync error: %s\n"), progname, strerror(errno));
+       exit(1);
+   }
No need for that as fsync_pgdata() gets called at the end.
--
Michael

Attachment

pgsql-hackers by date:

Previous
From: Masahiko Sawada
Date:
Subject: Re: A few new options for vacuumdb
Next
From: Noah Misch
Date:
Subject: Re: emacs configuration for new perltidy settings