Re: [PATCH] Verify Checksums during Basebackups - Mailing list pgsql-hackers

From Stephen Frost
Subject Re: [PATCH] Verify Checksums during Basebackups
Date
Msg-id 20180305115328.GX2416@tamriel.snowman.net
Whole thread Raw
In response to Re: [PATCH] Verify Checksums during Basebackups  (Michael Banck <michael.banck@credativ.de>)
List pgsql-hackers
Michael,

* Michael Banck (michael.banck@credativ.de) wrote:
> Am Montag, den 05.03.2018, 06:36 -0500 schrieb Stephen Frost:
> > * Michael Banck (michael.banck@credativ.de) wrote:
> > > On Sun, Mar 04, 2018 at 06:19:00PM +0100, Magnus Hagander wrote:
> > > > So sure, if we go with WARNING + exit with an errorcode, that is perhaps
> > > > the best combination of the two.
> > >
> > > I had a look at how to go about this, but it appears to be a bit
> > > complicated; the first problem is that sendFile() and sendDir() don't
> > > have status return codes that could be set on checksum verifcation
> > > failure. So I added a global variable and threw an ereport(ERROR) at the
> > > end of perform_base_backup(), but then I realized that `pg_basebackup'
> > > the client program purges the datadir it created if it gets an error:
> > >
> > > > pg_basebackup: final receive failed: ERROR:  Checksum mismatch during
> > > > basebackup
> > > >
> > > > pg_basebackup: removing data directory "data2"
> >
> > Oh, ugh.
>
> I came up with the attached patch, which sets a checksum_failure
> variable in both basebackup.c and pg_basebackup.c, and emits an ereport
> with (for now) ERRCODE_DATA_CORRUPTED at the end of
> perform_base_backup(), which gets caught in pg_basebackup and then used
> to not cleanup the datadir, but exit with a non-zero exit code.
>
> Does that seem feasible?

Ah, yes, I had thought about using a WARNING or NOTICE or similar also
to pass back the info about the checksum failure during the backup, that
seems like it would work as long as pg_basebackup captures that
information and puts it into a log or on stdout or similar.

I'm a bit on the fence about if we shouldn't just have pg_basebackup
always return a non-zero exit code on a WARNING being seen during the
backup instead.  Given that there's a pretty clear SQL code for this
case, perhaps throwing an ERROR and then checking the SQL code isn't
an issue though.

Thanks!

Stephen

Attachment

pgsql-hackers by date:

Previous
From: Michael Banck
Date:
Subject: Re: [PATCH] Verify Checksums during Basebackups
Next
From: Fabien COELHO
Date:
Subject: Re: pgbench randomness initialization