Re: checksum verification code breaks backups in v16- - Mailing list pgsql-hackers

From Nathan Bossart
Subject Re: checksum verification code breaks backups in v16-
Date
Msg-id Z0-FnG0zQ9xlciQP@nathan
Whole thread Raw
In response to checksum verification code breaks backups in v16-  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: checksum verification code breaks backups in v16-
List pgsql-hackers
On Tue, Dec 03, 2024 at 11:39:43AM -0500, Robert Haas wrote:
> If we want a narrower and thus less-risky fix, we could consider just
> adjusting this code here:
> 
>                 segmentno = atoi(segmentpath + 1);
>                 if (segmentno == 0)
>                     ereport(ERROR,
>                             (errmsg("invalid segment number %d in file \"%s\"",
>                                     segmentno, filename)));

I'm only seeing this code in pg_checksums.  Is that what you are proposing
to change?

> I think we could just replace the ereport() with verify_checksum =
> false (and update the comments). That would leave open the possibility
> of trying to checksum some random file with a name like
> this_file_should_not_be_here, which we'll interpret as segment 0
> because the name does not contain a dot; but if the file were called
> this.file.should.not.be.here, we'd not try to checksum it at all
> because "here" is not an integer. That's a little random, but it
> avoids taking a position on whether
> 025584a168a4b3002e19350bb8db0ebf1fd10235 is fully correct. It simply
> takes the narrower position that we shouldn't fail the entire backup
> because we're unable to perform a checksum calculation that in the
> worst case would only produce a warning anyway.

Hm.

> At the moment, I think I slightly prefer the narrower fix, but it
> might be a little too narrow, since the resultant behavior as
> described in the preceding paragraph is somewhat wonky and
> nonsensical.

Yeah, I'm a little worried that it is too narrow.  I was hoping that the
refactoring commit was older, because then I think we'd be much more
willing to back-patch it.  But there are a couple of v17 releases out there
already, so maybe it's not too much of a stretch.  I guess I'm leaning
towards the back-patching idea...

-- 
nathan



pgsql-hackers by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: Doc: typo in config.sgml
Next
From: Thomas Munro
Date:
Subject: Re: Windows pg_basebackup unable to create >2GB pg_wal.tar tarballs ("could not close file: Invalid argument" when creating pg_wal.tar of size ~ 2^31 bytes)