Re: BUG #18146: Rows reappearing in Tables after Auto-Vacuum Failure in PostgreSQL on Windows - Mailing list pgsql-bugs

From Michael Paquier
Subject Re: BUG #18146: Rows reappearing in Tables after Auto-Vacuum Failure in PostgreSQL on Windows
Date
Msg-id ZjHMcwiltqYXsKYb@paquier.xyz
Whole thread Raw
In response to Re: BUG #18146: Rows reappearing in Tables after Auto-Vacuum Failure in PostgreSQL on Windows  (Thomas Munro <thomas.munro@gmail.com>)
List pgsql-bugs
On Tue, Apr 23, 2024 at 07:48:12PM +1200, Thomas Munro wrote:
> Here is a new attempt to see what it might take to put
> RelationTruncate() into a critical section.  Problems encountered:
> even if you've called mdnblocks() beforehand, dressed up as
> smgrpreparetruncate(), if the highest segment is exactly full then the
> later mdnblocks() again probes whether the next segment number exists
> on disk, which involves GetRelationPath(), which allocates.  So I
> finished up having to write a GetRelationPathInPlace() function, and
> to decide whether it's OK to use a MAXPGPATH-sized array on the stack
> for this.  I also had to teach _fdvec_resize() not to reallocate when
> downsizing, to avoid the critical section assertion.  It seems like
> quite a lot to back-patch... but also awful to leave this trickle of
> data corruption reports unaddressed.  The big re-engineering ideas[1]
> would be absolutely unbackpatchable, but I hope we can work on
> something like that for 18...

For now I'd agree with something like that, even for a backpatch.  No
idea what a good "cheap" solution would look like, though.  Anything
considered these past years for this issue was utterly expensive.

> However, it didn't guarantee that the sync request for the
> truncation got processed before the XLOG_CHECKPOINT_ONLINE record
> was written. By setting XLOG_CHKPT_START, we guarantee that if
> an XLOG_SMGR_TRUNCATE record is written to WAL before the redo
> pointer of some concurrent checkpoint, the sync request queued by
> that operation must be processed by that checkpoint (rather than
> being left for the following one).

s/XLOG_CHKPT_START/DELAY_CHKPT_START/ in the commit message of 0001.

The point you are making about d8725104 seems like a good argument to
do a backpatch down to v14, as the possible interrupt make this setof
issues much easier to reach compared to the Windows-only-weird-syscall
failures happening in the middle of the truncation that people have
been complaining about for N years now.  I am seeing less reports of
these on WIN32 these days with files switched suddenly to read-only,
or is that just my imagination?

The pieces around GetRelationPathInPlace() and MAX_EXTENSION_SIZE
could be a refactoring piece worth their own, split into its own patch
before introducing the core part of the fix with the critical section.
That would be much cleaner, IMO.

It seems to me that an assertion at the end of
GetRelationPathInPlace() about the computed size would be adapted
based on MAXPGPATH.  The interface of _mdfd_segpath() is quite
confusing, actually.  Why return the same pointer as the input
argument rather than void?

I was re-reading your comment about last October, and the fact that we
may stuck replay, and finishing with semi-corrupted undetectible
corruptions is worse than that, still I'm going to agree that failing
harder will serve much better at this stage:
https://www.postgresql.org/message-id/CA%2BhUKGJy9iCBfkjUyV8ZuRwd5CAGxZV1STywe%2B0S%2B9YKH1zF8w%40mail.gmail.com

That may impact availability.  Still that's cheaper than any other
solution discussed.  The one involving the WAL-logging of dirty pages
that would get truncated shortly after comes first into mind..

PS: On HEAD, I'd like to see one or more tests for these failure
scenarios in the long-term and how replay is able to react after
panicking in the critical section.  Now, injection points won't work
inside a critical section as the initial function loading needs to do
a palloc() for the library path.  We've discussed about having a
PRELOAD macro that could be called outside the critical path, before
running a INJECTION_POINT.  Just mentioning.
--
Michael

Attachment

pgsql-bugs by date:

Previous
From: Tom Lane
Date:
Subject: Re: BUG #18451: NULL fails to coerce to string when performing string comparison
Next
From: Jernej Simončič
Date:
Subject: Re: BUG #18453: --exclude-database has problems with capital letters