Re: BUG #18146: Rows reappearing in Tables after Auto-Vacuum Failure in PostgreSQL on Windows - Mailing list pgsql-bugs

From Thomas Munro
Subject Re: BUG #18146: Rows reappearing in Tables after Auto-Vacuum Failure in PostgreSQL on Windows
Msg-id CA+hUKGJy9iCBfkjUyV8ZuRwd5CAGxZV1STywe+0S+9YKH1zF8w@mail.gmail.com
In response to Re: BUG #18146: Rows reappearing in Tables after Auto-Vacuum Failure in PostgreSQL on Windows  (Michael Paquier <michael@paquier.xyz>)
Responses Re: BUG #18146: Rows reappearing in Tables after Auto-Vacuum Failure in PostgreSQL on Windows  (Thomas Munro <thomas.munro@gmail.com>)
List pgsql-bugs
On Thu, Oct 5, 2023 at 11:44 AM Michael Paquier <michael@paquier.xyz> wrote:
> On Thu, Oct 05, 2023 at 10:12:27AM +1300, Thomas Munro wrote:
> > But as for what we should do about it, PANIC (as suggested by several
> > people) seems better than corruption, if we're not going to write some
> > kind of resilience?  How else are we supposed to deal with "this
> > shouldn't happen, and if it does we're hosed?"
>
> A PANIC may be OK for this specific syscall and would be better, but
> the problematic area is larger than that as we'd still finish with a
> corruption as long as there's an ERROR or a FATAL between the moment
> the buffers (potentially dirty, with live-still-dead-in-memory tuples
> on disk) are discarded and the moment the truncation fails.  Another
> method discussed is the use of a critical section (I recall that there
> were some pallocs in this area, actually, but got nothing on my notes
> about that...).

Yeah.  I guess the obvious place for a critical section to start would
be in RelationTruncate() near DELAY_CHKPT_COMPLETE where a similar
concern about recovery is discussed.  There is a comment explaining
that it's a bad idea to use a critical section, but evidently it
overestimated the harmlessness of that choice.  It has a point that if
DO can't truncate, maybe REDO will fail too and you might be stuck in
an eternal samsara of failed recovery, but... that's because your
system isn't doing things we fundamentally need it to do to make
progress, and an administrator needs to find out why.  And although
eternity sounds bad, as far as I can tell from the rare reports we've
had of this failure, it seems to be transient, right?
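For readers less familiar with the mechanism: between START_CRIT_SECTION() and END_CRIT_SECTION(), PostgreSQL promotes any ERROR or FATAL to PANIC, which is why starting one near DELAY_CHKPT_COMPLETE would turn a failed truncation into a crash-and-recover instead of silent corruption. The following is only a toy model of that promotion rule; every name in it (toy_level, toy_start_crit_section, and so on) is made up for illustration and none of it is the real backend code.

```c
#include <assert.h>

/* Toy severity levels, ordered from least to most severe (hypothetical). */
typedef enum { TOY_ERROR, TOY_FATAL, TOY_PANIC } toy_level;

/* Nesting depth of critical sections in this toy model. */
static int toy_crit_section_count = 0;

static void
toy_start_crit_section(void)
{
    toy_crit_section_count++;
}

static void
toy_end_crit_section(void)
{
    assert(toy_crit_section_count > 0);
    toy_crit_section_count--;
}

/*
 * Inside a critical section every reported error becomes a PANIC;
 * outside, the requested level is used unchanged.
 */
static toy_level
toy_effective_level(toy_level requested)
{
    return (toy_crit_section_count > 0) ? TOY_PANIC : requested;
}
```

In this model, a truncation failure reported at TOY_ERROR after the buffers have been discarded would surface as TOY_PANIC as long as the discard and the truncation sit inside the same section.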

Perhaps we could consider adding ftruncate() to the set of horrible
Windows wrapper functions that hide a few sleep-retry loops before
they give up so there's a good chance of avoiding a crash, but it'd be
better if someone more knowledgeable/hands-on with Windows could get
to the bottom of what is causing this and document how to avoid it...
I think we understand why we need that in the other cases (this OS's
peculiar dirent management), and this case seems a bit different.
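To make the sleep-retry idea concrete, here is a rough sketch of what such a wrapper could look like. It is written against POSIX so it can be compiled anywhere, not against the Windows API, and the names (truncate_fn, truncate_with_retry, flaky_truncate) as well as the choice of which errno counts as transient are assumptions for illustration only. The flaky_truncate() stub merely simulates a failure that clears up after a few attempts.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Same shape as ftruncate(), so the real call can be passed in directly. */
typedef int (*truncate_fn) (int fd, off_t length);

/*
 * Call do_truncate() up to max_attempts times, sleeping sleep_ms between
 * attempts, but only while it fails with transient_errno.  Returns 0 on
 * success, or -1 with errno set to the last failure.
 */
static int
truncate_with_retry(truncate_fn do_truncate, int fd, off_t length,
                    int max_attempts, long sleep_ms, int transient_errno)
{
    assert(max_attempts > 0 && sleep_ms >= 0);

    for (int attempt = 1;; attempt++)
    {
        if (do_truncate(fd, length) == 0)
            return 0;

        int save_errno = errno;

        /* Give up on non-transient errors or when out of attempts. */
        if (save_errno != transient_errno || attempt >= max_attempts)
        {
            errno = save_errno;
            return -1;
        }

        struct timespec ts = {
            .tv_sec = sleep_ms / 1000,
            .tv_nsec = (sleep_ms % 1000) * 1000000L
        };
        nanosleep(&ts, NULL);
    }
}

/* Simulation only: fails with EACCES until the counter reaches zero. */
static int flaky_failures_left = 0;

static int
flaky_truncate(int fd, off_t length)
{
    (void) fd;
    (void) length;
    if (flaky_failures_left > 0)
    {
        flaky_failures_left--;
        errno = EACCES;
        return -1;
    }
    return 0;
}
```

With a real descriptor the call would look like truncate_with_retry(ftruncate, fd, new_length, 5, 100, EACCES). The caller still has to decide what to do when the retries run out, which is where the PANIC discussion above comes back in.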

As for Unix, if I guessed right about mmap being involved, the kernel
always lets the truncation proceed but kills the mapping process on
access to a non-backed region of memory.  And for EINTR, handling was
recently added (0d369ac6500), so I guess there may be ways for this to
fail in older branches on some systems (e.g. the old 'interruptible' NFS,
but probably not on normal file systems...).  The only other things I can
think of are EIO, and then various left-field errors caused by the
file system going read-only or the file being 'sealed', etc., and
promotion to PANIC seems appropriate for those cases.
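The policy described in the last paragraph can be sketched as a small errno classifier: retry on EINTR, and treat everything else (EIO, EROFS from a read-only file system, EPERM from a sealed file, and so on) as grounds for escalation. This is an illustration under assumed names (trunc_action, classify_truncate_errno, truncate_or_escalate), and it is not meant to reproduce what commit 0d369ac6500 actually does.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* What the caller should do after a truncation attempt (hypothetical). */
typedef enum { TRUNC_OK, TRUNC_RETRY, TRUNC_PANIC } trunc_action;

/* Map an errno value (0 meaning success) to an action. */
static trunc_action
classify_truncate_errno(int err)
{
    switch (err)
    {
        case 0:
            return TRUNC_OK;
        case EINTR:
            /* Interrupted by a signal: simply try again. */
            return TRUNC_RETRY;
        default:
            /* EIO, EROFS, EPERM, ...: nothing sensible left to try. */
            return TRUNC_PANIC;
    }
}

/* Retry ftruncate() on EINTR; report whether it worked or must escalate. */
static trunc_action
truncate_or_escalate(int fd, off_t length)
{
    assert(length >= 0);

    for (;;)
    {
        int err = (ftruncate(fd, length) == 0) ? 0 : errno;
        trunc_action action = classify_truncate_errno(err);

        if (action != TRUNC_RETRY)
            return action;
    }
}
```

A caller would PANIC (or its local equivalent) on TRUNC_PANIC; the EINTR loop is unbounded here on the assumption that interruptions are short-lived.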


