Re: BUG #18146: Rows reappearing in Tables after Auto-Vacuum Failure in PostgreSQL on Windows - Mailing list pgsql-bugs

From Robert Haas
Subject Re: BUG #18146: Rows reappearing in Tables after Auto-Vacuum Failure in PostgreSQL on Windows
Msg-id CA+TgmobH07rpdxVnXN6NgUjwK0-K9DpW02LHuE-bx6mFoNHn=Q@mail.gmail.com
In response to Re: BUG #18146: Rows reappearing in Tables after Auto-Vacuum Failure in PostgreSQL on Windows  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
On Wed, Oct 4, 2023 at 7:03 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > But as for what we should do about it, PANIC (as suggested by several
> > people) seems better than corruption, if we're not going to write some
> > kind of resilience?
>
> Maybe that's an acceptable answer now ... it's not great, but nobody
> is in love with any of the other options either.  And it would definitely
> get DBAs' attention about this misbehavior of their file systems.

I and others, including Andres, have been thinking that a PANIC is the
right option for some time.

Quoth I in
https://www.postgresql.org/message-id/CA%2BTgmobwc_Rdaw%2B6TupT4_g9z55JjL%3DvhwpphsQe%3DYmBN0OPDg%40mail.gmail.com
some 2 years ago...
> As you say, this doesn't fix the problem that truncation might fail.
> But as Andres and Sawada-san said, the solution to that is to get rid
> of the comments saying that it's OK for truncation to fail and make it
> a PANIC. However, I don't think that change needs to be part of this
> patch. Even if we do that, we still need to do this. And even if we do
> this, we still need to do that.

I think the only reasons that I didn't do it at the time were (a)
shortage of round tuits and (b) fear of being yelled at. But the
comment is wrong, and a critical section is right.

I do think that it's nice to be tolerant of bad filesystem behavior
when we can. For instance if we try to write() some data to the OS and
it fails for some transient reason, it's nice if we can try to write()
it again. But there are always going to be cases where that sort of
tolerance is not practical. Having PostgreSQL continue to operate when
the filesystem isn't operating is a luxury, and we can't afford it in
every situation. shared_buffers provides a layer of insulation between
the logical act of modifying a buffer and the need to have a system
call succeed -- dirtying the buffer is in effect making a note that
the write() needs to be done later, instead of actually doing it in
the moment. And since the code that actually writes it is
checkpoint-aware and write-outs can be retried, we can avoid
panicking. But for operations such as creating, removing, or
truncating relations, there is no similar, general layer of insulation
-- we have no mechanism that allows us to logically do those things
now and have them actually happen at the FS level later. Which, to me,
seems to mean that we have little choice but to panic if they fail.
Otherwise, the primary diverges from any standbys that it has.
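
To make that contrast concrete, here is a toy sketch in Python. It is
not PostgreSQL code and does not reflect how the buffer manager or
smgr are actually written; FlakyDisk, BufferPool, Panic, and
truncate_relation are all hypothetical names invented for
illustration. It only models the point above: a dirtied buffer is a
note that a write is owed and can be retried, while a truncation has
nowhere to be deferred to.

```python
class FlakyDisk:
    """Toy storage (hypothetical): calls fail a set number of times, then succeed."""

    def __init__(self, failures_before_success=0):
        self.failures_left = failures_before_success
        self.pages = {}

    def _maybe_fail(self):
        if self.failures_left > 0:
            self.failures_left -= 1
            raise OSError("simulated transient I/O failure")

    def write(self, page_no, data):
        self._maybe_fail()
        self.pages[page_no] = data

    def truncate(self, new_length):
        self._maybe_fail()
        self.pages = {n: d for n, d in self.pages.items() if n < new_length}


class Panic(Exception):
    """Stands in for a PANIC: stop rather than carry on in an inconsistent state."""


class BufferPool:
    """Toy stand-in for the insulation that shared_buffers provides."""

    def __init__(self, disk):
        self.disk = disk
        self.dirty = {}

    def modify(self, page_no, data):
        # Logical change only: no system call has to succeed at this point.
        self.dirty[page_no] = data

    def checkpoint(self):
        # A failed write-out leaves the page dirty, so a later pass can retry it.
        for page_no in list(self.dirty):
            try:
                self.disk.write(page_no, self.dirty[page_no])
            except OSError:
                continue
            del self.dirty[page_no]
        return not self.dirty  # True once everything has reached the disk


def truncate_relation(disk, new_length):
    # No deferral layer exists here: the change happens now or we must stop.
    try:
        disk.truncate(new_length)
    except OSError as exc:
        raise Panic("truncation failed") from exc
```

With a disk that fails once, a modified page survives the first
failed checkpoint() and is written by the second, whereas
truncate_relation() on the same kind of disk raises Panic
immediately, because there is no place to record "truncate this
later".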

I also think that's OK. Unreliable filesystems lead to unreliable
databases, and it's better to find that out before something really
bad happens. Maybe in the future we'll develop more general mechanisms
for some of this stuff and maybe that will allow us to avoid panics in
more cases, and then we can debate the merits of such changes. But
right now, the cost of avoiding a panic here is a corrupted database,
and I have to believe that the overwhelming majority of users would
think that a corrupted database is worse.

--
Robert Haas
EDB: http://www.enterprisedb.com


