On Thu, Nov 5, 2020 at 10:47 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> I still feel 'cached' is a better name.
Amusingly, this thread is hitting all the hardest problems in computer
science according to the well known aphorism...
Here's a devil's advocate position I thought about: It's OK to leave
stray buffers (clean or dirty) in the buffer pool if files are
truncated underneath us by gremlins, as long as your system eventually
crashes before completing a checkpoint. The OID can't be recycled
until after a successful checkpoint, so the stray blocks can't be
confused with the blocks of another relation, and weird errors are
expected on a system that is in serious trouble. It's actually much
worse that we can give incorrect answers to queries when files are
truncated by gremlins (in the window of time before we presumably
crash because of EIO), because we're violating basic ACID principles
in user-visible ways. In this thread, discussion has focused on
availability (ie avoiding failures when trying to write back stray
buffers to a file that has been unlinked), but really a system that
can't see arbitrary committed transactions *shouldn't be available*.
This argument applies whether you think SEEK_END can only give weird
answers in the specific scenario I demonstrated with NFS, or whether
you think it's arbitrarily b0rked and reports random numbers: we
fundamentally can't tolerate that, so why are we trying to?