Re: Assert in heapgettup_pagemode() fails due to underlying buffer change - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: Assert in heapgettup_pagemode() fails due to underlying buffer change
Date
Msg-id CA+hUKGK9KfChzZxgtXp=6iwzjRNHbMqS3Hje+jM6__z2XAFR7A@mail.gmail.com
Whole thread Raw
In response to Re: Assert in heapgettup_pagemode() fails due to underlying buffer change  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Assert in heapgettup_pagemode() fails due to underlying buffer change
List pgsql-hackers
On Sat, Jun 8, 2024 at 12:47 AM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Fri, Jun 7, 2024 at 4:05 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> > >  static void
> > > -ZeroBuffer(Buffer buffer, ReadBufferMode mode)
> > > +ZeroBuffer(Buffer buffer, ReadBufferMode mode, bool zero)
> >
> > This change makes the API very strange.  Should the function be called
> > ZeroAndLockBuffer() instead?  Then the addition of a "bool zero"
> > argument makes a lot more sense.
>
> I agree that's better, but it still looks a bit weird. You have to
> realize that 'bool zero' means 'is already zeroed' here -- or at
> least, I guess that's the intention. But then I wonder why you'd call
> a function called ZeroAndLockBuffer if all you need to do is
> LockBuffer.

The name weirdness comes directly from RBM_ZERO_AND_LOCK (the fact
that it doesn't always zero despite shouting ZERO is probably what
temporarily confused me).  But coming up with a better name is hard
and I certainly don't propose to change it now.  I think it's
reasonable for this internal helper function to have that matching
name as Alvaro suggested, with a good comment about that.

Even though that quick-demonstration change fixed the two reported
repros, I think it is still probably racy (or if it isn't, it relies
on higher level interlocking that I don't want to rely on).  This case
really should be using the standard StartBufferIO/TerminateBufferIO
infrastructure as it was before.  I had moved that around to deal with
multi-block I/O, but dropped the ball on the zero case... sorry.

Here's a version like that.  The "zero" argument (yeah that was not a
good name) is now inverted and called "already_valid", but it's only a
sort of optimisation for the case where we already know for sure that
it's valid.  If it isn't, we do the standard
BM_IO_IN_PROGRESS/BM_VALID/CV dance, for correct interaction with any
concurrent read or zero operation.

Attachment

pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: PgStat_KindInfo.named_on_disk not required in shared stats
Next
From: Alexander Lakhin
Date:
Subject: Re: Test slots invalidations in 035_standby_logical_decoding.pl only if dead rows are removed