Re: Buffer locking is special (hints, checksums, AIO writes) - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Buffer locking is special (hints, checksums, AIO writes)
Date
Msg-id yn4f45z6wl6wstuwha3qjsicrc6uvymkldtqz2ttchz7zdue3y@n6fll5dcter4
Whole thread
In response to Re: Buffer locking is special (hints, checksums, AIO writes)  (Melanie Plageman <melanieplageman@gmail.com>)
Responses Re: Buffer locking is special (hints, checksums, AIO writes)
List pgsql-hackers
Hi,

On 2026-03-25 17:34:33 -0400, Melanie Plageman wrote:
> On Wed, Mar 11, 2026 at 6:40 PM Andres Freund <andres@anarazel.de> wrote:
> >
> > I pushed this and many of the later patches in the series.  Here are updated
> > versions of the remaining changes.  The last two previously were one commit
> > with "WIP" in the title. The first one has, I think, not had a lot of review -
> > but it's also not a complicated change.
> 
> 0001 looks good except for the comment above PageSetChecksum() that
> says it is only for shared buffers and a stray reference to the
> no-longer-present bufToWrite variable in a comment around line 4490 in
> bufmgr.c

Thanks for catching these.

Updated the PageSetChecksum() comment to

 * Set checksum on a page.
 *
 * If the page is in shared buffers, it needs to be locked in at least
 * share-exclusive mode.
...


> 0002
> diff --git a/src/backend/access/nbtree/nbtpage.c
> b/src/backend/access/nbtree/nbtpage.c
> index cc9c45dc40c..ad700e590e8 100644
> --- a/src/backend/access/nbtree/nbtpage.c
> +++ b/src/backend/access/nbtree/nbtpage.c
> @@ -1011,24 +1011,48 @@ _bt_relandgetbuf(Relation rel, Buffer obuf,
> BlockNumber blkno, int access)
>     Assert(BlockNumberIsValid(blkno));
>     if (BufferIsValid(obuf))
> -       _bt_unlockbuf(rel, obuf);
> -   buf = ReleaseAndReadBuffer(obuf, rel, blkno);
> -   _bt_lockbuf(rel, buf, access);
> +   {
> +       if (BufferGetBlockNumber(obuf) == blkno)
> +       {
> +           /* trade in old lock mode for new lock */
> +           _bt_unlockbuf(rel, obuf);
> +           buf = obuf;
> +       }
> +       else
> +       {
> +           /* release lock and pin at once, that's a bit more efficient */
> +           _bt_relbuf(rel, obuf);
> +           buf = ReadBuffer(rel, blkno);
> +       }
> +   }
> +   else
> +       buf = ReadBuffer(rel, blkno);
> 
> Not related to this patch, but why do we unlock and relock it when
> obuf has the block we need? Couldn't we pass lock mode and then just
> do nothing if it is the right lockmode?

I think it's very unlikely that it's called at any frequency with the same
buffer and lockmode. What would be the point of calling _bt_relandgetbuf() if
that's the case.


> Setting that aside, I presume we don't need to check the fork and
> relfilelocator (as ReleaseAndReadBuffer() did) because this code knows
> it will be the same?

Yea, it's a single index, so there can't be a different relfilenode.


> 0003
> AFAICT, this does what you claim. I don't really know what else to
> look when reviewing it, if I'm being honest. As such, I diligently fed
> it through AI which suggested you may have lost a
>         VALGRIND_MAKE_MEM_NOACCESS(BufHdrGetBlock(buf), BLCKSZ);
> which sounds right to me and like something you should fix.

Good catch Melai.


> Also, I'd say this comment
> +   /*
> +    * Now okay to allow cancel/die interrupts again, were held when the lock
> +    * was acquired.
> +    */
> 
> needs a "which" after the comma to read smoothly.

Fixed.


Running it through valgrind and then will work on reading through one more
time and pushing them.

Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: SATYANARAYANA NARLAPURAM
Date:
Subject: Re: LockHasWaiters() crashes on fast-path locks
Next
From: Matthias van de Meent
Date:
Subject: Re: SQL-level pg_datum_image_equal