Re: BUG #19006: Assert(BufferIsPinned) in BufferGetBlockNumber() is triggered for forwarded buffer - Mailing list pgsql-bugs

From Xuneng Zhou
Subject Re: BUG #19006: Assert(BufferIsPinned) in BufferGetBlockNumber() is triggered for forwarded buffer
Date
Msg-id CABPTF7XMsoEUfQ-jn96O7hqKG5C+_1rBRwisRYyDidicOaz-Bw@mail.gmail.com
Whole thread Raw
In response to Re: BUG #19006: Assert(BufferIsPinned) in BufferGetBlockNumber() is triggered for forwarded buffer  (Thomas Munro <thomas.munro@gmail.com>)
List pgsql-bugs
Hi,

On Wed, Aug 13, 2025 at 2:02 PM Thomas Munro <thomas.munro@gmail.com> wrote:
>
> On Wed, Aug 13, 2025 at 5:29 PM Xuneng Zhou <xunengzhou@gmail.com> wrote:
> > >  1. The leading block is BM_VALID, so we choose to give it to you
> > > immediately and not look further (we could look further and return
> > > more than one consecutive BM_VALID block at a time, but this isn't
> > > implemented)
> >
> > I am curious about why this isn't implemented. It looks helpful.
> > Is there any blocking issue or trade-off for not doing so?
>
> stream->distance tends to be 1 for well cached data (it grows when IO
> is needed and it is worth looking far ahead, and shrinks when data is
> found in cache).  So we won't currently have a distance > 1 for well
> cached data, so we wouldn't benefit much.  That may not always be
> true: if the buffer mapping table is changed to a tree data structure
> (like the way kernels manage virtual memory pages associated with a
> file or other VM object), then we might have an incentive to look up
> lots of cached blocks at the same time, and then we might want to
> change the distance tuning algorithm, and then we might want
> StartReadBuffers() to be able to return more than one cached block at
> a time.  We've focused mainly on I/O so far, and just tried not to
> make the well cached cases worse.

Understood. I came across Alexander’s article on the topic. It appears
to require a major re-architecture, which could be challenging to land
HEAD.
https://www.orioledb.com/blog/buffer-management

>
> > The format of this part is not aligned well in gmail, so I copy it into vs code.
> > Is this layout right?
>
> Yes.
>
> > I found second illustration somewhat hard to follow, especially
> > the "do nothing" trick and the movement of next_buffef_index in the second queue.
> > Maybe I need to read the corresponding code.
>
> Suppose you have pending_read_blocknum = 100, pending_read_nblock = 5,
> next_buffer_index = 200.  Now you decide to start that read, because
> the next block the caller wants can't be combined, so you call
> StartReadBuffers(blocknum = 100, *nblocks = 4, buffers =
> &buffers[200]).  StartReadBuffers() pins 5 buffers and starts an IO,
> but finds a reason to split it at size 2.  It sets *nblocks to 2, and
> returns.  Now read_stream.c adds 2 to pending_read_blocknum, subtracts
> 2 from pending_read_nblocks, so that it represents the continuation of
> the previous read.  It also adds 2 to next_buffer_index with modulo
> arithmetic (so it wraps around the queue), because that is the
> location of the buffers for the next read, and sets forwarded_buffers
> to 3, because there are 3 buffers already pinned (in that patch ^ it
> is renamed to pending_read_npinned).  The three buffers themselves are
> already in the queue at the right position for StartReadBuffers() to
> receive them and know that it doesn't have to pin them again.  What I
> was referring to with "doing nothing" is the way that if you keep
> sliding StartReadBuffers() call along the queue, the extra buffers it
> spits out become its input next time, without having to be copied
> anywhere.  In that picture there are two forwarded buffers, labeled 9
> and A.


Got it. Elegant technique. Patch 2 also reads as a robustness
improvement now. :-) I'll try to map the high-level design to code and
review patch 2.
Thanks for the great guidance!

Best,
Xuneng



pgsql-bugs by date:

Previous
From: PG Bug reporting form
Date:
Subject: BUG #19019: Feature Request: allow the use of column reference in DEFAULT expression
Next
From: "David G. Johnston"
Date:
Subject: Re: BUG #19019: Feature Request: allow the use of column reference in DEFAULT expression