Re: Using read stream in autoprewarm - Mailing list pgsql-hackers

From: Nazir Bilal Yavuz
Subject: Re: Using read stream in autoprewarm
Date:
Msg-id: CAN55FZ0TBBmrJ2vtMQ9rEk-NTL2BWQzavVp=iRLOUskm+zvNNw@mail.gmail.com
In response to: Re: Using read stream in autoprewarm (Melanie Plageman <melanieplageman@gmail.com>)
List: pgsql-hackers
Hi,

On Wed, 2 Apr 2025 at 18:54, Melanie Plageman <melanieplageman@gmail.com> wrote:
>
> On Wed, Apr 2, 2025 at 6:26 AM Nazir Bilal Yavuz <byavuz81@gmail.com> wrote:
> >
> > On Wed, 2 Apr 2025 at 01:36, Melanie Plageman <melanieplageman@gmail.com> wrote:
> > >
> > > I know you mentioned off-list that you don't like the handling of
> > > global objects in my version, but I prefer doing it this way (even
> > > though we have to check for in the loop condition) to having to set
> > > the current database once we reach non-shared objects. It feels too
> > > fiddly. This way seems less error prone. Looking at this version, what
> > > do you think? Could we do it better?
> >
> > I think there might be a problem with that approach. Let's say that we
> > are able to open relation when database oid = 0 and filenumber = 18.
> > Then we are trying to find a valid fork now. We couldn't find a valid
> > fork immediately, so we continued looping. Then database oid is
> > changed from 0 to let's say 1 but filenumber remains the same. We are
> > still in the valid fork loop, so relation remains from the database
> > oid = 0. Isn't that wrong?
>
> Yep, you are totally right. The code was wrong. We could fix it by
> setting current_db to the valid database once we've prewarmed the
> global objects, but we need that logic in three places, so that seems
> quite undesirable.
>
> In attached v9, I've added a patch to apw_load_buffers() which invokes
> autoprewarm_database_main() for the global objects alone but while
> connected to the first valid database. It's not the best solution but
> I think it is better than having that fiddly logic everywhere about
> database 0.

I like this approach. I think it is better than handling global objects
in autoprewarm_database_main().

> This made me think, I wonder if we could connect to template0 or
> template1 to prewarm the global objects. Then we could also prewarm if
> only global objects are present (that doesn't seem very important but
> it would be a side effect). It might be more clear to connect to
> template0/1 instead of the first valid database to prewarm global
> objects. I don't know if there is some reason not to do this -- like
> maybe bg workers aren't allowed or something?

I am not sure but I think the current implementation is good enough.

> > 0001:
> >
> > ReadBufferExtended() can be called in its own minimal loop, otherwise
> > we end up doing unnecessary checks for each ReadBufferExtended() call.
> > This is not a problem when the 0002 is applied.
>
> Could you provide a snippet of example code? If we call
> ReadBufferExtended() in a loop, the block won't be changing, so I
> don't see how that will help.

I don't have example code right now, but what I mean is that we may
call ReadBufferExtended() in a loop over the blocks of the same fork.
We don't need to call smgrexists() and
RelationGetNumberOfBlocksInFork() for each block; we can call them
once per fork instead of once per block. However, as I said before,
this doesn't matter once the read stream code is applied.

> > 0002:
> >
> > We don't skip blocks whose blocknum is more than nblocks_in_fork. We
> > can add that either to a stream callback like you did before or after
> > the read_stream_end. I prefer stream callback because of the reason
> > below [1].
>
> Yep, I also thought we had to have that logic, but because we sort by
> db,rel,fork,blkno, I think blocks with blocknumber >= nblocks_in_fork
> will be last and so we just want to move on to the next fork.

I agree that they will be the last, but won't we end up creating a
read stream object for each block?

> > > On another topic, what are the minimal places we need to call
> > > have_free_buffers() (in this version)? I haven't even started looking
> > > at the last patch you've been sending that is about checking the
> > > freelist. I'll have to do that next.
> >
> > I think its current places are good enough. We may add one after the
> > read_stream_end if we want to handle blk->blocknum >= nblocks_in_fork
> > after the read stream finishes. If we handle that in the stream
> > callback then no need to add have_free_buffers() [1].
>
> As long as we have it in the callback, I don't think that we need
> have_free_buffers() after read_stream_end() since it is in the while
> loop condition which we will immediately execute after
> read_stream_end().

This works for me.

> I was also wondering about the other patch in your earlier set that
> set stop_idx from get_number_of_free_buffers(). Could you tell me more
> about that? What does it do and why is it needed with the read stream
> but wasn't needed before?

In the read stream code, we use callbacks to build larger I/Os. These
I/Os aren't submitted until we reach io_combine_limit or hit a
non-sequential block number. In other words, when the
have_free_buffer() function returns false in the callback, there are
still queued blocks in the stream even though there are no free
buffers left in the buffer pool. So we can end up issuing I/Os larger
than the number of free buffers in shared buffers.

To mitigate that, we query the actual number of free buffers in shared
buffers and use that count to limit the stream. This optimization
isn't perfect: if other processes fill shared buffers while the read
stream is running, it won't work well.

--
Regards,
Nazir Bilal Yavuz
Microsoft
