Home > mailing lists

Re: index prefetching - Mailing list pgsql-hackers

From	Tomas Vondra
Subject	Re: index prefetching
Date	July 19, 2025 14:23:21
Msg-id	23f490f4-8325-408c-91a0-a6757ab2441c@vondra.me Whole thread Raw
In response to	Re: index prefetching (Thomas Munro <thomas.munro@gmail.com>)
Responses	Re: index prefetching
List	pgsql-hackers

Tree view

On 7/19/25 06:03, Thomas Munro wrote:
> On Sat, Jul 19, 2025 at 6:31 AM Tomas Vondra <tomas@vondra.me> wrote:
>> Perhaps the ReadStream should do something like this? Of course, the
>> simple patch resets the stream very often, likely mcuh more often than
>> anything else in the code. But wouldn't it be beneficial for streams
>> reset because of a rescan? Possibly needs to be optional.
> 
> Right, that's also discussed, with a similar patch, here:
> 
> https://www.postgresql.org/message-id/CA%2BhUKG%2Bx2BcqWzBC77cN0ewhzMF0kYhC6c4G_T2gJLPbqYQ6Ow%40mail.gmail.com
> 
> Resetting the distance was a short-sighted mistake: I was thinking
> about rescans, the original use case for the reset operation, and
> guessing that the data would remain cached.  But all the new users of
> _reset() have a completely different motivation, namely temporary
> exhaustion in their source data, so that guess was simply wrong.

Thanks for the link. It seems I came up with an almost the same patch,
with three minor differences:

1) There's another place that sets "distance = 0" in
read_stream_next_buffer, so maybe this should preserve the distance too?

2) I suspect we need to preserve the distance at the beginning of
read_stream_reset, like

  stream->reset_distance = Max(stream->reset_distance,
                               stream->distance);

because what if you call _reset before reaching the end of the stream?

3) Shouldn't it reset the reset_distance to 0 after restoring it?

> There was also some discussion at the time about whether "reset so I
> can rescan", and "reset so I can continue after a temporary stop"
> should be different operations requiring different APIs.  It now seems
> like one operation is sufficient, but it should preserve the distance
> as you showed and then let the algorithm learn about already-cached
> data in the rescan case (if it is even true then, which is also
> debatable since it depends on the size of the scan).  So, I think we
> should just go ahead and commit a patch like that.

Not sure. To me it seems more like two distinct cases, but I'm not sure
if it requires two distinct "operations" with distinct API. Perhaps a
simple flag for the _reset() would be enough? It'd need to track the
distance anyway, just in case.

Consider for example a nested loop, which does a rescan every time the
outer row changes. Is there a reason to believe the outer rows will need
the same number of inner rows? Aren't those "distinct streams"? Maybe
I'm thinking about this wrong, of course.

The thing that however concerns me is that what I observed was not the
distance getting reset to 1, and then ramping up. Which should happen
pretty quickly, thanks to the doubling. In my experiments it *never*
ramped up again, it stayed at 1. I still don't quite understand why.

If this is happening for the nestloop case too, that'd be quite bad.

regards

-- 
Tomas Vondra

pgsql-hackers by date:

From: Shlok Kyal
Date: 19 July 2025, 13:47:06
Subject: Re: Skipping schema changes in publication

From: Amit Kapila
Date: 19 July 2025, 14:39:49
Subject: Re: Improve pg_sync_replication_slots() to wait for primary to advance

Re: index prefetching - Mailing list pgsql-hackers

Previous

Next