Re: index prefetching - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: index prefetching
Date
Msg-id CA+hUKG+WWr4-8TYemyU=ucQsNe6bUBN_Sq3mCnBoBtxaJ9w3ug@mail.gmail.com
Whole thread Raw
In response to Re: index prefetching  (Tomas Vondra <tomas@vondra.me>)
List pgsql-hackers
On Sat, Jul 19, 2025 at 11:23 PM Tomas Vondra <tomas@vondra.me> wrote:
> Thanks for the link. It seems I came up with an almost the same patch,
> with three minor differences:
>
> 1) There's another place that sets "distance = 0" in
> read_stream_next_buffer, so maybe this should preserve the distance too?
>
> 2) I suspect we need to preserve the distance at the beginning of
> read_stream_reset, like
>
>   stream->reset_distance = Max(stream->reset_distance,
>                                stream->distance);
>
> because what if you call _reset before reaching the end of the stream?
>
> 3) Shouldn't it reset the reset_distance to 0 after restoring it?

Probably.  Hmm... an earlier version of this code didn't use distance
== 0 to indicate end-of-stream, but instead had a separate internal
end_of_stream flag.  If we brought that back and didn't clobber
distance, we wouldn't need this save-and-restore dance.  It seemed
shorter and sweeter without it back then, before _reset() existed in
its present form, but I wonder if end_of_stream would be nicer than
having to add this kind of stuff, without measurable downsides.

> > There was also some discussion at the time about whether "reset so I
> > can rescan", and "reset so I can continue after a temporary stop"
> > should be different operations requiring different APIs.  It now seems
> > like one operation is sufficient, but it should preserve the distance
> > as you showed and then let the algorithm learn about already-cached
> > data in the rescan case (if it is even true then, which is also
> > debatable since it depends on the size of the scan).  So, I think we
> > should just go ahead and commit a patch like that.
>
> Not sure. To me it seems more like two distinct cases, but I'm not sure
> if it requires two distinct "operations" with distinct API. Perhaps a
> simple flag for the _reset() would be enough? It'd need to track the
> distance anyway, just in case.
>
> Consider for example a nested loop, which does a rescan every time the
> outer row changes. Is there a reason to believe the outer rows will need
> the same number of inner rows? Aren't those "distinct streams"? Maybe
> I'm thinking about this wrong, of course.

Good question.  Yeah, your flag idea seems like a good way to avoid
baking opinion into this level.  I wonder if it should be a bitmask
rather than a boolean, in case we think of more things that need to be
included or not when resetting.

> The thing that however concerns me is that what I observed was not the
> distance getting reset to 1, and then ramping up. Which should happen
> pretty quickly, thanks to the doubling. In my experiments it *never*
> ramped up again, it stayed at 1. I still don't quite understand why.

Huh.  Will look into that on Monday.



pgsql-hackers by date:

Previous
From: Alexander Korotkov
Date:
Subject: Re: Document slot's restart_lsn can go backward
Next
From: jian he
Date:
Subject: get_rule_expr RelabelType node does not print COLLATE clause