Re: index prefetching - Mailing list pgsql-hackers
From | Melanie Plageman |
---|---|
Subject | Re: index prefetching |
Date | |
Msg-id | CAAKRu_btSd9taRfwthMNKYxyAGw_AG6mF46OyQ42hBOACxGMLw@mail.gmail.com |
In response to | Re: index prefetching (Melanie Plageman <melanieplageman@gmail.com>) |
List | pgsql-hackers |
On Wed, Feb 14, 2024 at 11:40 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:
>
> On Tue, Feb 13, 2024 at 2:01 PM Tomas Vondra
> <tomas.vondra@enterprisedb.com> wrote:
> >
> > On 2/7/24 22:48, Melanie Plageman wrote:
> > > ...
> > > - switching scan directions
> > >
> > > If the index scan switches directions on a given invocation of
> > > IndexNext(), heap blocks may have already been prefetched and read for
> > > blocks containing tuples beyond the point at which we want to switch
> > > directions.
> > >
> > > We could fix this by having some kind of streaming read "reset"
> > > callback to drop all of the buffers which have been prefetched which
> > > are now no longer needed. We'd have to go backwards from the last TID
> > > which was yielded to the caller and figure out which buffers in the
> > > pgsr buffer ranges are associated with all of the TIDs which were
> > > prefetched after that TID. The TIDs are in the per_buffer_data
> > > associated with each buffer in pgsr. The issue would be searching
> > > through those efficiently.
> > >
> >
> > Yeah, that's roughly what I envisioned in one of my previous messages
> > about this issue - walking back the TIDs read from the index and added
> > to the prefetch queue.
> >
> > > The other issue is that the streaming read API does not currently
> > > support backwards scans. So, if we switch to a backwards scan from a
> > > forwards scan, we would need to fallback to the non streaming read
> > > method. We could do this by just setting the TID queue size to 1
> > > (which is what I have currently implemented). Or we could add
> > > backwards scan support to the streaming read API.
> > >
> >
> > What do you mean by "support for backwards scans" in the streaming read
> > API? I imagined it naively as
> >
> > 1) drop all requests in the streaming read API queue
> >
> > 2) walk back all "future" requests in the TID queue
> >
> > 3) start prefetching as if from scratch
> >
> > Maybe there's a way to optimize this and reuse some of the work more
> > efficiently, but my assumption is that the scan direction does not
> > change very often, and that we process many items in between.
>
> Yes, the steps you mention for resetting the queues make sense. What I
> meant by "backwards scan is not supported by the streaming read API"
> is that Thomas/Andres had mentioned that the streaming read API does
> not support backwards scans right now. Though, since the callback just
> returns a block number, I don't know how it would break.
>
> When switching between a forwards and backwards scan, does it go
> backwards from the current position or start at the end (or beginning)
> of the relation?

Okay, well I answered this question for myself, by, um, trying it :).
FETCH backward will go backwards from the current cursor position. So,
I don't see exactly why this would be an issue.

> If it is the former, then the blocks would most
> likely be in shared buffers -- which the streaming read API handles.
> It is not obvious to me from looking at the code what the gap is, so
> perhaps Thomas could weigh in.

I have the same problem with the sequential scan streaming read user,
so I am going to try and figure this backwards scan and switching scan
direction thing there (where we don't have other issues).

- Melanie
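
Editorial sketch (not part of the original message): the three reset steps
quoted above (drop the pending requests, walk back the "future" TIDs, start
prefetching as if from scratch) can be modelled with a toy queue. This is not
PostgreSQL code and does not use the streaming read API; the class
TidPrefetchQueue and all of its fields are hypothetical names chosen for
illustration. It assumes the scan resumes from the last TID yielded to the
caller, as FETCH backward does, and it ignores end-of-scan cursor positioning.

```python
from collections import deque


class TidPrefetchQueue:
    """Toy model of a TID queue feeding a heap-block prefetcher (hypothetical)."""

    def __init__(self, index_tids, distance):
        self.index_tids = index_tids  # (block, offset) pairs in index order
        self.distance = distance      # how many TIDs to keep queued ahead
        self.direction = 1            # +1 forward, -1 backward
        self.current = -1             # index position last yielded to the caller
        self.next_pos = 0             # next index position to read into the queue
        self.queue = deque()          # positions "prefetched" but not yet yielded
        self.dropped = []             # positions whose prefetch was thrown away

    def _reset(self, direction):
        # 1) drop all requests that are queued but not yet consumed
        self.dropped.extend(self.queue)
        self.queue.clear()
        # 2) walk back the "future" TIDs: resume next to the last yielded TID
        self.next_pos = self.current + direction
        self.direction = direction
        # 3) prefetching restarts from scratch on the next _fill() call

    def _fill(self):
        # Appending a position stands in for issuing a prefetch of its heap block.
        while (len(self.queue) < self.distance
               and 0 <= self.next_pos < len(self.index_tids)):
            self.queue.append(self.next_pos)
            self.next_pos += self.direction

    def next(self, direction):
        if direction != self.direction:
            self._reset(direction)
        self._fill()
        if not self.queue:
            return None               # ran off the end in this direction
        self.current = self.queue.popleft()
        return self.index_tids[self.current]


if __name__ == "__main__":
    tids = [(1, 1), (1, 2), (2, 1), (3, 1), (4, 1)]
    scan = TidPrefetchQueue(tids, distance=3)
    print(scan.next(1))    # (1, 1)
    print(scan.next(1))    # (1, 2)
    print(scan.next(-1))   # (1, 1), backwards from the current position
    print(scan.dropped)    # [2, 3], the prefetched positions that were wasted
```

The dropped list makes the cost of a direction switch visible: everything
queued past the last yielded TID is discarded. That cost is acceptable only
under the assumption stated in the thread, that direction changes are rare
and many items are processed in between.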