Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan
Date
Msg-id CAH2-Wz=Ts_kouW0WJ=Us_DdN1TuLgEFxQS1dBrGicwGvZr8gOQ@mail.gmail.com
Whole thread Raw
In response to Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan  (Matthias van de Meent <boekewurm+postgres@gmail.com>)
List pgsql-hackers
On Thu, Jul 27, 2023 at 10:00 AM Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:
> My idea is not quite block nested loop join. It's more 'restart the
> index scan at the location the previous index scan ended, if
> heuristics say there's a good chance that might save us time'. I'd say
> it is comparable to the fast tree descent optimization that we have
> for endpoint queries, and comparable to this patch's scankey
> optimization, but across AM-level rescans instead of internal rescans.

Yeah, I see what you mean. Seems related, even though what you've
shown in your prototype patch doesn't seem like it fits into my
taxonomy very neatly.

(BTW, I was a little confused by the use of the term "endpoint" at
first, since there is a function that uses that term to refer to a
descent of the tree that happens without any insertion scan key. This
path is used whenever the best we can do in _bt_first is to descend to
the rightmost or leftmost page.)

> The basic design of that patch is this: We keep track of how many
> times we've rescanned, and the end location of the index scan. If a
> new index scan hits the same page after _bt_search as the previous
> scan ended, we register that.

I can see one advantage that block nested loop join would retain here:
it does block-based accesses on both sides of the join. Since it
"looks ahead" on both sides of the join, more repeat accesses are
likely to be avoided.

Not too sure how much that matters in practice, though.

--
Peter Geoghegan



pgsql-hackers by date:

Previous
From: Dave Cramer
Date:
Subject: Re: Request for comment on setting binary format output per session
Next
From: Alena Rybakina
Date:
Subject: Re: POC, WIP: OR-clause support for indexes