Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan
Date
Msg-id CAH2-WznprRoyeK1FmxpV6wHZESrjimmvv1J1hTwKuvMbah8LMA@mail.gmail.com
Whole thread Raw
In response to Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan  (Matthias van de Meent <boekewurm+postgres@gmail.com>)
Responses Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan
List pgsql-hackers
On Thu, Jul 27, 2023 at 7:59 AM Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:
> > Basically, the patch that added that feature had to revise the index
> > AM API, in order to support a mode of operation where scans return
> > groupings rather than tuples. Whereas this patch requires none of
> > that. It makes affected index scans as similar as possible to
> > conventional index scans.
>
> Hmm, yes. I see now where my confusion started. You called it out in
> your first paragraph of the original mail, too, but that didn't help
> me then:
>
> The wiki does not distinguish "Index Skip Scans" and "Loose Index
> Scans", but these are not the same.

A lot of people (myself included) were confused on this point for
quite a while. To make matters even more confusing, one of the really
compelling cases for the MDAM design is scans that feed into
GroupAggregates -- preserving index sort order for naturally big index
scans will tend to enable it. One of my examples from the start of
this thread showed just that. (It just so happened that that example
was faster because of all the "skipping" that nbtree *wasn't* doing
with the patch.)

> Yes, that's why I asked: The MDAM paper's examples seem to materialize
> the full predicate up-front, which would require a product of all
> indexed columns' quals in size, so that materialization has a good
> chance to get really, really large. But if we're not doing that
> materialization upfront, then there is no issue with resource
> consumption (except CPU time, which can likely be improved with other
> methods)

I get why you asked. I might have asked the same question.

As I said, the MDAM paper has *surprisingly* little to say about
B-Tree executor stuff -- it's almost all just describing the
preprocessing/transformation process. It seems as if optimizations
like the one from my patch were considered too obvious to talk about
and/or out of scope by the authors. Thinking about the MDAM paper like
that was what made everything fall into place for me. Remember,
"missing key predicates" isn't all that special.

--
Peter Geoghegan



pgsql-hackers by date:

Previous
From: Matthias van de Meent
Date:
Subject: Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan
Next
From: Ashutosh Bapat
Date:
Subject: Reducing memory consumed by RestrictInfo list translations in partitionwise join planning