Re: Use of additional index columns in rows filtering - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: Use of additional index columns in rows filtering
Date
Msg-id CAH2-WzkP7eWfnrk4P4kr6cy1e4MhNXpEuc8YCpBHBa+3FW9cZQ@mail.gmail.com
Whole thread Raw
In response to Re: Use of additional index columns in rows filtering  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
List pgsql-hackers
On Fri, Aug 4, 2023 at 4:47 AM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
> Well, I presented multiple options, so "yes" doesn't really clarify
> which of them applies. But my understanding is you meant the index pages
> accesses.

Sorry. Your understanding of what I must have meant before was correct
-- your patch picked that plan because it reduced the number of index
page accesses significantly. Just like my SAOP patch would have.

> > In short, the dominant cost here is index page accesses. It's a
> > particularly good case for my SAOP patch!
> >
>
> Understood. It makes sense considering the SAOP patch is all about
> optimizing the index walk / processing fewer pages.

Actually, some of the most compelling cases for my SAOP patch are
those involving heap page access savings, which come from the planner
changes. Basically, the nbtree/executor changes make certain index
access patterns much more efficient. Which is useful in itself, but
often much more useful as an indirect enabler of avoiding heap page
accesses by altering other related aspects of a plan in the planner.
Sometimes by replacing table filter quals with index quals (the nbtree
changes make nbtree a strictly better place for the quals). You know,
somewhat like your patch.

That's really why I'm so interested in your patch, and its
relationship with my own patch, and the BitmapOr issue. If your patch
enables some tricks that are really quite similar to the tricks that
my own patch enables, then delineating which patch does which exact
trick when is surely important for both patches.

I actually started out just thinking about index page accesses, before
eventually coming to understand that heap page accesses were also very
relevant. Whereas you started out just thinking about heap page
accesses, and now see some impact from saving on index page accesses.

> Thanks. I think I now see the relationship between the plan with my
> patch and your SAOP patch. It's effectively very similar, except that
> the responsibilities are split a bit differently. With my patch the OR
> clause happens outside AM, while the SAOP patch would do that in the AM
> and also use that to walk the index more efficiently.

Right. That's where my idea of structuring things so that there is
only one best choice really comes from.

> I agree there certainly are cases where the estimates will be off. This
> is not that different from correlated columns, in fact it's exactly the
> same issue, I think.

In one way I think you're right that it's the same issue -- if you
just focus on that one executor node, then it's the same issue. But I
don't think it's the same issue in a deeper sense, since this is one
case where you simply don't have to accept any risk. We really should
be able to just not ever do this, for a limited though important
subset of cases involving ORs + indexable operators.

--
Peter Geoghegan



pgsql-hackers by date:

Previous
From: Andrew Dunstan
Date:
Subject: Release 17 of the PostgreSQL Buildfarm Client
Next
From: Alvaro Herrera
Date:
Subject: Re: cataloguing NOT NULL constraints