Home > mailing lists

Re: Use of additional index columns in rows filtering - Mailing list pgsql-hackers

From	Peter Geoghegan
Subject	Re: Use of additional index columns in rows filtering
Date	August 4, 2023 17:01:29
Msg-id	CAH2-WzkP7eWfnrk4P4kr6cy1e4MhNXpEuc8YCpBHBa+3FW9cZQ@mail.gmail.com Whole thread Raw
In response to	Re: Use of additional index columns in rows filtering (Tomas Vondra <tomas.vondra@enterprisedb.com>)
List	pgsql-hackers

Tree view

On Fri, Aug 4, 2023 at 4:47 AM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
> Well, I presented multiple options, so "yes" doesn't really clarify
> which of them applies. But my understanding is you meant the index pages
> accesses.

Sorry. Your understanding of what I must have meant before was correct
-- your patch picked that plan because it reduced the number of index
page accesses significantly. Just like my SAOP patch would have.

> > In short, the dominant cost here is index page accesses. It's a
> > particularly good case for my SAOP patch!
> >
>
> Understood. It makes sense considering the SAOP patch is all about
> optimizing the index walk / processing fewer pages.

Actually, some of the most compelling cases for my SAOP patch are
those involving heap page access savings, which come from the planner
changes. Basically, the nbtree/executor changes make certain index
access patterns much more efficient. Which is useful in itself, but
often much more useful as an indirect enabler of avoiding heap page
accesses by altering other related aspects of a plan in the planner.
Sometimes by replacing table filter quals with index quals (the nbtree
changes make nbtree a strictly better place for the quals). You know,
somewhat like your patch.

That's really why I'm so interested in your patch, and its
relationship with my own patch, and the BitmapOr issue. If your patch
enables some tricks that are really quite similar to the tricks that
my own patch enables, then delineating which patch does which exact
trick when is surely important for both patches.

I actually started out just thinking about index page accesses, before
eventually coming to understand that heap page accesses were also very
relevant. Whereas you started out just thinking about heap page
accesses, and now see some impact from saving on index page accesses.

> Thanks. I think I now see the relationship between the plan with my
> patch and your SAOP patch. It's effectively very similar, except that
> the responsibilities are split a bit differently. With my patch the OR
> clause happens outside AM, while the SAOP patch would do that in the AM
> and also use that to walk the index more efficiently.

Right. That's where my idea of structuring things so that there is
only one best choice really comes from.

> I agree there certainly are cases where the estimates will be off. This
> is not that different from correlated columns, in fact it's exactly the
> same issue, I think.

In one way I think you're right that it's the same issue -- if you
just focus on that one executor node, then it's the same issue. But I
don't think it's the same issue in a deeper sense, since this is one
case where you simply don't have to accept any risk. We really should
be able to just not ever do this, for a limited though important
subset of cases involving ORs + indexable operators.

--
Peter Geoghegan

pgsql-hackers by date:

From: Andrew Dunstan
Date: 04 August 2023, 15:23:11
Subject: Release 17 of the PostgreSQL Buildfarm Client

From: Alvaro Herrera
Date: 04 August 2023, 18:10:42
Subject: Re: cataloguing NOT NULL constraints

Re: Use of additional index columns in rows filtering - Mailing list pgsql-hackers

Previous

Next