Re: Use of additional index columns in rows filtering - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: Use of additional index columns in rows filtering
Date
Msg-id CAH2-Wzkz9bGQ8858Z_7JNp0uWGqKq6+BzNDKhEO__P7auR9khg@mail.gmail.com
Whole thread Raw
In response to Re: Use of additional index columns in rows filtering  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: Use of additional index columns in rows filtering
List pgsql-hackers
On Sun, Aug 6, 2023 at 3:28 PM Peter Geoghegan <pg@bowt.ie> wrote:
> I decided to verify my understanding by checking what would happen
> when I ran the OR-heavy tenk1 regression test query against a
> combination of your patch, and v7 of the OR-to-SAOP transformation
> patch. (To be clear, this is without my patch.)

I also spotted what looks like it might be a problem with your patch
when looking at this query (hard to be sure if it's truly a bug,
though).

I manually SAOP-ify the OR-heavy tenk1 regression test query like so:

select
  *
from
  tenk1
where
  thousand = 42
  and tenthous in (1, 3, 42);

Sure enough, I continue to get 7 buffer hits with this query. Just
like with the BitmapOr plan (and exactly like the original query with
the OR-to-SAOP transformation patch in place).

As I continue to add SAOP constants to the original "tenthous" IN(),
eventually the planner switches over to not using index quals on the
"tenthous" low order index column (they're only used on the high order
"thousand" index column). Here's where the switch to only using the
leading column from the index happens for me:

select
  *
from
  tenk1
where
  thousand = 42
  and
  tenthous in (1, 3, 42, 43, 44, 45, 46, 47, 48, 49, 50);

This plan switchover isn't surprising in itself -- it's one of the most
important issues addressed by my SAOP patch. However, it *is* a little
surprising that your patch doesn't even manage to use "Index Filter"
quals. It appears that it is only capable of using table filter quals.
Obviously, the index has all the information that expression
evaluation needs, and yet I see "Filter: (tenk1.tenthous = ANY
('{1,3,42,43,44,45,46,47,48,49,50}'::integer[]))". So no improvement
over master here.

Interestingly enough, your patch only has this problem with SAOPs, at
least that I know of -- the spelling/style matters. If I add many
additional "tenthous" constants to the original version of the query
from the regression tests in the same way, but using the "longform"
(tenthous = 1 or tenthous = 3 ...) spelling, then your patch does
indeed use index filters/expression evaluation. Just like the original
"risky" plan (it's just a much bigger expression, with many more ORs).

--
Peter Geoghegan



pgsql-hackers by date:

Previous
From: Thomas Munro
Date:
Subject: Re: Sync scan & regression tests
Next
From: Tom Lane
Date:
Subject: Re: Sync scan & regression tests