Re: Use of additional index columns in rows filtering - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Use of additional index columns in rows filtering
Date
Msg-id cb3b1456-f481-44c2-0b82-94298f69a2bf@enterprisedb.com
Whole thread Raw
In response to Re: Use of additional index columns in rows filtering  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: Use of additional index columns in rows filtering
Re: Use of additional index columns in rows filtering
List pgsql-hackers

On 8/7/23 02:38, Peter Geoghegan wrote:
> On Sun, Aug 6, 2023 at 3:28 PM Peter Geoghegan <pg@bowt.ie> wrote:
>> I decided to verify my understanding by checking what would happen
>> when I ran the OR-heavy tenk1 regression test query against a
>> combination of your patch, and v7 of the OR-to-SAOP transformation
>> patch. (To be clear, this is without my patch.)
> 
> I also spotted what looks like it might be a problem with your patch
> when looking at this query (hard to be sure if it's truly a bug,
> though).
> 
> I manually SAOP-ify the OR-heavy tenk1 regression test query like so:
> 
> select
>   *
> from
>   tenk1
> where
>   thousand = 42
>   and tenthous in (1, 3, 42);
> 
> Sure enough, I continue to get 7 buffer hits with this query. Just
> like with the BitmapOr plan (and exactly like the original query with
> the OR-to-SAOP transformation patch in place).
> 
> As I continue to add SAOP constants to the original "tenthous" IN(),
> eventually the planner switches over to not using index quals on the
> "tenthous" low order index column (they're only used on the high order
> "thousand" index column). Here's where the switch to only using the
> leading column from the index happens for me:
> 
> select
>   *
> from
>   tenk1
> where
>   thousand = 42
>   and
>   tenthous in (1, 3, 42, 43, 44, 45, 46, 47, 48, 49, 50);
> 
> This plan switchover isn't surprising in itself -- it's one of the most
> important issues addressed by my SAOP patch. However, it *is* a little
> surprising that your patch doesn't even manage to use "Index Filter"
> quals. It appears that it is only capable of using table filter quals.
> Obviously, the index has all the information that expression
> evaluation needs, and yet I see "Filter: (tenk1.tenthous = ANY
> ('{1,3,42,43,44,45,46,47,48,49,50}'::integer[]))". So no improvement
> over master here.
> 
> Interestingly enough, your patch only has this problem with SAOPs, at
> least that I know of -- the spelling/style matters. If I add many
> additional "tenthous" constants to the original version of the query
> from the regression tests in the same way, but using the "longform"
> (tenthous = 1 or tenthous = 3 ...) spelling, then your patch does
> indeed use index filters/expression evaluation. Just like the original
> "risky" plan (it's just a much bigger expression, with many more ORs).
> 

Right. This happens because the matching of SAOP to indexes happens in
multiple places. Firstly, create_index_paths() matches the clauses to
the index by calling

    match_restriction_clauses_to_index
     -> match_clauses_to_index
         -> match_clause_to_index

Which is where we also decide which *unmatched* clauses can be filters.
And this *does* match the SAOP to the index key, hence no index filter.

But then we call get_index_paths/build_index_path a little bit later,
and that decides to skip "lower SAOP" (which seems a bit strange,
because the column is "after" the equality, but meh). Anyway, at this
point we already decided what's a filter, ignoring the index clauses,
and not expecting any backsies.

The simples fix seems to be to add these skipped SAOP clauses as
filters. We know it can be evaluated on the index ...


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Matt Smiley
Date:
Subject: Re: Configurable FP_LOCK_SLOTS_PER_BACKEND
Next
From: Tomas Vondra
Date:
Subject: Re: Configurable FP_LOCK_SLOTS_PER_BACKEND