Re: Use of additional index columns in rows filtering - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: Use of additional index columns in rows filtering |
Date | |
Msg-id | cb3b1456-f481-44c2-0b82-94298f69a2bf@enterprisedb.com Whole thread Raw |
In response to | Re: Use of additional index columns in rows filtering (Peter Geoghegan <pg@bowt.ie>) |
Responses |
Re: Use of additional index columns in rows filtering
Re: Use of additional index columns in rows filtering |
List | pgsql-hackers |
On 8/7/23 02:38, Peter Geoghegan wrote: > On Sun, Aug 6, 2023 at 3:28 PM Peter Geoghegan <pg@bowt.ie> wrote: >> I decided to verify my understanding by checking what would happen >> when I ran the OR-heavy tenk1 regression test query against a >> combination of your patch, and v7 of the OR-to-SAOP transformation >> patch. (To be clear, this is without my patch.) > > I also spotted what looks like it might be a problem with your patch > when looking at this query (hard to be sure if it's truly a bug, > though). > > I manually SAOP-ify the OR-heavy tenk1 regression test query like so: > > select > * > from > tenk1 > where > thousand = 42 > and tenthous in (1, 3, 42); > > Sure enough, I continue to get 7 buffer hits with this query. Just > like with the BitmapOr plan (and exactly like the original query with > the OR-to-SAOP transformation patch in place). > > As I continue to add SAOP constants to the original "tenthous" IN(), > eventually the planner switches over to not using index quals on the > "tenthous" low order index column (they're only used on the high order > "thousand" index column). Here's where the switch to only using the > leading column from the index happens for me: > > select > * > from > tenk1 > where > thousand = 42 > and > tenthous in (1, 3, 42, 43, 44, 45, 46, 47, 48, 49, 50); > > This plan switchover isn't surprising in itself -- it's one of the most > important issues addressed by my SAOP patch. However, it *is* a little > surprising that your patch doesn't even manage to use "Index Filter" > quals. It appears that it is only capable of using table filter quals. > Obviously, the index has all the information that expression > evaluation needs, and yet I see "Filter: (tenk1.tenthous = ANY > ('{1,3,42,43,44,45,46,47,48,49,50}'::integer[]))". So no improvement > over master here. > > Interestingly enough, your patch only has this problem with SAOPs, at > least that I know of -- the spelling/style matters. If I add many > additional "tenthous" constants to the original version of the query > from the regression tests in the same way, but using the "longform" > (tenthous = 1 or tenthous = 3 ...) spelling, then your patch does > indeed use index filters/expression evaluation. Just like the original > "risky" plan (it's just a much bigger expression, with many more ORs). > Right. This happens because the matching of SAOP to indexes happens in multiple places. Firstly, create_index_paths() matches the clauses to the index by calling match_restriction_clauses_to_index -> match_clauses_to_index -> match_clause_to_index Which is where we also decide which *unmatched* clauses can be filters. And this *does* match the SAOP to the index key, hence no index filter. But then we call get_index_paths/build_index_path a little bit later, and that decides to skip "lower SAOP" (which seems a bit strange, because the column is "after" the equality, but meh). Anyway, at this point we already decided what's a filter, ignoring the index clauses, and not expecting any backsies. The simples fix seems to be to add these skipped SAOP clauses as filters. We know it can be evaluated on the index ... regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: