Re: Use of additional index columns in rows filtering - Mailing list pgsql-hackers
From | Peter Geoghegan |
---|---|
Subject | Re: Use of additional index columns in rows filtering |
Date | |
Msg-id | CAH2-Wzkh9udhHPwTK_XJvrcWwNp50FeZ_-1sr2UYCGAoH9n0Jw@mail.gmail.com Whole thread Raw |
In response to | Re: Use of additional index columns in rows filtering (Peter Geoghegan <pg@bowt.ie>) |
Responses |
Re: Use of additional index columns in rows filtering
|
List | pgsql-hackers |
On Sun, Aug 6, 2023 at 1:13 PM Peter Geoghegan <pg@bowt.ie> wrote: > Since you're not relying on the nbtree work at all here, really (just > on the transformation process itself), the strategic risk that this > adds to your project isn't too great. It's not like this ties the > success of your patch to the success of my own patch. At most it ties > the success of your patch to something like Alena Rybakina's > OR-to-SAOP transformation patch, which seems manageable to me. (To be > clear, I'm not relying on that work in the same way myself -- for my > patch the transformation process is just a nice-to-have.) I decided to verify my understanding by checking what would happen when I ran the OR-heavy tenk1 regression test query against a combination of your patch, and v7 of the OR-to-SAOP transformation patch. (To be clear, this is without my patch.) I found that the problem that I saw with the OR-heavy tenk1 regression test goes away (though only when I "set or_transform_limit=0"). That is, we'll get an index scan plan that uses a SAOP. This index scan plan is comparable to the master branch's BitmapOr scan. In particular, both plans get 7 buffer hits. More importantly, the new plan is (like the master branch plan) not risky in the way I've been going on about. This does mean that your patch gets a *slightly* slower plan, due to the issue of added index page accesses. Improving that should be a job for my patch -- it's not your problem, since there is no regression. I'm not sure if it's somehow still possible that SAOP expression evaluation is able to "do the risky thing" in the same way that your patch's "Index Filter: ((tenk1.tenthous = 1) OR (tenk1.tenthous = 3) OR (tenk1.tenthous = 42))" plan. But it certainly looks like it can't. Increasingly, the problem here appears to me to be a problem of lacking useful CNF transformations/normalization -- nothing more. Structuring things so that we reliably use "the native representation of ORs" via normalization seems likely to be all you really need. We may currently be over relying on a similar process that happens indirectly, via BitmapOr paths. I suspect that you're right to be concerned about how this might already be affecting index-only scans. Once we have CNF normalization/transformation in place, we will of course continue to use some BitmapOr plans that may look a little like the ones I'm focussed on, and some plans that use "index filters" (due to your patch) that are also a little like that. But there's nothing objectionable about those cases IMV (quite the opposite), since there is no question of displacing/out-competing very similar plans that can use index quals. (You might also find a way to avoid ever requiring heap access/visibility checks for a subset of the "index filter" cases where it is determined to be safe up front, but that's just a bonus.) -- Peter Geoghegan
pgsql-hackers by date: