Re: POC, WIP: OR-clause support for indexes - Mailing list pgsql-hackers
From | Alexander Korotkov |
---|---|
Subject | Re: POC, WIP: OR-clause support for indexes |
Date | |
Msg-id | CAPpHfdtGmUHv=E4QWGWHk2QUuJ-x0mTNBssW_Ehv2dcOQ2L66g@mail.gmail.com Whole thread Raw |
In response to | Re: POC, WIP: OR-clause support for indexes (Andrei Lepikhov <a.lepikhov@postgrespro.ru>) |
Responses |
Re: POC, WIP: OR-clause support for indexes
Re: POC, WIP: OR-clause support for indexes |
List | pgsql-hackers |
Hi! On Mon, Apr 1, 2024 at 9:38 AM Andrei Lepikhov <a.lepikhov@postgrespro.ru> wrote: > On 28/3/2024 16:54, Alexander Korotkov wrote: > > The current patch has a boolean guc enable_or_transformation. > > However, when we have just a few ORs to be transformated, then we > > should get less performance gain from the transformation and higher > > chances to lose a good bitmap scan plan from that. When there is a > > huge list of ORs to be transformed, then the performance gain is > > greater and it is less likely we could lose a good bitmap scan plan. > > > > What do you think about introducing a GUC threshold value: the minimum > > size of list to do OR-to-ANY transformation? > > min_list_or_transformation or something. > I labelled it or_transformation_limit (see in attachment). Feel free to > rename it. > It's important to note that the limiting GUC doesn't operate > symmetrically for forward, OR -> SAOP, and backward SAOP -> OR > operations. In the forward case, it functions as you've proposed. > However, in the backward case, we only check whether the feature is > enabled or not. This is due to our existing limitation, > MAX_SAOP_ARRAY_SIZE, and the fact that we can't match the length of the > original OR list with the sizes of the resulting SAOPs. For instance, a > lengthy OR list with 100 elements can be transformed into 3 SAOPs, each > with a size of around 30 elements. > One aspect that requires attention is the potential inefficiency of our > OR -> ANY transformation when we have a number of elements less than > MAX_SAOP_ARRAY_SIZE. This is because we perform a reverse transformation > ANY -> OR at the stage of generating bitmap scans. If the BitmapScan > path dominates, we may have done unnecessary work. Is this an occurrence > that we should address? > But the concern above may just be a point of improvement later: We can > add one more strategy to the optimizer: testing each array element as an > OR clause; we can also provide a BitmapOr path, where SAOP is covered > with a minimal number of partial indexes (likewise, previous version). I've revised the patch. Did some beautification, improvements for documentation, commit messages etc. I've pushed the 0001 patch without 0002. I think 0001 is good by itself given that there is the or_to_any_transform_limit GUC option. The more similar OR clauses are here the more likely grouping them into SOAP will be a win. But I've changed the default value to 5. This will make it less invasive and affect only queries with obvious repeating patterns. That also reduced the changes in the regression tests expected outputs. Regarding 0002, it seems questionable since it could cause a planning slowdown for SAOP's with large arrays. Also, it might reduce the win of transformation made by 0001. So, I think we should skip it for now. ------ Regards, Alexander Korotkov
pgsql-hackers by date: