On 30.10.2023 17:06, Alexander Korotkov wrote:
> On Mon, Oct 30, 2023 at 3:40 PM Robert Haas <robertmhaas@gmail.com> wrote:
>> On Thu, Oct 26, 2023 at 5:05 PM Peter Geoghegan <pg@bowt.ie> wrote:
>>> On Thu, Oct 26, 2023 at 12:59 PM Robert Haas <robertmhaas@gmail.com> wrote:
>>>> Alexander's example seems to show that it's not that simple. If I'm
>>>> reading his example correctly, with things like aid = 1, the
>>>> transformation usually wins even if the number of things in the OR
>>>> expression is large, but with things like aid + 1 * bid = 1, the
>>>> transformation seems to lose at least with larger numbers of items. So
>>>> it's not JUST the number of OR elements but also what they contain,
>>>> unless I'm misunderstanding his point.
>>> Alexander said "Generally, I don't see why ANY could be executed
>>> slower than the equivalent OR clause". I understood that this was his
>>> way of expressing the following idea:
>>>
>>> "In principle, there is no reason to expect execution of ANY() to be
>>> slower than execution of an equivalent OR clause (except for
>>> noise-level differences). While it might not actually look that way
>>> for every single type of plan you can imagine right now, that doesn't
>>> argue for making a cost-based decision. It actually argues for fixing
>>> the underlying issue, which can't possibly be due to some kind of
>>> fundamental advantage enjoyed by expression evaluation with ORs".
>>>
>>> This is also what I think of all this.
>> I agree with that, with some caveats, mainly that the reverse is to
>> some extent also true. Maybe not completely, because arguably the
>> ANY() formulation should just be straight-up easier to deal with, but
>> in principle, the two are equivalent and it shouldn't matter which
>> representation we pick.
>>
>> But practically, it may, and we need to be sure that we don't put in
>> place a translation that is theoretically a win but in practice leads
>> to large regressions. Avoiding regressions here is more important than
>> capturing all the possible gains. A patch that wins in some scenarios
>> and does nothing in others can be committed; a patch that wins in even
>> more scenarios but causes serious regressions in some cases probably
>> can't.
> +1
> Sure, I've identified two cases where patch shows regression [1]. The
> first one (quadratic complexity of expression processing) should be
> already addressed by usage of hash. The second one (planning
> regression with Bitmap OR) is not yet addressed.
>
> Links
> 1. https://www.postgresql.org/message-id/CAPpHfduJtO0s9E%3DSHUTzrCD88BH0eik0UNog1_q3XBF2wLmH6g%40mail.gmail.com
>
I also support this approach. I have almost finished a patch that
fixes the first problem, the quadratic complexity of expression
processing, by adding a hash table.
I also added a check: if the number of groups equals the number of
OR expressions, we assume that no expressions need to be converted and
stop processing early.
Now I am trying to fix the last problem in this patch: three tests
point to a problem with incorrect conversion. I don't think it is
serious, but I haven't figured out where the mistake is yet.
The log shows an error like this: ERROR: unrecognized node type: 0.
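
To make the grouping step and the early-exit check described above more concrete, here is a rough sketch. It is a toy Python model, not the patch code: the names group_or_args and transform are made up for illustration, and each OR argument is modeled as an (expression, constant) pair standing for "expression = constant". A dict plays the role of the hash table, so each argument finds its group in one lookup rather than by scanning all earlier groups.

```python
def group_or_args(or_args):
    """Group (expr, const) pairs by expr using a dict as the hash table.

    Returns None when every argument lands in its own group, meaning
    there is nothing to merge and the OR list can be left untouched.
    """
    groups = {}
    for expr, const in or_args:
        # One hash lookup per argument instead of a scan over all groups.
        groups.setdefault(expr, []).append(const)

    # Early exit: as many groups as OR arguments, so nothing to convert.
    if len(groups) == len(or_args):
        return None
    return groups


def transform(or_args):
    """Return the OR arguments as strings, merging groups into ANY()."""
    groups = group_or_args(or_args)
    if groups is None:
        return [f"{expr} = {const}" for expr, const in or_args]

    result = []
    for expr, consts in groups.items():
        if len(consts) == 1:
            # A single-member group stays a plain equality.
            result.append(f"{expr} = {consts[0]}")
        else:
            array = ", ".join(str(c) for c in consts)
            result.append(f"{expr} = ANY(ARRAY[{array}])")
    return result
```

In this model, transform([("aid", 1), ("aid", 2), ("bid", 3)]) merges the two aid clauses and leaves bid alone, while an input whose expressions are all distinct is returned unchanged.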
--
Regards,
Alena Rybakina
Postgres Professional