Re: POC, WIP: OR-clause support for indexes - Mailing list pgsql-hackers

From Robert Haas
Subject Re: POC, WIP: OR-clause support for indexes
Date
Msg-id CA+TgmoaA+OXewcifo7ypr1szb_JP6m-qJp+K8iSenOPi9mtrbg@mail.gmail.com
Whole thread Raw
In response to Re: POC, WIP: OR-clause support for indexes  (Alena Rybakina <a.rybakina@postgrespro.ru>)
Responses Re: POC, WIP: OR-clause support for indexes
Re: POC, WIP: OR-clause support for indexes
List pgsql-hackers
On Thu, Oct 26, 2023 at 3:47 PM Alena Rybakina
<a.rybakina@postgrespro.ru> wrote:
> With small amounts of "OR" elements, the cost of orexpr is lower than with "ANY", on the contrary, higher.

Alexander's example seems to show that it's not that simple. If I'm
reading his example correctly, with things like aid = 1, the
transformation usually wins even if the number of things in the OR
expression is large, but with things like aid + 1 * bid = 1, the
transformation seems to lose at least with larger numbers of items. So
it's not JUST the number of OR elements but also what they contain,
unless I'm misunderstanding his point.

> Index Scan using pg_class_oid_index on pg_class  (cost=0.27..2859.42 rows=414 width=68) (actual time=1.504..34.183
rows=260loops=1) 
>    Index Cond: (oid = ANY (ARRAY['1'::oid, '2'::oid, '3'::oid, '4'::oid, '5'::oid, '6'::oid, '7'::oid,
>
> Bitmap Heap Scan on pg_class  (cost=43835.00..54202.14 rows=414 width=68) (actual time=39.958..41.293 rows=260
loops=1)
>    Recheck Cond: ((oid = '1'::oid) OR (oid = '2'::oid) OR (oid = '3'::oid) OR (oid = '4'::oid) OR (oid =
>
> I think we could see which value is lower, and if lower with expressions converted to ANY, then work with it further,
otherwisework with the original "OR" expressions. But we still need to make this conversion to find out its cost. 

To me, this sort of output suggests that perhaps the transformation is
being done in the wrong place. I expect that we have to decide whether
to convert from OR to = ANY(...) at a very early stage of the planner,
before we have any idea what the selected path will ultimately be. But
this output suggests that we want the answer to depend on which kind
of path is going to be faster, which would seem to argue for doing
this sort of transformation as part of path generation for only those
paths that will benefit from it, rather than during earlier phases of
expression processing/simplification.

I'm not sure I have the full picture here, though, so I might have
this all wrong.

--
Robert Haas
EDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: visibility of open cursors in pg_stat_activity
Next
From: Nathan Bossart
Date:
Subject: Re: CRC32C Parallel Computation Optimization on ARM