Re: BUG #17964: Missed query planner optimization - Mailing list pgsql-bugs

From David Rowley
Subject Re: BUG #17964: Missed query planner optimization
Date
Msg-id CAApHDvqiZSEoJ1fGSkBLTY9cTQJAmjUgUH8PziFP0uRMr-ARLw@mail.gmail.com
Whole thread Raw
In response to Re: BUG #17964: Missed query planner optimization  (Mathias Kunter <mathiaskunter@gmail.com>)
Responses Re: BUG #17964: Missed query planner optimization  (Mathias Kunter <mathiaskunter@gmail.com>)
List pgsql-bugs
On Fri, 16 Jun 2023 at 03:53, Mathias Kunter <mathiaskunter@gmail.com> wrote:
> > SELECT * FROM book WHERE
> > author_id IN (SELECT id FROM author WHERE name = 'some_name') OR
> > publisher_id IN (SELECT id FROM publisher WHERE name = 'some_name');
>
> Complete example here: https://dbfiddle.uk/q6_4NuDX
>
> The issue could be fixed quite easily by implementing a heuristic, the
> optimized query will execute a few THOUSAND times faster, most people
> have no clue that they could use ANY(ARRAY()) as a workaround, and still
> this optimization isn't something worth to be implemented?

There are some cases where converting the IN to a semi-join will
significantly improve performance, and as you've shown, there are some
cases where that optimisation causes performance regressions.  We'd
need a way to determine which is cheaper based on estimated costs and
that might be, in my opinion, fairly tricky to do based on the order
of how things currently are done in the query planner.

In my previous emails about this, I was just trying to show that it's
quite a tricky problem to fix properly.  We're certainly open to
patches which fix some of the problems you've mentioned. If you've
looked into it and think it's quite an easy project, then we'll
welcome contributions to improve this. Having the planner consider the
costs of converting the IN to a semi-join and converting or not based
on cost seems like a worthy goal.  The pgsql-hackers as a good
emailing list to bring up the topic and gather up some ideas on how
you might go about fixing it.

David



pgsql-bugs by date:

Previous
From: Jeff Davis
Date:
Subject: Re: pg_dump assertion failure with "-n pg_catalog"
Next
From: Michael Paquier
Date:
Subject: Re: BUG #17973: Reinit of pgstats entry for dropped DB can break autovacuum daemon