Re: BUG #16624: Query Optimizer - Performance bug related to predicate simplification - Mailing list pgsql-bugs

From David Rowley
Subject Re: BUG #16624: Query Optimizer - Performance bug related to predicate simplification
Date
Msg-id CAApHDvp=wzAOq5e6NXeXRuBfnugsXk=NTKiGuWhKb0xf1hiObA@mail.gmail.com
Whole thread Raw
In response to Re: BUG #16624: Query Optimizer - Performance bug related to predicate simplification  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
On Sat, 19 Sep 2020 at 09:22, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> PG Bug reporting form <noreply@postgresql.org> writes:
> > We are developing a tool for automatically finding performance bugs in
> > PostgreSQL. Our key insight is that given a pair of semantic equivalent
> > queries, a robust DBMS should return the same result within a similar
> > execution time. Significant time difference suggests a potential performance
> > bug in the DBMS.
>
> Unfortunately, that's a viewpoint that is not going to win you a lot of
> converts.
>
> > First query:
> > SELECT "ps_suppkey"
> > FROM   "partsupp"
> > WHERE  "ps_partkey" = 1486;
>
> > Second query:
> > SELECT "ps_suppkey"
> > FROM   "partsupp"
> > WHERE  "ps_partkey" + 1486 = 2972;
>
> Sure, in theory the optimizer could rearrange that to
> "WHERE ps_partkey = 2972 - 1486" and thus still have an indexable
> condition.  In practice, we have essentially no interest in doing so.
> It would require vastly more semantic knowledge than the optimizer
> has got, and looking for such cases would consume lots of planner
> cycles that would be better spent elsewhere.
>
> If you want to complain that that makes Postgres' optimizer "not
> robust", you're entitled to that opinion.  But it's very unlikely
> to change in the foreseeable future.  We expect that if a user
> cares about the performance of a particular query, they'll be
> willing to write it in a form that the optimizer can deal with.

I'd go a bit further and say that the queries are not even equivalent.

The reason for that is that "ps_partkey" + 1486 = 2972 could cause an
arithmetic overflow ERROR whereas "ps_partkey" = 1486 couldn't.
Perhaps nobody would complain if we didn't throw an ERROR where we
otherwise might have. However, when you consider that you must do 2972
- 1486 to obtain the new constant to compare to the index column, you
could start getting the overflows there instead.  In that case, you
could cause overflows in cases that would have worked perfectly fine
without applying the optimisation.  So it seems to me that the author
the query could have written it that way exactly to eliminate the
chances of such overflows. They might not be happy if we rewrite their
query to make overflows possible again.

David



pgsql-bugs by date:

Previous
From: David Rowley
Date:
Subject: Re: BUG #16625: Query Optimizer - Performance bug related to removal of unnecessary aggregate function
Next
From: Tom Lane
Date:
Subject: Re: BUG #16622: pg_dump produces erroneus ALTER TABLE statement for a table with an inherited generated column