Re: Query plan for "id IS NULL" on PK - Mailing list pgsql-general

From Ben Chrobot
Subject Re: Query plan for "id IS NULL" on PK
Date
Msg-id CALRaRxxG5wJiSb51yn30hAn3USMaXDTueAAFecCCw_wLANru-g@mail.gmail.com
Whole thread Raw
In response to Re: Query plan for "id IS NULL" on PK  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
Thank you all for your responses!

I will continue to put pressure on the vendor (Stitch Data, if anyone knows folks there) to address the issue on their end with the query being issued.

Best,
Ben Chrobot


On Tue, Feb 14, 2023 at 11:11 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
David Rowley <dgrowleyml@gmail.com> writes:
> On Wed, 15 Feb 2023 at 11:39, Peter J. Holzer <hjp-pgsql@hjp.at> wrote:
>> OTOH it could also be argued that the optimizer should be able to
>> perform the same simplifications as I did above and produce the same
>> code for WHERE (("id" > ? OR "id" IS NULL)) AND (("id" <= ?))
>> as for WHERE (("id" > ?)) AND (("id" <= ?)).

> You're right, and it has been brought up quite a few times in the
> past.  To make it work, it's a fairly trivial change. We'd just need
> to record all the attnotnull columns during something like
> get_relation_info() then when adding baserestrictinfos to the base
> relations, we could look to see if the qual is a NullTest and skip
> that if we deem the qual as constantly true.

There's an order-of-operations issue that makes this more painful
than you might think at first.  In the above example, the NullTest
node *isn't* going to be a top-level restrictinfo: it's buried inside
an OR.  Really, the only reasonable place to suppress such a NullTest
is during eval_const_expressions, which already has the logic that would
get rid of the now-unnecessary OR above it.  And that's problematic
because it's done way ahead of where we know any relation-specific
information.  (Since eval_const_expressions happens ahead of join
removal, for $good_reasons, moving the plancat.c fetching to someplace
earlier than that wouldn't be cost-free either.)

> The problem with that is that doing that has an above zero cost and
> since it likely only applies to nearly zero real-world cases, it just
> does not seem like useful cycles to add to the planner.

Yeah, this.  In the end there is a low threshold on expensive stuff
that we're willing to do to clean up after brain-dead ORMs, because
the costs of that will also be paid by not-so-brain-dead applications.
In the example at hand, it's hard to argue that the query generator
sending this query shouldn't know better, since as Peter points out
the IS NULL check is redundant on its face, primary key or not.

                        regards, tom lane


pgsql-general by date:

Previous
From: Marc Millas
Date:
Subject: Re: Postgres undeterministically uses a bad plan, how to convince it otherwise?
Next
From: Arthur Ramsey
Date:
Subject: Sequential scan faster than index