I will continue to put pressure on the vendor (Stitch Data, if anyone knows folks there) to address the issue on their end with the query being issued.
Best,
Ben Chrobot
On Tue, Feb 14, 2023 at 11:11 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
David Rowley <dgrowleyml@gmail.com> writes: > On Wed, 15 Feb 2023 at 11:39, Peter J. Holzer <hjp-pgsql@hjp.at> wrote: >> OTOH it could also be argued that the optimizer should be able to >> perform the same simplifications as I did above and produce the same >> code for WHERE (("id" > ? OR "id" IS NULL)) AND (("id" <= ?)) >> as for WHERE (("id" > ?)) AND (("id" <= ?)).
> You're right, and it has been brought up quite a few times in the > past. To make it work, it's a fairly trivial change. We'd just need > to record all the attnotnull columns during something like > get_relation_info() then when adding baserestrictinfos to the base > relations, we could look to see if the qual is a NullTest and skip > that if we deem the qual as constantly true.
There's an order-of-operations issue that makes this more painful than you might think at first. In the above example, the NullTest node *isn't* going to be a top-level restrictinfo: it's buried inside an OR. Really, the only reasonable place to suppress such a NullTest is during eval_const_expressions, which already has the logic that would get rid of the now-unnecessary OR above it. And that's problematic because it's done way ahead of where we know any relation-specific information. (Since eval_const_expressions happens ahead of join removal, for $good_reasons, moving the plancat.c fetching to someplace earlier than that wouldn't be cost-free either.)
> The problem with that is that doing that has an above zero cost and > since it likely only applies to nearly zero real-world cases, it just > does not seem like useful cycles to add to the planner.
Yeah, this. In the end there is a low threshold on expensive stuff that we're willing to do to clean up after brain-dead ORMs, because the costs of that will also be paid by not-so-brain-dead applications. In the example at hand, it's hard to argue that the query generator sending this query shouldn't know better, since as Peter points out the IS NULL check is redundant on its face, primary key or not.