Re: BUG #18643: EXPLAIN estimated rows mismatch - Mailing list pgsql-bugs

From Andrei Lepikhov
Subject Re: BUG #18643: EXPLAIN estimated rows mismatch
Date
Msg-id 5abff03e-5eb5-482e-bbd5-809451ff24cb@gmail.com
Whole thread Raw
In response to Re: BUG #18643: EXPLAIN estimated rows mismatch  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
On 10/3/24 11:40, Tom Lane wrote:
> Andrei Lepikhov <lepihov@gmail.com> writes:
>> So, we already pass through the OR clauses. Why not to check semi-equal
>> clauses and remove duplicates even if not all clauses are such
>> duplicates? At least, it continually raises users' questions.
> 
> It's difficult to justify spending extra planner cycles to optimize
> what are fundamentally stupidly-written queries.  Who writes "X=X"
> in the first place?  (Other than ORM authors who need to spend some
> time in a re-education camp.)
I don't oppose your arguments at first. I want to say that ORMs and any 
APIs (I sometimes see data analysts who use an AI to request a database) 
are already essential to development (at least in startups). It may not 
be suitable to deal with such cases in the core, but it is too costly to 
do it in an extension right now. In this sense 'silicon' users aren't 
equal to more smart carbon-made ones. And here I see two options:
1. canonicalize_qual_hook - the most blunt approach. Of course, we will 
need more hooks in the near future with this approach.
2. An 'ORM' GUC to enable multiple optimisations in the core, 
'smoothing' users' mistakes.

It would be great to discuss other options.

>  And it would not be a trivial number
> of extra cycles, either.  As pointed out in the commit message
> David mentioned, it's basically free to make this improvement
> when we're looking at a potential EquivalenceClass clause.
> We've already paid the cost of checking that the operator is a btree
> equality operator, and we know that the clause is at top level of
> WHERE (else we couldn't fuzz over the difference between false and
> null results), and besides we have to check whether it's "X=X" because
> not doing so causes some semantic problems for the EquivalenceClass
> machinery.  In the case of a random clause-underneath-OR, we would
> have to make a brand new check whether it's btree equality, and we
> would have to somehow track whether we had descended to an expression
> level where "false and null are known equivalent" is no longer true.
> So I really doubt that a case can be made that that is worth doing.
Thanks, this explanation is quite valuable for me.

-- 
regards, Andrei Lepikhov




pgsql-bugs by date:

Previous
From: Tom Lane
Date:
Subject: Re: BUG #18643: EXPLAIN estimated rows mismatch
Next
From: Abhishek Mittal
Date:
Subject: Re: BUG #18624: Memory Leak Issue with PostgreSQL Connection During COPY Command Execution.