On 12/9/2024 03:05, David Rowley wrote:
> On Thu, 27 Jun 2024 at 03:00, Andrei Lepikhov <lepihov@gmail.com> wrote:
>> I tried to invent a simple solution to fight this minor case. But the
>> most clear and straightforward way here is to save a reference to the
>> expression that triggered the PathKey creation inside the PathKey itself.
>> See the sketch of the patch in the attachment.
>> I'm not sure this instability is worth fixing this way, but the
>> dependence of the optimisation outcome on the query text looks buggy.
>
> I don't think that's going to work as that'll mean it'll just choose
> whichever expression was used when the PathKey was first created. For
> your example query, both PathKey's are first created for the GROUP BY
> clause in standard_qp_callback(). I only have to change the GROUP BY
> in your query to use the equivalent column in the other table to get
> it to revert back to the plan you complained about.
Yes, it is true. It is not ideal solution so far - looking for better ideas.
> Maybe doing something with estimate_num_groups() to find the
> EquivalenceClass member with the least distinct values would be
> better. I just can't think how that could be done in a performant way.
Initial problem causes wrong cost_sort estimation. Right now I think
about providing cost_sort() the sort clauses instead of (or in addition
to) the pathkeys.
--
regards, Andrei Lepikhov