Re: Reordering DISTINCT keys to match input path's pathkeys - Mailing list pgsql-hackers

From Andrei Lepikhov
Subject Re: Reordering DISTINCT keys to match input path's pathkeys
Date
Msg-id e56a0dbd-c969-4a69-b5c3-b640bc8ebc8a@gmail.com
Whole thread Raw
In response to Reordering DISTINCT keys to match input path's pathkeys  (Richard Guo <guofenglinux@gmail.com>)
Responses Re: Reordering DISTINCT keys to match input path's pathkeys
List pgsql-hackers
On 11/13/24 13:49, Richard Guo wrote:
> On Mon, Oct 28, 2024 at 6:15 PM Andrei Lepikhov <lepihov@gmail.com> wrote:
>> On 6/7/24 16:46, Richard Guo wrote:
>>> This patch does not apply any more, so here is a new rebase, with some
>>> tweaks to the comments.
> 
>> This patch needs a minor rebase again.
>> After skimming the code, I want to say that it looks good. But maybe to
>> avoid one more *_reordering GUC - it would be better to cover all path
>> key reorderings under a single GUC.
> 
> Thanks for reviewing this patch.  After some consideration, I think
> it's not too complex to also apply this optimization to DISTINCT ON.
> The parser already ensures that the DISTINCT ON expressions match the
> initial ORDER BY expressions; we just need to ensure that the
> resulting pathkey list from the reordering matches the original
> distinctClause pathkeys, while leaving the remaining pathkeys
> unchanged in order.  Please see attached.
Thanks, I'll discover it later.
BTW have you ever thought about one more, cost-based reordering strategy?
For now, we can reorder GROUP-BY and distinct clauses according to two 
orderings: 1) ORDER-BY order and 2) order derived from the underlying 
query tree.
In thread [1], I try to add one more strategy that minimises the number 
of comparison operator calls. It seems that it would work the same way 
with the DISTINCT statement. Do you think it make sense in general and 
can be a possible direction of improvement for the current patch?

> 
> I'm not sure about merging these two 'reordering' GUCs into one.
> While they may look similar, they apply to very different scenarios.
> However, I'm open to other suggestions.
Sure, they enable different optimisations. But, they enable highly 
specialised optimisations. Having two GUCs looks too expensive.
Moreover, this stuff is cost-based and should work automatically. So, I 
treat these GUCs as mostly debugging or last-chance stuff used to 
disable features during severe slowdowns or bugs. It might make sense to 
group them into a single 'Clause Reordering' parameter.

[1] 
https://www.postgresql.org/message-id/flat/8742aaa8-9519-4a1f-91bd-364aec65f5cf%40gmail.com

-- 
regards, Andrei Lepikhov



pgsql-hackers by date:

Previous
From: wenhui qiu
Date:
Subject: Re: New GUC autovacuum_max_threshold ?
Next
From: Peter Eisentraut
Date:
Subject: Re: SQL:2011 application time