Re: [PATCH] Teach planner to further optimize sort in distinct - Mailing list pgsql-hackers

From Ankit Kumar Pandey
Subject Re: [PATCH] Teach planner to further optimize sort in distinct
Date
Msg-id ca9cb365-0d6c-ae89-1e7b-76880c46a124@gmail.com
Whole thread Raw
In response to Re: [PATCH] Teach planner to further optimize sort in distinct  (David Rowley <dgrowleyml@gmail.com>)
Responses Re: [PATCH] Teach planner to further optimize sort in distinct
List pgsql-hackers
> On 19/01/23 18:49, David Rowley wrote:

> I think you should write a function like:

> bool pathkeys_count_contained_in_unordered(List *keys1, List *keys2,
> List **reorderedkeys, int *n_common)

> which works very similarly to pathkeys> _count_contained_in, but
> populates *reorderedkeys so it contains all of the keys in keys1, but
> put the matching ones in the same order as they are in keys2.  If you
> find a keys2 that does not exist in keys1 then just add the additional
> unmatched keys1 keys to *reorderedkeys.  Set *n_common to the number
> of common keys excluding any that come after a key2 key that does not
> exist as a key1 key.

> You can just switch to using that function in
> create_final_distinct_paths(). You'll need to consider if the query is
> a DISTINCT ON query and not try the unordered version of the function
> in that case.

Tried this, it worked as expected. Tests are green as well.

> I also just noticed that in build_index_paths() we'll leave the index
> path's pathkeys empty if we deem the pathkeys as useless.  I'm not
> sure what the repercussions of setting those to the return value of
> build_index_pathkeys() if useful_pathkeys is otherwise empty.

This is very rigid indeed.

> It's possible that truncate_useless_pathkeys() needs to be modified to
> check if the pathkeys might be useful for DISTINCT 

We have pathkeys_useful_for_merging and pathkeys_useful_for_ordering.

Can we not have pathkeys_useful_for_distinct?

Also, pathkeys_useful_for_ordering calls pathkeys_count_contained_in.

We can add code path on similar lines.

> When I
> thought about it I assumed that we always set IndexPath's pathkeys to
> whatever (if any) sort order that the index provides.

Can we not added original path keys in IndexPath? It could be useful

at other places as well. Atleast, I can see it useful in sorting cases.

> makes me wonder if this entire patch is a good idea. 

We are still getting some benefit even without index paths for now.


I have attached patch with pathkeys_count_contained_in_unordered

and corresponding changes in create_final_distinct_paths for reference.


Thanks,

Ankit

Attachment

pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: almost-super-user problems that we haven't fixed yet
Next
From: Andrew Dunstan
Date:
Subject: Re: Extracting cross-version-upgrade knowledge from buildfarm client