Re: [PATCH] Teach planner to further optimize sort in distinct - Mailing list pgsql-hackers

From David Rowley
Subject Re: [PATCH] Teach planner to further optimize sort in distinct
Date
Msg-id CAApHDvqf1HFTfdGQL5Hu61H2j+7S-DRwsw=3orcsiiW90eeuPQ@mail.gmail.com
Whole thread Raw
In response to [PATCH] Teach planner to further optimize sort in distinct  (Ankit Kumar Pandey <itsankitkp@gmail.com>)
Responses Re: [PATCH] Teach planner to further optimize sort in distinct
List pgsql-hackers
On Wed, 18 Jan 2023 at 08:27, Ankit Kumar Pandey <itsankitkp@gmail.com> wrote:
> There is bit confusion in wording here:
>
> "returns a List of pathkeys
> which are in keys1 but not in keys2 and NIL if keys2 has a pathkey
> that does not exist as a pathkey in keys1."
>
> You mean extract common keys without ordering right?

I think you should write a function like:

bool pathkeys_count_contained_in_unordered(List *keys1, List *keys2,
List **reorderedkeys, int *n_common)

which works very similarly to pathkeys_count_contained_in, but
populates *reorderedkeys so it contains all of the keys in keys1, but
put the matching ones in the same order as they are in keys2.  If you
find a keys2 that does not exist in keys1 then just add the additional
unmatched keys1 keys to *reorderedkeys.  Set *n_common to the number
of common keys excluding any that come after a key2 key that does not
exist as a key1 key.

You can just switch to using that function in
create_final_distinct_paths(). You'll need to consider if the query is
a DISTINCT ON query and not try the unordered version of the function
in that case.

I also just noticed that in build_index_paths() we'll leave the index
path's pathkeys empty if we deem the pathkeys as useless.  I'm not
sure what the repercussions of setting those to the return value of
build_index_pathkeys() if useful_pathkeys is otherwise empty.  It's
possible that truncate_useless_pathkeys() needs to be modified to
check if the pathkeys might be useful for DISTINCT, but now that I see
we don't populate the IndexPath's pathkeys when we deem them not
useful makes me wonder if this entire patch is a good idea. When I
thought about it I assumed that we always set IndexPath's pathkeys to
whatever (if any) sort order that the index provides.

David



pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: Time delayed LR (WAS Re: logical replication restrictions)
Next
From: tushar
Date:
Subject: Re: almost-super-user problems that we haven't fixed yet