Re: allowing extensions to control planner behavior - Mailing list pgsql-hackers

From Andrei Lepikhov
Subject Re: allowing extensions to control planner behavior
Msg-id deb87eba-d4e2-40ad-84e9-219a25516b2d@gmail.com
In response to Re: allowing extensions to control planner behavior  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: allowing extensions to control planner behavior
List pgsql-hackers
On 10/23/24 15:05, Robert Haas wrote:
> On Sat, Oct 19, 2024 at 6:00 AM Andrei Lepikhov <lepihov@gmail.com> wrote:
>> Generally, a hash value doesn't 100% guarantee the uniqueness of a node
>> identification. Also, RelOptInfo corresponds to a subtree in the final
>> plan, and sometimes, it takes work to find which node in the partially
>> executed plan corresponds to this specific estimation on row number
>> during selectivity estimation. Remember parameterised paths - you should
>> attach some signature for each path. So, it is not fully strict method.
>> If you are interested, I can perhaps explain the method a little bit
>> more at some meetup.
> 
> Yeah, I agree that this is not the best method. While it's true that
> you could get a false match in case of a hash value collision, IMHO
> the bigger problem is that it seems like an expensive way of
> determining something that we really should know already. If the user
> types the same query, mentioning the same relations, in the same
> order, with the same constructs around them, it's hard to believe that
> hashing is the cheapest way of matching up the old and new ones. I'm
> not sure exactly what we should do instead, but it feels like we more
> or less have this information during parsing and then we lose track of
> it as the query goes through the rewrite and planning phases.
A single parse tree can be implemented by many different execution plans.
Even clauses can be transformed during optimisation (remember OR -> ANY).
Also, the cardinality of a join in the middle of the tree depends on its
inner and outer subtrees. Because of that, computing a hash over a
RelOptInfo's relids and restrictions, plus the hashes of its child
RelOptInfos, and carrying it through all later stages up to the end of
execution is the most stable approach I know.
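
To make the idea concrete, here is a rough sketch of such a bottom-up
signature. It is only an illustration in Python, not planner code: RelNode
is a hypothetical stand-in for RelOptInfo, clauses are assumed to be already
normalised into strings, and SHA-256 is an arbitrary choice of hash. I also
assume that child signatures are sorted before being mixed in, so that
swapping inner and outer inputs does not change the result.

```python
import hashlib
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class RelNode:
    """Hypothetical stand-in for a planner RelOptInfo (not a PostgreSQL struct)."""
    relids: FrozenSet[int]                # base relation indexes covered by this node
    restrictions: Tuple[str, ...] = ()    # normalised clause texts applied at this node
    children: Tuple["RelNode", ...] = ()  # input subtrees; empty for a base relation


def signature(node: RelNode) -> str:
    """Hash relids and restrictions together with the signatures of the children."""
    h = hashlib.sha256()
    h.update(b"relids:" + ",".join(map(str, sorted(node.relids))).encode())
    # Sort clauses so that their textual order in the query does not matter.
    for clause in sorted(node.restrictions):
        h.update(b"|clause:" + clause.encode())
    # Assumption: child order is ignored, so (a JOIN b) and (b JOIN a) match.
    for child_sig in sorted(signature(c) for c in node.children):
        h.update(b"|child:" + child_sig.encode())
    return h.hexdigest()


# Two base relations, one of them with and without a filter.
a = RelNode(frozenset({1}), ("a.x = 1",))
b = RelNode(frozenset({2}))
b_filtered = RelNode(frozenset({2}), ("b.y > 10",))

join_ab = RelNode(frozenset({1, 2}), ("a.id = b.id",), (a, b))
join_ba = RelNode(frozenset({1, 2}), ("a.id = b.id",), (b, a))
join_ab_filtered = RelNode(frozenset({1, 2}), ("a.id = b.id",), (a, b_filtered))
```

In this sketch join_ab and join_ba get the same signature, while
join_ab_filtered gets a different one even though its own relids and join
clause are identical, because the filter on the child changes the child's
hash. That is the property I care about: the signature of a join reflects
what happens in the subtrees below it.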

-- 
regards, Andrei Lepikhov



