Re: allowing extensions to control planner behavior - Mailing list pgsql-hackers

From Andrei Lepikhov
Subject Re: allowing extensions to control planner behavior
Msg-id e3065f7b-e940-4659-acfb-2f8da8308eb6@gmail.com
In response to Re: allowing extensions to control planner behavior  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On 26/8/2024 21:44, Robert Haas wrote:
> On Mon, Aug 26, 2024 at 2:00 PM Andrei Lepikhov <lepihov@gmail.com> wrote:
>> My personal most wanted list:
>> - Selectivity list estimation hook
>> - Groups number estimation hook
>> - hooks on memory estimations, involving work_mem
>> - add_path() hook
>> - Hook on final RelOptInfo pathlist
>> - a custom list of nodes in RelOptinfo, PlannerStmt, Plan and Query
>> structures
>> - Extensibility of extended and plain statistics
>> - Hook on portal error processing
>> - Canonicalise expressions hook
> 
> One of my chronic complaints about hooks is that people propose hooks
> that are just in any random spot in the code where they happen to want
> to change something. If we accept that, we end up with a lot of hooks
> where nobody can say how the hook can be used usefully and maybe it
> can't actually be used usefully even by the original author, or only
> them and nobody else. So these kinds of proposals need detailed,
> case-by-case scrutiny. It's unacceptable for the planner to get filled
> up with a bunch of poorly-designed hooks just as it is for any other
> part of the system, but well-designed hooks whose usefulness can
> clearly be seen should be just as welcome here as anywhere else.
Definitely so. Think of that list as a rough sketch for the roadmap.
Right now, I know of only one such hook, the selectivity hook, which we
have already discussed and for which Tomas Vondra's patch is on the table.
But even this one is a big deal, because multi-clause estimation is a huge
pain for users, and extensions currently cannot address it without core
patches.

>> IMO, it is better not to switch on/off algorithms, but allow extensions
>> to change their cost multipliers, modifying costs balance. 10E9 looks
>> like a disable, but multiplier == 10 for a cost node just provide more
>> freedom for hashing strategies.
> 
> That may be a valid use case, but I do not think it is a typical use
> case. In my experience, when people want to force the planner to do
> something, they really mean it. They don't mean "please do it this way
> unless you really, really don't feel like it." They mean "please do it
> this way, period." And that is also what other systems provide. Oracle
> could provide a hint MERGE_COST(foo,10) meaning make merge joins look
> ten times as expensive but in fact they only provide MERGE and
> NO_MERGE. And a "reproduce this previous plan" feature really demands
> infrastructure that truly forces the planner to do what it's told,
> rather than just nicely suggesting that it might want to do as it's
> told. I wouldn't be sad at all if we happen to end up with a system
> that's powerful enough for an extension to implement "make merge joins
> ten times as expensive"; in fact, I think that would be pretty cool.
> But I don't think it should be the design center for what we
> implement, because it looks nothing like what existing PG or non-PG
> systems do, at least in my experience.
Heh, I meant automatic use by extensions, not manual use.

-- 
regards, Andrei Lepikhov
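
Editorial illustration (not part of the original message): the difference
between a cost multiplier and a disable-style penalty, as debated above, can
be sketched with a toy model. This is not PostgreSQL code. The function
pick_join, the strategy names, the sample costs and the DISABLE_COST value
are all hypothetical, chosen only to echo the "10E9" and "multiplier == 10"
figures mentioned in the thread.

```python
# Toy model, not PostgreSQL code: every name and number here is hypothetical.
DISABLE_COST = 10e9  # a penalty so large that it behaves like an off switch


def pick_join(costs, multipliers=None, disabled=()):
    """Return the cheapest join strategy after adjusting the raw costs."""
    multipliers = multipliers or {}
    adjusted = {}
    for name, cost in costs.items():
        cost *= multipliers.get(name, 1.0)  # soft: rescale the estimate
        if name in disabled:
            cost += DISABLE_COST            # hard: effectively forbid it
        adjusted[name] = cost
    return min(adjusted, key=adjusted.get)


costs = {"merge": 100.0, "hash": 400.0, "nestloop": 5000.0}

baseline = pick_join(costs)                           # no adjustment
soft = pick_join(costs, multipliers={"merge": 10.0})  # merge looks 10x dearer
hard = pick_join(costs, disabled={"merge"})           # merge is switched off

# A multiplier only tilts the balance: with a wide enough gap, merge still wins.
wide_gap = pick_join({"merge": 100.0, "hash": 2000.0},
                     multipliers={"merge": 10.0})
```

In this sketch the multiplier changes the outcome only when the alternatives
are within a factor of ten of each other, whereas the penalty removes the
strategy from consideration whenever any other option exists. That is the
"soft nudge" versus "please do it this way, period" distinction drawn in the
exchange above.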



