Re: Hook for Selectivity Estimation in Query Planning - Mailing list pgsql-hackers

From Andrei Lepikhov
Subject Re: Hook for Selectivity Estimation in Query Planning
Date
Msg-id cff6d26c-9cb8-4818-93b3-f4da78688abf@gmail.com
In response to Re: Hook for Selectivity Estimation in Query Planning  (Aleksander Alekseev <aleksander@timescale.com>)
List pgsql-hackers
On 5/3/2025 19:50, Aleksander Alekseev wrote:
> Andrei, Matthias,
> 
>> Could you explain why you think the Pluggable TOASTer proposal was similar?
>> [...]
> 
> I merely pointed out that adding hooks without any particular value
> for the Postgres users was criticized before, see for instance:
Thank you for your feedback. Rational criticism is always welcome. Let’s 
aim to clarify the actual objectives:
> 
> https://www.postgresql.org/message-id/20230206104917.sipa7nzue5lw2e6z%40alvherre.pgsql
> 
> One could argue - but wait, isn't TAM for instance just a bunch of
> hooks in a nutshell? How do we distinguish a well-documented and more
> or less stable API for the extension authors from a random hook put in
> a convenient place? That's a good question. I don't have an answer to
> it. This being said, the proposed patch doesn't strike me as a good or
> documented API, or the one that is going to be stable in the long run.
1. **Documentation** - Agreed. I think it's feasible to create 
documentation based on the examples. However, we should first decide on 
the main subject, don't you think?

2. **'Good API'** - I'm not sure I follow. Could you clarify 
what you mean by "good API"? What qualifies as a good API, why do you 
feel that the current changes are bad, and how can we improve them?

3. **'Stable'** - Why do you believe it is unstable? As I mentioned, 
this is the first hook that allows us to influence the optimiser's 
behaviour. The existing path hooks only let us offer the planner 
alternative paths; we still have to trust that it knows best how to 
choose among them. I suggest we enable developers to enhance prediction 
quality without having to create a new planner. The rationale behind 
this is quite clear: specific workloads may require more sophisticated 
estimation algorithms, which would be excessive for a general-purpose 
planner.

As you can imagine, I would like to hook into cardinality predictions or 
tweak cost functions (see Apache Calcite), but that approach is invasive 
and unstable since each node, whether existing or newly introduced, 
would require such a call. In contrast, the selectivity estimation 
function serves as a central point for estimations, necessitating only 
one call. I believe we could consider adding a reference to `RelOptInfo` 
in the future, as has been briefly mentioned in discussions among 
developers before. For now, though, this seems sufficient for the 
purpose of database statistics.
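
To make the "single central point" argument concrete, here is a minimal, 
standalone sketch of the pattern. It is not the patch and not PostgreSQL 
code: `Clause`, `selectivity_hook`, `estimate_selectivity` and 
`skewed_column_hook` are hypothetical names invented for illustration, 
and the convention that a negative return value means "decline" is an 
assumption of this sketch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a planner clause; not a PostgreSQL type. */
typedef struct Clause
{
    const char *column;               /* column the clause filters on */
    double      default_selectivity;  /* what built-in statistics would yield */
} Clause;

/* Assumed convention: return a value in [0, 1], or a negative value to decline. */
typedef double (*selectivity_hook_type) (const Clause *clause);

/* NULL means no extension has installed a hook. */
static selectivity_hook_type selectivity_hook = NULL;

/* Keep any estimate inside the valid selectivity range. */
static double
clamp01(double s)
{
    return s < 0.0 ? 0.0 : (s > 1.0 ? 1.0 : s);
}

/*
 * The single central entry point: every caller that needs a selectivity
 * goes through here, so the hook is consulted in exactly one place.
 */
static double
estimate_selectivity(const Clause *clause)
{
    if (selectivity_hook != NULL)
    {
        double s = selectivity_hook(clause);

        if (s >= 0.0)
            return clamp01(s);
    }
    /* No hook installed, or the hook declined: use the built-in estimate. */
    return clamp01(clause->default_selectivity);
}

/* Hypothetical extension: overrides one column it knows is heavily skewed. */
static double
skewed_column_hook(const Clause *clause)
{
    if (strcmp(clause->column, "status") == 0)
        return 0.9;
    return -1.0;                      /* decline for everything else */
}
```

Assigning `selectivity_hook = skewed_column_hook;` changes the estimate 
for clauses on `status` and leaves every other clause on the built-in 
path, without touching any per-node code.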
> 
>> [...]
>>
>> Overall, I see that new hooks allow new [sometimes] open-source projects
>> and startups to emerge - not sure about enterprises' benefits.
>> Therefore, I'm not convinced by your current justification. Are there
>> any technical objections?
> 
> There is no point in debating about good and evil or right and wrong.
> The only important question is whether there will be a committer
> willing to accept the proposed change considering its controversy.
It would be interesting to see what type of controversy you see here. I 
think it will be clearer after you answer the previous questions.

-- 
regards, Andrei Lepikhov


