Re: Hook for Selectivity Estimation in Query Planning - Mailing list pgsql-hackers

From Andrei Lepikhov
Subject Re: Hook for Selectivity Estimation in Query Planning
Msg-id 478ee7e5-6ab9-4fe5-b782-e1210e0e3db5@gmail.com
In response to Re: Hook for Selectivity Estimation in Query Planning  (Aleksander Alekseev <aleksander@timescale.com>)
Responses Re: Hook for Selectivity Estimation in Query Planning
List pgsql-hackers
On 5/3/2025 14:29, Aleksander Alekseev wrote:
> Hi,
> 
>> I would like to discuss the introduction of a hook for evaluating the
>> selectivity of an expression when searching for an optimal query plan.
>> This topic has been brought up in various discussions, for example, in [1].
>>
>> [...]
> 
> As I vaguely recall recent proposals like this ("Pluggable TOASTer" to
> name one) this approach was criticised. Hooks per se don't add value
> for the end user. They only put the burden of maintaining them on the
> community while all the real features are implemented in proprietary
> extensions. If you believe something is missing in Postgres,
> contribute it to the upstream so that anyone will benefit from it.

At first, I couldn't find a stated rationale for the hooks that already
exist in the core. However, it's clear that hooks speed up the
development of extensions, which in turn improves the usability and
popularity of the project. This leads to a greater number of use cases
and tests, fostering community growth. I'm not sure what the purpose of
the project is beyond curiosity, but even then, extensions speed up the
idea validation process, don't they?

It's important to remember that not all extensions are proprietary. Does
TimescaleDB not provide value to both end users and the community?

Furthermore, extensions are necessary to address gaps that the community 
may not work on by definition; for example, consider pg_hint_plan.

As I mentioned, the primary purpose of the hook is clear: to advance the 
development of alternative statistics and estimation methods. For 
instance, I've already come across proposals for multidimensional 
histograms. Personally, I want to use this hook to implement a zonal
ndistinct statistics extension to address the intra-column data skew issue.

Overall, I see that new hooks allow new (sometimes open-source) projects
and startups to emerge; I'm not sure about the benefits for enterprises.
Therefore, I'm not convinced by your current justification. Are there
any technical objections?

-- 
regards, Andrei Lepikhov


