Hook for Selectivity Estimation in Query Planning - Mailing list pgsql-hackers

From Andrei Lepikhov
Subject Hook for Selectivity Estimation in Query Planning
Msg-id d1867416-198b-418a-be53-7df4e10aec62@gmail.com
Responses Re: Hook for Selectivity Estimation in Query Planning
List pgsql-hackers
Hi,

I would like to discuss the introduction of a hook for evaluating the 
selectivity of an expression when searching for an optimal query plan. 
This topic has been brought up in various discussions, for example, in [1].

Currently, extensions that interact with the optimiser can only add 
their paths without the ability to influence the optimiser's decisions. 
As a result, when developing an extension that implements a new type of 
statistics (such as a histogram for composite types), utilises knowledge 
from previously executed queries, or implements some system of 
selectivity hints, we find ourselves writing a considerable amount of 
code. To make such an extension work reliably, one may end up 
developing a separate optimiser or, at the very least, a custom join 
search (see core.c in the pg_hint_plan extension for an estimate of 
how much code that takes).

A hook for evaluating selectivity could streamline the development of 
methods to improve selectivity evaluation, making it easier to create 
new types of statistics and estimation methods (my own interest is 
join clause estimation). Considering the limited amount of code 
involved and the upcoming code freeze, I propose adding such a hook to 
PostgreSQL 18 to assess how it simplifies extension development.
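
As a rough illustration of the general hook pattern (this is not the 
attached patch), the sketch below models a selectivity hook with 
stand-in types. Everything named here is hypothetical: Clause, 
selectivity_hook, estimate_selectivity(), the fixed 0.5 default, and 
the convention that a negative return value means "no opinion, fall 
back to the built-in estimate". The real planner entry points take 
planner context arguments that are omitted here.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for planner types (hypothetical, not PostgreSQL's real ones). */
typedef double Selectivity;
typedef struct Clause { int kind; } Clause;

/* Hypothetical hook type: return a value in [0,1], or < 0 to decline. */
typedef Selectivity (*selectivity_hook_type)(const Clause *clause);

/* Core side: NULL means no extension has installed a hook. */
selectivity_hook_type selectivity_hook = NULL;

/* Placeholder for the built-in estimator; a fixed default for the sketch. */
static Selectivity
standard_selectivity(const Clause *clause)
{
    (void) clause;
    return 0.5;
}

/* Core entry point: consult the hook first, then fall back. */
Selectivity
estimate_selectivity(const Clause *clause)
{
    Selectivity s = -1.0;

    if (selectivity_hook != NULL)
        s = selectivity_hook(clause);
    if (s < 0.0)
        s = standard_selectivity(clause);   /* hook absent or declined */

    /* Clamp to a valid probability. */
    if (s > 1.0)
        s = 1.0;
    assert(s >= 0.0 && s <= 1.0);
    return s;
}

/* Extension side: remember any previously installed hook and chain to it. */
static selectivity_hook_type prev_selectivity_hook = NULL;

static Selectivity
hinted_selectivity(const Clause *clause)
{
    /* Assumption: kind 42 marks a clause the extension has a hint for. */
    if (clause->kind == 42)
        return 0.01;
    if (prev_selectivity_hook != NULL)
        return prev_selectivity_hook(clause);
    return -1.0;                            /* decline */
}

/* Would be called once from the extension's load-time initialisation. */
void
install_hint_hook(void)
{
    prev_selectivity_hook = selectivity_hook;
    selectivity_hook = hinted_selectivity;
}
```

With this shape, an extension only supplies estimates for the clauses 
it knows something about and leaves everything else to the core 
estimator, which is the main saving compared with replacing the join 
search.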

This proposed hook would complement the existing path hooks without 
overlapping in functionality. In my experience with implementing 
adaptive features in enterprise solutions, I believe that additional 
hooks could also be beneficial for estimating the number of groups and 
the amount of memory allocated, which is currently based solely on 
work_mem. However, these suggestions do not interfere with the current 
proposal and could be considered later.

Critique:
In general, a hook for evaluating the number of rows appears to be a 
more promising approach. It would allow the extension to access specific 
RelOptInfo data, thus providing insights into where the evaluation takes 
place within the plan. Consequently, this would enable a deeper 
influence on the query plan choice. However, implementing such a hook 
might be more invasive, requiring modifications to each cost function. 
Additionally, it addresses a slightly different issue and can be 
considered separately.

Attached is a patch containing the proposed hook code.

-- 
regards, Andrei Lepikhov

