Re: Additional Statistics Hooks - Mailing list pgsql-hackers

From Mat Arye
Subject Re: Additional Statistics Hooks
Date
Msg-id CADsUR0A4W3s1GNHDHAxfY29A-N9DP+sCJ1FCwaNF+Vj04LkiwQ@mail.gmail.com
Whole thread Raw
In response to Re: Additional Statistics Hooks  (Euler Taveira <euler@timbira.com.br>)
Responses Re: Additional Statistics Hooks
List pgsql-hackers


On Mon, Mar 12, 2018 at 2:52 PM, Euler Taveira <euler@timbira.com.br> wrote:
2018-03-12 14:03 GMT-03:00 Mat Arye <mat@timescale.com>:
> I have a question about statistics hooks. I am trying to teach the planner
> that when grouping by something like date_trunc('1 day', time) will produce
> a lot less rows than the number of distinct time values. I want to do that
> in an extension. The problem is that I don't see a way to make the
> get_relation_stats_hook work well fo that since by the time it's called you
> only see the `time` var and not the full expression. None of the other hooks
> seem appropriate either. So 2 questions:
>
Isn't it the case to extend the available hook?

> 1) Would people be opposed to adding a code hook somewhere at the start of
> `examine_variable` (selfuncs.c) to allow creating statistics on complete
> expressions? I can submit a patch if this seems reasonable.
>
If you explain the use case maybe it could be considered.

So the use-case is an analytical query like 

SELECT date_trunc('hour', time) AS MetricMinuteTs, AVG(value) as avg
FROM hyper
WHERE time >= '2001-01-04T00:00:00' AND time <= '2001-01-05T01:00:00'
GROUP BY MetricMinuteTs
ORDER BY MetricMinuteTs DESC;

Right now this query will choose a much-less-efficient GroupAggregate plan
instead of a HashAggregate. It will choose this because it thinks the number of groups
produced here is 9,000,000 because that's the number of distinct time values there are.
But, because date_trunc "buckets" the values there will be about 24 groups (1 for each hour).

 

> 2) Do patches that add code hooks (and are probably under 10 lines) need to
> go through the entire commitfest process. I guess what I am really asking is
> if PG12 would be the first version such a patch could appear in or is PG11
> still a possibility? Just wondering what the policy on such stuff is.
>
If it is a new feature and is not in the last CF, it won't be
considered for v11 (even small patches).


--
   Euler Taveira                                   Timbira -
http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento

pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: JIT compiling with LLVM v11
Next
From: Christos Maris
Date:
Subject: Re: Google Summer of Code: Potential Applicant