Re: Additional Statistics Hooks - Mailing list pgsql-hackers

From	Tom Lane
Subject	Re: Additional Statistics Hooks
Date	March 13, 2018 01:44:32
Msg-id	2822.1520894672@sss.pgh.pa.us
In response to	Re: Additional Statistics Hooks (Mat Arye <mat@timescale.com>)
Responses	Re: Additional Statistics Hooks Re: Additional Statistics Hooks
List	pgsql-hackers

Mat Arye <mat@timescale.com> writes:
> So the use-case is an analytical query like

> SELECT date_trunc('hour', time) AS MetricMinuteTs, AVG(value) as avg
> FROM hyper
> WHERE time >= '2001-01-04T00:00:00' AND time <= '2001-01-05T01:00:00'
> GROUP BY MetricMinuteTs
> ORDER BY MetricMinuteTs DESC;

> Right now this query will choose a much-less-efficient GroupAggregate plan
> instead of a HashAggregate. It will choose this because it thinks the
> number of groups produced here is 9,000,000 because that's the number of
> distinct time values there are.
> But, because date_trunc "buckets" the values there will be about 24 groups
> (1 for each hour).

While it would certainly be nice to have better behavior for that,
"add a hook so users who can write C can fix it by hand" doesn't seem
like a great solution.  On top of the sheer difficulty of writing a
hook function, you'd have the problem that no pre-written hook could
know about all available functions.  I think somehow we'd need a way
to add per-function knowledge, perhaps roughly like the protransform
feature.

            regards, tom lane
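
To make the arithmetic in the quoted use-case concrete, here is a minimal sketch of how the group count for date_trunc('hour', time) is bounded by the width of the WHERE range rather than by the number of distinct time values. It is not PostgreSQL code and not a proposed hook: the function names hourly_bucket_count and capped_group_estimate are hypothetical, the 9,000,000 figure is taken from the quoted message, and the sketch assumes every hour in the range contains at least one row.

```python
from datetime import datetime, timedelta


def hourly_bucket_count(start: datetime, end: datetime) -> int:
    """Count the distinct hour buckets that timestamps in [start, end] can fall into.

    Mirrors what date_trunc('hour', t) would produce for an inclusive range.
    """
    first = start.replace(minute=0, second=0, microsecond=0)
    last = end.replace(minute=0, second=0, microsecond=0)
    return (last - first) // timedelta(hours=1) + 1


def capped_group_estimate(n_distinct_input: int, bucket_count: int) -> int:
    """Hypothetical estimate: grouping on a bucketing function cannot yield
    more groups than there are buckets, nor more than distinct inputs."""
    return min(n_distinct_input, bucket_count)


# Range from the quoted query's WHERE clause (both bounds inclusive).
start = datetime(2001, 1, 4, 0, 0, 0)
end = datetime(2001, 1, 5, 1, 0, 0)

buckets = hourly_bucket_count(start, end)
estimate = capped_group_estimate(9_000_000, buckets)
print(buckets, estimate)
```

Under these assumptions the range spans 26 hour buckets, in line with the "about 24 groups" in the quoted message and far below the 9,000,000 distinct time values. This is the kind of per-function knowledge a planner would need in order to cap the estimate.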
