On Tue, Jan 14, 2020 at 04:21:57PM -0500, Tom Lane wrote:
>Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>> On Tue, Jan 14, 2020 at 03:12:21PM -0500, Tom Lane wrote:
>>> cc'ing Tomas in case he has any thoughts about it.
>
>> Well, I certainly do have thoughts about this - it's pretty much exactly what
>> I proposed yesterday in this thread:
>> https://www.postgresql.org/message-id/flat/20200113230008.g67iyk4cs3xbnjju@development
>> The third part of that patch series is exactly about supporting extended
>> statistics on expressions, about the way you described here. The current
>> status of the WIP patch is that grammar + ANALYZE mostly works, but
>> there is no support in the planner. It's obviously still very hackish.
>
>Cool. We should probably take the discussion to that thread, then.
>
>> I'm also wondering if we could/should 100% rely on extended statistics,
>> because those are really meant to track correlations between columns,
>
>Yeah, it seems likely to me that the infrastructure for this would be
>somewhat different --- the user-facing syntax could be basically the
>same, but ultimately we want to generate entries in pg_statistic not
>pg_statistic_ext_data. Or at least entries that look the same as what
>you could find in pg_statistic.
>
Yeah. I think we could invent a new type of statistics, "expressions",
which would simply build these per-column stats. So for example

    CREATE STATISTICS s (expressions) ON (a*b), sqrt(c) FROM t;

would build per-column stats stored in pg_statistic, while

    CREATE STATISTICS s (mcv) ON (a*b), sqrt(c) FROM t;

would build the multi-column MCV list on expressions.

regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services