Re: PoC/WIP: Extended statistics on expressions - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: PoC/WIP: Extended statistics on expressions |
Date | |
Msg-id | e5beb5d5-28b4-7aad-3a16-9a6abf57f897@enterprisedb.com Whole thread Raw |
In response to | Re: PoC/WIP: Extended statistics on expressions (Justin Pryzby <pryzby@telsasoft.com>) |
Responses |
Re: PoC/WIP: Extended statistics on expressions
Re: PoC/WIP: Extended statistics on expressions |
List | pgsql-hackers |
On 11/23/20 3:26 AM, Justin Pryzby wrote: > On Sun, Nov 22, 2020 at 08:03:51PM +0100, Tomas Vondra wrote: >> attached is a significantly improved version of the patch, allowing >> defining extended statistics on expressions. This fixes most of the >> problems in the previous WIP version and AFAICS it does pass all >> regression tests (including under valgrind). There's a bunch of FIXMEs >> and a couple loose ends, but overall I think it's ready for reviews. > > I was looking at the previous patch, so now read this one instead, and attach > some proposed fixes. > > + * This matters especially for * expensive expressions, of course. > The point this was trying to make is that we evaluate the expressions only once, and use the results to build all extended statistics. Instead of leaving it up to every "build" to re-evaluate it. > + The expression can refer only to columns of the underlying table, but > + it can use all columns, not just the ones the statistics is defined > + on. > > I don't know what these are trying to say? > D'oh. That's bogus para, copied from the CREATE INDEX docs (where it talked about the index predicate, which is irrelevant here). > + errmsg("statistics expressions and predicates can refer only to the table being indexed"))); > + * partial-index predicates. Create it in the per-index context to be > > I think these are copied and shouldn't mention "indexes" or "predicates". Or > should statistics support predicates, too ? > Right. Stupid copy-pasto. > Idea: if a user specifies no stakinds, and there's no expression specified, > then you automatically build everything except for expressional stats. But if > they specify only one statistics "column", it gives an error. If that's a > non-simple column reference, should that instead build *only* expressional > stats (possibly with a NOTICE, since the user might be intending to make MV > stats). > Right, that was the intention - but I messed up and it works only if you specify the "expressions" kind explicitly (and I also added the ERROR message to expected output by mistake). I agree we should handle this automatically, so that CREATE STATISTICS s ON (a+b) FROM t works and only creates statistics for the expression. > I think pg_stats_ext should allow inspecting the pg_statistic data in > pg_statistic_ext_data.stxdexprs. I guess array_agg() should be ordered by > something, so maybe it should use ORDINALITY (?) > I agree we should expose the expression statistics, but I'm not convinced we should do that in the pg_stats_ext view itself. The problem is that it's a table bested in a table, essentially, with non-trivial structure, so I was thinking about adding a separate view exposing just this one part. Something like pg_stats_ext_expressions, with about the same structure as pg_stats, or something. > I hacked more on bootstrap.c so included that here. Thanks. As for the 0004-0007 patches: 0004 - Seems fine. IMHO not really "silly errors" but OK. 0005 - Mostly OK. The docs wording mostly comes from CREATE INDEX docs, though. The paragraph about "t1" is old, so if we want to reword it then maybe we should backpatch too. 0006 - Not sure. I think CreateStatistics can be fixed with less code, keeping it more like PG13 (good for backpatching). Not sure why rename extended statistics to multi-variate statistics - we use "extended" everywhere. Not sure what's the point of serialize_expr_stats changes, that's code is mostly copy-paste from update_attstats. 0007 - I suspect this makes the pg_stats_ext too complex to work with, IMHO we should move this to a separate view. Thanks for the review! I'll try to look more closely at those patches sometime next week, and merge most of it. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: