Re: PoC/WIP: Extended statistics on expressions - Mailing list pgsql-hackers

From Justin Pryzby
Subject Re: PoC/WIP: Extended statistics on expressions
Msg-id 20210116232208.GB8560@telsasoft.com
In response to Re: PoC/WIP: Extended statistics on expressions  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses Re: PoC/WIP: Extended statistics on expressions  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
List pgsql-hackers
On Sat, Jan 16, 2021 at 05:48:43PM +0100, Tomas Vondra wrote:
> +      <entry role="catalog_table_entry"><para role="column_definition">
> +       <structfield>expr</structfield> <type>text</type>
> +      </para>
> +      <para>
> +       Expression the extended statistics is defined on
> +      </para></entry>

Expression the extended statistics ARE defined on
Or maybe say "on which the extended statistics are defined"

> +  <para>
> +   The <command>CREATE STATISTICS</command> command has two basic forms. The
> +   simple variant allows to build statistics for a single expression, does

.. ALLOWS BUILDING statistics for a single expression, AND does (or BUT does)

> +   Expression statistics are per-expression and are similar to creating an
> +   index on the expression, except that they avoid the overhead of the index.

Maybe say "overhead of index maintenance"

> +   All functions and operators used in a statistics definition must be
> +   <quote>immutable</quote>, that is, their results must depend only on
> +   their arguments and never on any outside influence (such as
> +   the contents of another table or the current time).  This restriction

Say "outside factor" or "external factor"

> +   results of those expression, and uses default estimates as illustrated
> +   by the first query.  The planner also does not realize the value of the

realize THAT

> +   second column fully defines the value of the other column, because date
> +   truncated to day still identifies the month. Then expression and
> +   ndistinct statistics are built on those two columns:

I got an error doing this:

CREATE TABLE t AS SELECT generate_series(1,9) AS i;
CREATE STATISTICS s ON (i+1) ,(i+1+0) FROM t;
ANALYZE t;
SELECT i+1 FROM t GROUP BY 1;
ERROR:  corrupt MVNDistinct entry

-- 
Justin


