
From Justin Pryzby
Subject Re: PoC/WIP: Extended statistics on expressions
Date
Msg-id 20210117052954.GC8560@telsasoft.com
In response to Re: PoC/WIP: Extended statistics on expressions  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses Re: PoC/WIP: Extended statistics on expressions  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
List pgsql-hackers
On Sun, Jan 17, 2021 at 01:23:39AM +0100, Tomas Vondra wrote:
> diff --git a/doc/src/sgml/ref/create_statistics.sgml b/doc/src/sgml/ref/create_statistics.sgml
> index 4363be50c3..5b8eb8d248 100644
> --- a/doc/src/sgml/ref/create_statistics.sgml
> +++ b/doc/src/sgml/ref/create_statistics.sgml
> @@ -21,9 +21,13 @@ PostgreSQL documentation
>  
>   <refsynopsisdiv>
>  <synopsis>
> +CREATE STATISTICS [ IF NOT EXISTS ] <replaceable class="parameter">statistics_name</replaceable>
> +    ON ( <replaceable class="parameter">expression</replaceable> )
> +    FROM <replaceable class="parameter">table_name</replaceable>

>  CREATE STATISTICS [ IF NOT EXISTS ] <replaceable class="parameter">statistics_name</replaceable>
>      [ ( <replaceable class="parameter">statistics_kind</replaceable> [, ... ] ) ]
> -    ON <replaceable class="parameter">column_name</replaceable>, <replaceable class="parameter">column_name</replaceable>[, ...]
 
> +    ON { <replaceable class="parameter">column_name</replaceable> | ( <replaceable class="parameter">expression</replaceable>) } [, ...]
 
>      FROM <replaceable class="parameter">table_name</replaceable>
>  </synopsis>

I think this part is wrong, since it's not possible to specify a single column
in form#2.

If I'm right, the current way is:

 - form#1 allows expression statistics of a single expression, and doesn't
   allow specifying "kinds";

 - form#2 allows specifying "kinds", but always computes expression statistics,
   and requires multiple columns/expressions.

So the synopsis for form#2 would need to be:
column_name|expression, column_name|expression [,...]
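
To illustrate the distinction, here is a minimal sketch. The table t and the
statistics names s1..s3 are hypothetical, and the behavior noted in the
comments follows from the description of the two forms above, not from a test
run against the patch:

```sql
-- Hypothetical table, used only for illustration.
CREATE TABLE t (a int, b int);

-- form#1: a single expression, no list of kinds.
CREATE STATISTICS s1 ON (a + b) FROM t;

-- form#2: optional list of kinds, at least two columns/expressions.
CREATE STATISTICS s2 (mcv) ON a, (b + 1) FROM t;

-- A single plain column fits neither form, so this should be rejected,
-- even though the proposed synopsis would appear to allow it.
CREATE STATISTICS s3 (mcv) ON a FROM t;
```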

> @@ -39,6 +43,16 @@ CREATE STATISTICS [ IF NOT EXISTS ] <replaceable class="parameter">statistics_na
>     database and will be owned by the user issuing the command.
>    </para>
>  
> +  <para>
> +   The <command>CREATE STATISTICS</command> command has two basic forms. The
> +   simple variant allows building statistics for a single expression, does
> +   not allow specifying any statistics kinds and provides benefits similar
> +   to an expression index. The full variant allows defining statistics objects
> +   on multiple columns and expressions, and selecting which statistics kinds will
> +   be built. The per-expression statistics are built automatically when there
> +   is at least one expression.
> +  </para>

> +   <varlistentry>
> +    <term><replaceable class="parameter">expression</replaceable></term>
> +    <listitem>
> +     <para>
> +      The expression to be covered by the computed statistics. In this case
> +      only a single expression is required, in which case only the expression
> +      statistics kind is allowed. The order of expressions is insignificant.

I think this part is wrong now.
As far as I can tell, there's no user-facing "expression" kind left in the
CREATE command.  I take "in which case" to mean "if only one expression is
specified", but this entry describes "expression" for both form#1 and form#2.

Maybe it should just say:
> +      An expression to be covered by the computed statistics.

Maybe somewhere else, say:
> In the second form of the command, the order of expressions is insignificant.

-- 
Justin


