Re: PoC/WIP: Extended statistics on expressions - Mailing list pgsql-hackers

From Justin Pryzby
Subject Re: PoC/WIP: Extended statistics on expressions
Msg-id 20210901193816.GO26465@telsasoft.com
In response to Re: PoC/WIP: Extended statistics on expressions  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses Re: PoC/WIP: Extended statistics on expressions  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
List pgsql-hackers
On Wed, Sep 01, 2021 at 06:45:29PM +0200, Tomas Vondra wrote:
> > > Patch 0001 fixes the "double parens" issue discussed elsewhere in this
> > > thread, and patch 0002 tweaks CREATE STATISTICS to treat "(a)" as a simple
> > > column reference.
> > 
> > 0002 refuses to create expressional stats on a simple column reference like
> > (a), which I think helps to avoid a user accidentally creating useless ext
> > stats objects (which are redundant with the table's column stats).
> > 
> > 0002 does not attempt to refuse cases like (a+0), which I think is fine:
> > we don't try to reject useless cases if someone insists on it.
> > See 240971675, 701fd0bbc.
> > 
> > So I am +1 to apply both patches.
> > 
> > I added this as an Opened Item for increased visibility.
> 
> I've pushed both fixes, so the open item should be resolved.

Thank you - I marked it as such.

There are some typos in 537ca68db (e.g. "refenrece").
I'll add them to my typos branch if you don't want to patch them right now, or
if you'd rather wait to see whether someone notices anything else.

diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
index 59369f8736..17cbd97808 100644
--- a/src/backend/commands/statscmds.c
+++ b/src/backend/commands/statscmds.c
@@ -205,27 +205,27 @@ CreateStatistics(CreateStatsStmt *stmt)
     numcols = list_length(stmt->exprs);
     if (numcols > STATS_MAX_DIMENSIONS)
         ereport(ERROR,
                 (errcode(ERRCODE_TOO_MANY_COLUMNS),
                  errmsg("cannot have more than %d columns in statistics",
                         STATS_MAX_DIMENSIONS)));
 
     /*
      * Convert the expression list to a simple array of attnums, but also keep
      * a list of more complex expressions.  While at it, enforce some
      * constraints - we don't allow extended statistics on system attributes,
-     * and we require the data type to have less-than operator.
+     * and we require the data type to have a less-than operator.
      *
-     * There are many ways how to "mask" a simple attribute refenrece as an
+     * There are many ways to "mask" a simple attribute reference as an
      * expression, for example "(a+0)" etc. We can't possibly detect all of
-     * them, but we handle at least the simple case with attribute in parens.
+     * them, but we handle at least the simple case with the attribute in parens.
      * There'll always be a way around this, if the user is determined (like
      * the "(a+0)" example), but this makes it somewhat consistent with how
      * indexes treat attributes/expressions.
      */
     foreach(cell, stmt->exprs)
     {
         StatsElem  *selem = lfirst_node(StatsElem, cell);
 
         if (selem->name)        /* column reference */
         {
             char       *attname;
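
To illustrate the distinction that the comment above draws, here is a toy, string-based sketch. It is not the statscmds.c logic, which works on parsed StatsElem nodes rather than text. The names ElemKind and classify_stats_elem are hypothetical and exist only for this example. Redundant outer parentheses around a bare column name are stripped, so "(a)" counts as a column reference, while anything else, such as "(a+0)", is left as an expression.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical classification, for illustration only. */
typedef enum { ELEM_COLUMN, ELEM_EXPRESSION } ElemKind;

/* Shrink [*start, *end) so it has no leading or trailing whitespace. */
static void
trim(const char **start, const char **end)
{
    while (*start < *end && isspace((unsigned char) **start))
        (*start)++;
    while (*end > *start && isspace((unsigned char) (*end)[-1]))
        (*end)--;
}

/* True if the '(' at start is closed by the ')' at end - 1. */
static bool
outer_parens_match(const char *start, const char *end)
{
    int         depth = 0;

    if (end - start < 2 || *start != '(' || end[-1] != ')')
        return false;
    for (const char *p = start; p < end; p++)
    {
        if (*p == '(')
            depth++;
        else if (*p == ')')
        {
            depth--;
            /* closed before the last character, as in "(a)+(b)" */
            if (depth == 0 && p != end - 1)
                return false;
        }
    }
    return depth == 0;
}

/* True if [start, end) looks like a plain unquoted identifier. */
static bool
is_identifier(const char *start, const char *end)
{
    if (start == end)
        return false;
    if (!isalpha((unsigned char) *start) && *start != '_')
        return false;
    for (const char *p = start + 1; p < end; p++)
    {
        if (!isalnum((unsigned char) *p) && *p != '_')
            return false;
    }
    return true;
}

/* Strip redundant outer parens, then check for a bare column name. */
ElemKind
classify_stats_elem(const char *text)
{
    const char *start = text;
    const char *end = text + strlen(text);

    trim(&start, &end);
    while (outer_parens_match(start, end))
    {
        start++;
        end--;
        trim(&start, &end);
    }
    return is_identifier(start, end) ? ELEM_COLUMN : ELEM_EXPRESSION;
}
```

With this sketch, "a", "(a)" and "((a))" are all classified as ELEM_COLUMN, while "(a+0)" remains ELEM_EXPRESSION, which mirrors the "determined user" workaround mentioned in the comment.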


