Re: PoC/WIP: Extended statistics on expressions - Mailing list pgsql-hackers

From Justin Pryzby
Subject Re: PoC/WIP: Extended statistics on expressions
Date
Msg-id 20210108023537.GA19743@telsasoft.com
In response to Re: PoC/WIP: Extended statistics on expressions  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses Re: PoC/WIP: Extended statistics on expressions  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
List pgsql-hackers
On Fri, Jan 08, 2021 at 01:57:29AM +0100, Tomas Vondra wrote:
> Attached is a patch fixing most of the issues. There are a couple
> exceptions:

In the docs:

+   — at the cost that its schema must be extended whenever the structure
+   of statistics <link linkend="catalog-pg-statistic"><structname>pg_statistic</structname></link> changes.

should say "of statistics *IN* pg_statistic changes" ?

+   to an expression index. The full variant allows defining statistics objects
+   on multiple columns and expressions, and pick which statistics kinds will
+   be built. The per-expression statistics are built automatically when there

"and pick" is wrong - maybe say "and selecting which.."

+   and run a query using an expression on that column.  Without the

remove "the" ?

+   extended statistics, the planner has no information about data
+   distribution for reasults of those expression, and uses default

*results

+   estimates as illustrated by the first query.  The planner also does
+   not realize the value of the second column fully defines the value
+   of the other column, because date truncated to day still identifies
+   the month). Then expression and ndistinct statistics are built on

The ")" is unbalanced

+                               /* all parts of thi expression are covered by this statistics */

this

+ * GrouExprInfos, but only if it's not known equal to any of the existing

Group

+        * we don't allow specifying any statistis kinds.  The simple variant

statistics

+        * If no statistic type was specified, build them all (but request

Say "kind" not "type" ?

+ * expression is a simple Var. OTOH we check that there's at least one
+ * statistics matching the expression.

one statistic (singular) ?

+                * the future, we might consider
+                */

consider ???

+-- (not it fails, when there are no simple column references)

note?

There's some remaining copy/paste stuff from index expressions:

errmsg("statistics expressions and predicates can refer only to the table being indexed")));
left behind by evaluating the predicate or index expressions.
Set up for predicate or expression evaluation
Need an EState for evaluation of index expressions and
/* Compute and save index expression values */
left behind by evaluating the predicate or index expressions.
Fetch function for analyzing index expressions.
partial-index predicates.  Create it in the per-index context to be
* When analyzing an expression index, believe the expression tree's type

-- 
Justin


