Re: PoC/WIP: Extended statistics on expressions - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: PoC/WIP: Extended statistics on expressions
Date
Msg-id 096d26e1-f053-a039-047d-a050dbb80ac6@enterprisedb.com
Whole thread Raw
In response to Re: PoC/WIP: Extended statistics on expressions  (Dean Rasheed <dean.a.rasheed@gmail.com>)
Responses Re: PoC/WIP: Extended statistics on expressions  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
List pgsql-hackers

On 3/26/21 12:37 PM, Dean Rasheed wrote:
> On Thu, 25 Mar 2021 at 19:59, Tomas Vondra
> <tomas.vondra@enterprisedb.com> wrote:
>>
>> Attached is an updated patch series, with all the changes discussed
>> here. I've cleaned up the ndistinct stuff a bit more (essentially
>> reverting back from GroupExprInfo to GroupVarInfo name), and got rid of
>> the UpdateStatisticsForTypeChange.
>>
> 
> I've looked over all that and I think it's in pretty good shape. I
> particularly like how much simpler the ndistinct code has now become.
> 
> Some (hopefully final) review comments:
> 
> 1). I don't think index.c is the right place for
> StatisticsGetRelation(). I appreciate that it is very similar to the
> adjacent IndexGetRelation() function, but index.c is really only for
> index-related code, so I think StatisticsGetRelation() should go in
> statscmds.c
> 

Ah, right, I forgot about this. I wonder if we should add
catalog/statistics.c, similar to catalog/index.c (instead of adding it
locally to statscmds.c).

> 2). Perhaps the error message at statscmds.c:293 should read
> 
>    "expression cannot be used in multivariate statistics because its
> type %s has no default btree operator class"
> 
> (i.e., add the word "multivariate", since the same expression *can* be
> used in univariate statistics even though it has no less-than
> operator).
> 
> 3). The comment for ATExecAddStatistics() should probably mention that
> "ALTER TABLE ADD STATISTICS" isn't a command in the grammar, in a
> similar way to other similar functions, e.g.:
> 
> /*
>  * ALTER TABLE ADD STATISTICS
>  *
>  * This is no such command in the grammar, but we use this internally to add
>  * AT_ReAddStatistics subcommands to rebuild extended statistics after a table
>  * column type change.
>  */
> 
> 4). The comment at the start of ATPostAlterTypeParse() needs updating
> to mention CREATE STATISTICS statements.
> 
> 5). I think ATPostAlterTypeParse() should also attempt to preserve any
> COMMENTs attached to statistics objects, i.e., something like:
> 
> --- src/backend/commands/tablecmds.c.orig    2021-03-26 10:39:38.328631864 +0000
> +++ src/backend/commands/tablecmds.c    2021-03-26 10:47:21.042279580 +0000
> @@ -12619,6 +12619,9 @@
>              CreateStatsStmt  *stmt = (CreateStatsStmt *) stm;
>              AlterTableCmd *newcmd;
> 
> +            /* keep the statistics object's comment */
> +            stmt->stxcomment = GetComment(oldId, StatisticExtRelationId, 0);
> +
>              newcmd = makeNode(AlterTableCmd);
>              newcmd->subtype = AT_ReAddStatistics;
>              newcmd->def = (Node *) stmt;
> 
> 6). Comment typo at extended_stats.c:2532 - s/statitics/statistics/
> 
> 7). I don't think that the big XXX comment near the start of
> estimate_multivariate_ndistinct() is really relevant anymore, now that
> the code has been simplified and we no longer extract Vars from
> expressions, so perhaps it can just be deleted.
> 

Thanks! I'll fix these, and then will consider getting it committed
sometime later today, once the buildfarm does some testing on the other
stuff I already committed.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: "Joel Jacobson"
Date:
Subject: Re: [PATCH] pg_permissions
Next
From: Tomas Vondra
Date:
Subject: Re: WIP: BRIN multi-range indexes