On Wed, Oct 13, 2021 at 05:20:56PM +0000, Bossart, Nathan wrote:
> AFAICT the fact that these commands can succeed at all seems to be
> unintentional, and I wonder if modifying these options requires extra
> steps such as rebuilding the index.
I have been looking at all this business more closely, and this code
block stands out in analyze.c:
    /*
     * Now we can compute the statistics for the expression columns.
     */
    if (numindexrows > 0)
    {
        MemoryContextSwitchTo(col_context);
        for (i = 0; i < attr_cnt; i++)
        {
            VacAttrStats *stats = thisdata->vacattrstats[i];
            AttributeOpts *aopt =
                get_attribute_options(stats->attr->attrelid,
                                      stats->attr->attnum);

            stats->exprvals = exprvals + i;
            stats->exprnulls = exprnulls + i;
            stats->rowstride = attr_cnt;
            stats->compute_stats(stats,
                                 ind_fetch_func,
                                 numindexrows,
                                 totalindexrows);

            /*
             * If the n_distinct option is specified, it overrides the
             * above computation.  For indices, we always use just
             * n_distinct, not n_distinct_inherited.
             */
            if (aopt != NULL && aopt->n_distinct != 0.0)
                stats->stadistinct = aopt->n_distinct;

            MemoryContextResetAndDeleteChildren(col_context);
        }
    }
When computing statistics on an index expression, this code means that
we would grab the value of n_distinct from the *index* if set and
force the stats to use it, and not use what the parent table has. For
example, say:
    create table aa (a int);
    insert into aa values (generate_series(1,1000));
    create index aai on aa((a+a)) where a > 500;
    alter index aai alter column expr set (n_distinct = 2);
    analyze aa; -- n_distinct forced to 2.0 for the index stats
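To check the effect on a version where the ALTER INDEX above is
accepted, a query along these lines should do. This is only a sketch: it
assumes that the stats of the expression column show up in pg_stats
under the index name, with "expr" as attribute name:

```sql
-- Sketch only: look at the n_distinct stored for the expression column
-- of the index "aai" after ANALYZE.  Assumes the row is listed in
-- pg_stats with the index name as tablename and "expr" as attname.
select tablename, attname, n_distinct
  from pg_stats
  where tablename = 'aai' and attname = 'expr';
```

The row for the parent table "aa" is separate, so comparing both shows
which value came from the index-level option.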
The analyze.c code comes from 76a47c0 back in 2010. In PG <= 12, this
works, but it does not anymore as of 13~. Enforcing n_distinct for index
attributes was discussed back when this code was introduced:
https://www.postgresql.org/message-id/603c8f071001101127w3253899vb3f3e15073638774@mail.gmail.com
This means that we've lost the ability to enforce n_distinct for
expression indexes for two years now. But do we really care about this
case? My answer to that would be "no" as long as we don't have a
documented grammar, and we don't dump them either. But I think
that we'd better do something with the code in analyze.c rather than
just leaving it dead, and my take is that we should remove the call to
get_attribute_options() for this code path.
Any opinions? @Robert: you were involved in 76a47c0, so I am adding
you in CC.
--
Michael