Re: BUG #18885: ERROR: corrupt MVNDistinct entry - 2 - Mailing list pgsql-bugs

From David Rowley
Subject Re: BUG #18885: ERROR: corrupt MVNDistinct entry - 2
Date
Msg-id CAApHDvocZCUhM9W9mJ39d6oQz7ePKoqFnao_347mvC-A7QatcQ@mail.gmail.com
In response to Re: BUG #18885: ERROR: corrupt MVNDistinct entry - 2  (Tomas Vondra <tomas@vondra.me>)
Responses Re: BUG #18885: ERROR: corrupt MVNDistinct entry - 2
List pgsql-bugs
On Fri, 11 Apr 2025 at 01:31, Tomas Vondra <tomas@vondra.me> wrote:
> I think estimate_multivariate_bucketsize() needs to be more careful
> about building the GroupVarInfo list - in particular, it needs to do the
> dance with examine_variable + add_unique_group_var + pull_var_clause,
> similar to estimate_num_groups() at line ~3532.

This requirement should be documented so that future callers of
estimate_multivariate_ndistinct() don't fall into the same trap.

The attached aims to do this.  I also couldn't resist a few other improvements.

There are a few strange goings-on in the code itself that I didn't
adjust. For example, in the first "foreach(lc2, *varinfos)" loop after
the "if (stats)", there's a "found" variable that gets set and used
for no apparent reason. I don't see why the code sets "found = true;"
rather than just doing "continue;". The variable would only be needed
if there were some inner loop where we couldn't use "continue".  I also
can't make sense of the following comment:

/*
 * XXX Maybe we should allow searching the expressions even if we
 * found an attribute matching the expression? That would handle
 * trivial expressions like "(a)" but it seems fairly useless.
 */

Maybe it meant "matching the Var"?

The final loop to build the newlist also looks more complex than it
needs to be. The prior loop over *varinfos could have recorded the
list indexes of the matching GroupVarInfos in a Bitmapset, and that
final loop could then become:

foreach(lc, *varinfos)
{
    if (!bms_is_member(foreach_current_index(lc), matched_varinfos))
        newlist = lappend(newlist, lfirst(lc));
}
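
To make the idea concrete outside the PostgreSQL tree, here is a
minimal standalone sketch of the same pattern. It is only an
illustration and not what the patch does: a uint64_t mask stands in for
the Bitmapset, a plain int array stands in for the List of
GroupVarInfos, and the helper names (mark_matched, is_matched,
keep_unmatched) are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Illustration only: the 64-bit mask is a stand-in for a Bitmapset, so
 * this sketch handles at most 64 items.  All names here are hypothetical.
 */

/* Record that the item at list index 'idx' was matched. */
static uint64_t
mark_matched(uint64_t matched, int idx)
{
    assert(idx >= 0 && idx < 64);
    return matched | (UINT64_C(1) << idx);
}

/* Return 1 if the item at list index 'idx' was recorded as matched. */
static int
is_matched(uint64_t matched, int idx)
{
    assert(idx >= 0 && idx < 64);
    return (int) ((matched >> idx) & 1);
}

/*
 * Copy every item whose index is NOT in 'matched' into 'out', keeping
 * the original order.  Returns the number of items copied.
 */
static size_t
keep_unmatched(const int *items, size_t nitems, uint64_t matched, int *out)
{
    size_t      nout = 0;

    assert(nitems <= 64);

    for (size_t i = 0; i < nitems; i++)
    {
        if (!is_matched(matched, (int) i))
            out[nout++] = items[i];
    }

    return nout;
}
```

The first pass would call mark_matched() whenever it finds a match, and
the final pass then needs only the one membership test per item, with
no second comparison of the items themselves.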

David

