Thread: BUG #18885: ERROR: corrupt MVNDistinct entry - 2

BUG #18885: ERROR: corrupt MVNDistinct entry - 2

From
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      18885
Logged by:          Robins Tharakan
Email address:      tharakan@gmail.com
PostgreSQL version: Unsupported/Unknown
Operating system:   Ubuntu
Description:

The following SQL triggers "ERROR: corrupt MVNDistinct entry", however this
seems to be unrelated to a recent bugfix[1] for a similar issue (thus '2' in
the $SUBJECT).

The surfacing commit appears to be 6bb6a62f3cc45624c601d5270673a17447734629
[2].

SQL
===
$ cat ../sqith/repro.sql
CREATE TABLE a(b BIT VARYING, d int4range);
CREATE TABLE e(LIKE a);
CREATE TABLE f(LIKE a);
INSERT INTO f(d, b) VALUES('[0,0)', '');
CREATE STATISTICS ON d, b FROM f;
ANALYZE;
SELECT FROM f RIGHT JOIN e ON (e.d=f.d)AND(f.d=e.d);


Commit
======
Testing for start crashing
---
Checking (306dd6e727b~0) - 306dd6e727 - fail (1)
Checking (306dd6e727b~10) - 91f1fe90c7 - fail (1)
Checking (306dd6e727b~30) - 969ab9d4f5 - fail (1)
Checking (306dd6e727b~70) - 1495eff7bd - fail (1)
.
.
Checking (306dd6e727b~447) - fae535da0a - pass (0)
Checking (306dd6e727b~446) - 6bb6a62f3c - fail (1)
Surfacing Commit is 6bb6a62f3cc45624c601d5270673a17447734629



SQL Output
==========
$ psql -p 9999 postgres -f ../sqith/repro.sql
CREATE TABLE
CREATE TABLE
CREATE TABLE
INSERT 0 1
CREATE STATISTICS
ANALYZE
psql:../sqith/repro.sql:7: ERROR:  corrupt MVNDistinct entry

Found using SQLSmith / creduce.
-
robins
https://robins.in

Reference
1. Similar bugfix:
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=e28033fe1af8037e0fec8bb3a32fabbe18ac06b1
2. Surfacing commit:
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=6bb6a62f3cc45624c601d5270673a17447734629


Re: BUG #18885: ERROR: corrupt MVNDistinct entry - 2

From
Tomas Vondra
Date:
On 4/9/25 15:46, PG Bug reporting form wrote:
> The following bug has been logged on the website:
> 
> Bug reference:      18885
> Logged by:          Robins Tharakan
> Email address:      tharakan@gmail.com
> PostgreSQL version: Unsupported/Unknown
> Operating system:   Ubuntu
> Description:        
> 
> The following SQL triggers "ERROR: corrupt MVNDistinct entry", however this
> seems to be unrelated to a recent bugfix[1] for a similar issue (thus '2' in
> the $SUBJECT).
> 
> The surfacing commit appears to be 6bb6a62f3cc45624c601d5270673a17447734629
> [2].
> 

Thanks for the report. I only looked at this very briefly, but I agree
this seems to be a bug in 6bb6a62f3cc45624c601d5270673a17447734629,
because it simply ignores the possibility that multiple join clauses
could share the same Var. Which is the case in the reproducer, where the
join clause is simply

    ... ON (e.d=f.d) AND (e.d=f.d);

Which then leads to confusion when matching the MVDistinct entries.

I think estimate_multivariate_bucketsize() needs to be more careful
about building the GroupVarInfo list - in particular, it needs to do the
dance with examine_variable + add_unique_group_var + pull_var_clause,
similar to estimate_num_groups() at line ~3532.


regards

-- 
Tomas Vondra




Re: BUG #18885: ERROR: corrupt MVNDistinct entry - 2

From
Andrei Lepikhov
Date:
On 4/9/25 15:46, PG Bug reporting form wrote:
> The following SQL triggers "ERROR: corrupt MVNDistinct entry", however this
> seems to be unrelated to a recent bugfix[1] for a similar issue (thus '2' in
> the $SUBJECT).
That's my fault.
We wanted to avoid using of the add_unique_group_var, but forgot it does 
some necessary stuff.
Here, I attempt to use this routine in the hash join bucket size 
estimation. I transformed it a little, made it more general. Not sure it 
is the best design, but it is debatable.

-- 
regards, Andrei Lepikhov
Attachment

Re: BUG #18885: ERROR: corrupt MVNDistinct entry - 2

From
Alexander Korotkov
Date:
Hi, Andrei!

On Thu, Apr 10, 2025 at 3:28 PM Andrei Lepikhov <lepihov@gmail.com> wrote:
>
> On 4/9/25 15:46, PG Bug reporting form wrote:
> > The following SQL triggers "ERROR: corrupt MVNDistinct entry", however this
> > seems to be unrelated to a recent bugfix[1] for a similar issue (thus '2' in
> > the $SUBJECT).
> That's my fault.
> We wanted to avoid using of the add_unique_group_var, but forgot it does
> some necessary stuff.
> Here, I attempt to use this routine in the hash join bucket size
> estimation. I transformed it a little, made it more general. Not sure it
> is the best design, but it is debatable.

Thank you for the fix, but it's not enough.

If you would replace a new regression tests query with this, it would
fail with an assertion.

SELECT FROM sb_1 LEFT JOIN sb_2 ON (sb_2.x=sb_1.x) AND (sb_1.x=sb_2.x)
AND (sb_1.y=sb_2.y);

When you use add_unique_group_var() which keeps varinfos unique then
you can no longer expect that varinfos have the same order as
origin_rinfos.

------
Regards,
Alexander Korotkov
Supabase



Re: BUG #18885: ERROR: corrupt MVNDistinct entry - 2

From
David Rowley
Date:
On Fri, 11 Apr 2025 at 01:31, Tomas Vondra <tomas@vondra.me> wrote:
> I think estimate_multivariate_bucketsize() needs to be more careful
> about building the GroupVarInfo list - in particular, it needs to do the
> dance with examine_variable + add_unique_group_var + pull_var_clause,
> similar to estimate_num_groups() at line ~3532.

This should be documented to prevent future callers of
estimate_multivariate_ndistinct() from falling for this.

The attached aims to do this.  I also couldn't resist a few other improvements.

There are a few strange goings-ons in the code itself that I didn't
adjust. For example, in the first "foreach(lc2, *varinfos)" loop after
the "if (stats)", there's a "found" variable that gets set and used
for no apparent reason. I don't see why the "found = true;" doesn't
just "continue;". The variable would only be needed if there was some
inner loop and we couldn't use "continue".  I also can't make sense of
the following comment:

/*
* XXX Maybe we should allow searching the expressions even if we
* found an attribute matching the expression? That would handle
* trivial expressions like "(a)" but it seems fairly useless.
*/

Maybe it meant "matching the Var"?

The final loop to build the newlist also looks more complex than it
needs to be. The prior loop over *varinfos could have recorded the
matching GroupVarInfos in the list in a Bitmapset and that final loop
could become:

foreach(lc, *varinfos)
{
   if (!bms_is_member(foreach_current_index(lc), matched_varinfos))
        newlist = lappend(newlist, lfirst(lc));
}

David

Attachment

Re: BUG #18885: ERROR: corrupt MVNDistinct entry - 2

From
Andrei Lepikhov
Date:
On 4/12/25 00:30, Alexander Korotkov wrote:
> On Thu, Apr 10, 2025 at 3:28 PM Andrei Lepikhov <lepihov@gmail.com> wrote:
>> ...
>> Here, I attempt to use this routine in the hash join bucket size
>> estimation. I transformed it a little, made it more general. Not sure it
>> is the best design, but it is debatable.
> ...
> SELECT FROM sb_1 LEFT JOIN sb_2 ON (sb_2.x=sb_1.x) AND (sb_1.x=sb_2.x)
> AND (sb_1.y=sb_2.y);
> 
> When you use add_unique_group_var() which keeps varinfos unique then
> you can no longer expect that varinfos have the same order as
> origin_rinfos.
Ok, here is a patch that considers this issue. Now GroupVarInfo tracks 
source RestrictInfo. Not sure it is an ideal approach, but we don't need 
to synchronise the restrictions and corresponding varinfos.

On Fri, 11 Apr 2025 at 01:31, Tomas Vondra <tomas@vondra.me> wrote:
 > I think estimate_multivariate_bucketsize() needs to be more careful
 > about building the GroupVarInfo list - in particular, it needs to do
 > dance with examine_variable + add_unique_group_var + pull_var_clause,
 > similar to estimate_num_groups() at line ~3532.
Yeah, estimate_num_groups and bucket size estimation have a lot in 
common. It would be better to invent some common GroupVarInfo 
preparation/estimation code for them, but specifics of HashJoin bucket 
estimation need mcv_freq and result caching that limits intersection of 
these estimators.

-- 
regards, Andrei Lepikhov
Attachment

Re: BUG #18885: ERROR: corrupt MVNDistinct entry - 2

From
Andrei Lepikhov
Date:
On 4/14/25 01:26, David Rowley wrote:
> just "continue;". The variable would only be needed if there was some
> inner loop and we couldn't use "continue".
+1
> * XXX Maybe we should allow searching the expressions even if we
> * found an attribute matching the expression? That would handle
> * trivial expressions like "(a)" but it seems fairly useless.
> */
> Maybe it meant "matching the Var"?
+1

I have read your modification of comments to 
estimate_multivariate_ndistinct. It suggested the idea of letting the 
GroupVarInfo struct be exported. For now, only add_unique_group_var or 
another internal selfuncs.c routine may build this list.
I'm not sure that the GroupVarInfo abstraction is shaped ideally at the 
moment, but it seems it may perform duties similar to RestrictInfo but 
for grouping keys: store the key, cache data about this key, cache 
statistics and estimations.
It would be the first step in letting modules use 
estimate_multivariate_ndistinct. Likewise, we can use 
statext_clauselist_selectivity and dependencies_clauselist_selectivity.

What do you think about that?

P.S. statext_mcv_clauselist_selectivity deserves to be exported, too.

-- 
regards, Andrei Lepikhov



Re: BUG #18885: ERROR: corrupt MVNDistinct entry - 2

From
Tomas Vondra
Date:

On 4/15/25 11:41, Andrei Lepikhov wrote:
> On 4/14/25 01:26, David Rowley wrote:
>> just "continue;". The variable would only be needed if there was some
>> inner loop and we couldn't use "continue".
> +1
>> * XXX Maybe we should allow searching the expressions even if we
>> * found an attribute matching the expression? That would handle
>> * trivial expressions like "(a)" but it seems fairly useless.
>> */
>> Maybe it meant "matching the Var"?
> +1
> 
> I have read your modification of comments to
> estimate_multivariate_ndistinct. It suggested the idea of letting the
> GroupVarInfo struct be exported. For now, only add_unique_group_var or
> another internal selfuncs.c routine may build this list.
> I'm not sure that the GroupVarInfo abstraction is shaped ideally at the
> moment, but it seems it may perform duties similar to RestrictInfo but
> for grouping keys: store the key, cache data about this key, cache
> statistics and estimations.
> It would be the first step in letting modules use
> estimate_multivariate_ndistinct. Likewise, we can use
> statext_clauselist_selectivity and dependencies_clauselist_selectivity.
> 
> What do you think about that?
> 
> P.S. statext_mcv_clauselist_selectivity deserves to be exported, too.
> 

Not sure we'd want to export GroupVarInfo in the current shape. It was
meant for a short-lived local state (within a single function), which
seems quite different from what RestrictInfo does.

In any case, such rework seems out of scope for PG18 - it's much more
than a fix for the bug.


regards

-- 
Tomas Vondra




Re: BUG #18885: ERROR: corrupt MVNDistinct entry - 2

From
David Rowley
Date:
On Wed, 16 Apr 2025 at 01:16, Tomas Vondra <tomas@vondra.me> wrote:
> In any case, such rework seems out of scope for PG18 - it's much more
> than a fix for the bug.

This is the reason I stopped short of adjusting the code in the patch
I proposed in [1].  Speaking of which, do you have any views on that?
It seems much more likely that someone would have caught the issue
being reported here when writing the patch or during review if there
was some sort of documentation to mention the function can't handle
duplicate GroupVarInfos.

I'll look at that patch again in the next few days with aims to push
it unless someone has something to say about it.

David

[1] https://postgr.es/m/CAApHDvocZCUhM9W9mJ39d6oQz7ePKoqFnao_347mvC-A7QatcQ@mail.gmail.com



Re: BUG #18885: ERROR: corrupt MVNDistinct entry - 2

From
Andrei Lepikhov
Date:
On 4/15/25 15:16, Tomas Vondra wrote:
> On 4/15/25 11:41, Andrei Lepikhov wrote:
>> What do you think about that? 
> Not sure we'd want to export GroupVarInfo in the current shape. It was
> meant for a short-lived local state (within a single function), which
> seems quite different from what RestrictInfo does.
> 
> In any case, such rework seems out of scope for PG18 - it's much more
> than a fix for the bug.
Of course, it is not for PG 18. I'm working on further GROUP-BY 
optimisation where columns are re-arranged in a way to put at the first 
position column with maximum ndistinct estimation. Additional work is 
needed that may be compensated by distinct caching.
One more ongoing project - correcting ndistinct estimation by passing 
through the EquivalenceClass and looking for a minimum ndistinct. It 
also needs a kinda cache. That's why I am pondering how to redefine 
GroupVarInfo to reduce estimation efforts.

-- 
regards, Andrei Lepikhov



Re: BUG #18885: ERROR: corrupt MVNDistinct entry - 2

From
Andrei Lepikhov
Date:
On 4/14/25 16:41, Andrei Lepikhov wrote:
> On 4/12/25 00:30, Alexander Korotkov wrote:
>> When you use add_unique_group_var() which keeps varinfos unique then
>> you can no longer expect that varinfos have the same order as
>> origin_rinfos.
> Ok, here is a patch that considers this issue. Now GroupVarInfo tracks 
> source RestrictInfo. Not sure it is an ideal approach, but we don't need 
> to synchronise the restrictions and corresponding varinfos.
This is the new version of the patch.
I don't like the previous version because it was too invasive. Also, 
add_unique_group_var checks "known equal" keys. It is not applicable in 
the current case, of course, but it still seems suspicious: equality 
expressions may be applied at an upper level of the join tree, and 
equality doesn't exist at this specific place yet. Being correct in the 
case of GROUP-BY operator estimation, such conjecture may distort 
estimation in the middle of the join tree.
With the current patch, we just stay away from that doubtful code.

-- 
regards, Andrei Lepikhov
Attachment