Thread: ERROR: corrupt MVNDistinct entry
I ran into this error in estimate_multivariate_ndistinct, and it can be reproduced with the query below. create table t (a int, b int); insert into t select 1, 1; create statistics s (ndistinct) on a, b from t; analyze; explain select 1 from t t1 left join (select a c1, coalesce(a) c2 from t t2) s on true group by s.c1, s.c2; ERROR: corrupt MVNDistinct entry And the first bad commit is: 2489d76c4906f4461a364ca8ad7e0751ead8aa0d is the first bad commit commit 2489d76c4906f4461a364ca8ad7e0751ead8aa0d Author: Tom Lane <tgl@sss.pgh.pa.us> Date: Mon Jan 30 13:16:20 2023 -0500 Make Vars be outer-join-aware. So in this query, there are two grouping expressions: s.c1 is Var t2.a with nullingrels set to {3}; s.c2 is a PHV with nullingrels also being {3}, and its contained expression is Var t2.a with empty nullingrels. This eventually leads to estimate_num_groups creating two separate GroupVarInfos for Var t2.a: one with nullingrels {3}, and another with empty nullingrels. As a result, estimate_multivariate_ndistinct incorrectly assumes there are two matching expressions. When it later fails to find the exact match for the combination, it mistakenly concludes that there is a corrupt MVNDistinct entry. It seems to me that when estimating the number of groups, we do not need to concern ourselves with the outer joins that could null the Vars/PHVs contained in the grouping expressions, and we should not count the same Var more than once. So I wonder if we can fix this issue by removing the nullingrels within the grouping expressions first in estimate_num_groups. Such as: --- a/src/backend/utils/adt/selfuncs.c +++ b/src/backend/utils/adt/selfuncs.c @@ -121,6 +121,7 @@ #include "parser/parse_clause.h" #include "parser/parse_relation.h" #include "parser/parsetree.h" +#include "rewrite/rewriteManip.h" #include "statistics/statistics.h" #include "storage/bufmgr.h" #include "utils/acl.h" @@ -3446,6 +3447,10 @@ estimate_num_groups(PlannerInfo *root, List *groupExprs, double input_rows, if (groupExprs == NIL || (pgset && *pgset == NIL)) return 1.0; + groupExprs = (List *) remove_nulling_relids((Node *) groupExprs, + root->outer_join_rels, + NULL); + /* * Count groups derived from boolean grouping expressions. For other * expressions, find the unique Vars used, treating an expression as a Var Any thoughts? Thanks Richard
On 12/24/24 15:00, Richard Guo wrote: > Any thoughts? I have a couple of notes. 1. The nulling_relids provides us sensible information about possible nulls inside the input. We are not using it to estimate the number of such nulls for now. Does Your idea consist of obtaining 'clear' statistics and reusing nulling_relids knowledge somewhere later? 2. It is ok for Vars. But what about expressions? We use equal() in distinct, MCV and dependencies modules. Do we need to remove nulls before using extended statistics as a general rule? -- regards, Andrei Lepikhov
On Wed, Dec 25, 2024 at 11:34 AM Andrei Lepikhov <lepihov@gmail.com> wrote: > I have a couple of notes. > 1. The nulling_relids provides us sensible information about possible > nulls inside the input. We are not using it to estimate the number of > such nulls for now. Does Your idea consist of obtaining 'clear' > statistics and reusing nulling_relids knowledge somewhere later? Are you referring to the nullfrac estimates? A RelOptInfo's nulling_relids records all above outer joins that can null this rel. However, I cannot see how it helps with nullfrac. > 2. It is ok for Vars. But what about expressions? We use equal() in > distinct, MCV and dependencies modules. Do we need to remove nulls > before using extended statistics as a general rule? AFAIU, the expressions in extended statistics are not decorated with any nullingrels bits, are they? Thanks Richard
On Wed, Dec 25, 2024 at 5:14 PM Richard Guo <guofenglinux@gmail.com> wrote: > On Wed, Dec 25, 2024 at 11:34 AM Andrei Lepikhov <lepihov@gmail.com> wrote: > > 2. It is ok for Vars. But what about expressions? We use equal() in > > distinct, MCV and dependencies modules. Do we need to remove nulls > > before using extended statistics as a general rule? > > AFAIU, the expressions in extended statistics are not decorated with > any nullingrels bits, are they? I've just realized that there may be other places with similar issues, not just in estimate_num_groups. For instance, -- after v16 explain (costs on) select * from t t1 left join t t2 on true where (t2.a+t2.b) is null; QUERY PLAN -------------------------------------------------------------------- Nested Loop Left Join (cost=0.00..15032.50 rows=5000 width=16) Filter: ((t2.a + t2.b) IS NULL) -> Seq Scan on t t1 (cost=0.00..15.00 rows=1000 width=8) -> Materialize (cost=0.00..20.00 rows=1000 width=8) -> Seq Scan on t t2 (cost=0.00..15.00 rows=1000 width=8) (5 rows) -- before v16 explain (costs on) select * from t t1 left join t t2 on true where (t2.a+t2.b) is null; QUERY PLAN -------------------------------------------------------------------- Nested Loop Left Join (cost=0.00..15032.50 rows=1 width=16) Filter: ((t2.a + t2.b) IS NULL) -> Seq Scan on t t1 (cost=0.00..15.00 rows=1000 width=8) -> Materialize (cost=0.00..20.00 rows=1000 width=8) -> Seq Scan on t t2 (cost=0.00..15.00 rows=1000 width=8) (5 rows) In v16 and later, the nullingrels within the expression "t2.a + t2.b" prevent it from being matched to the corresponding expression in extended statistics, forcing us to use DEFAULT_UNK_SEL(0.005). It seems that we need to strip out the nullingrels bits from expressions before matching them to extended statistics or expressional index columns in more places. Thanks Richard
On 25/12/2024 16:36, Richard Guo wrote: > On Wed, Dec 25, 2024 at 5:14 PM Richard Guo <guofenglinux@gmail.com> wrote: >> On Wed, Dec 25, 2024 at 11:34 AM Andrei Lepikhov <lepihov@gmail.com> wrote: >>> 2. It is ok for Vars. But what about expressions? We use equal() in >>> distinct, MCV and dependencies modules. Do we need to remove nulls >>> before using extended statistics as a general rule? >> >> AFAIU, the expressions in extended statistics are not decorated with >> any nullingrels bits, are they? > > I've just realized that there may be other places with similar issues, > not just in estimate_num_groups. For instance, I'm pleased to see that you've grasped my initially unclear idea. Yeah, it seems that all types of statistics may be lost because of varnullingrels. > In v16 and later, the nullingrels within the expression "t2.a + t2.b" > prevent it from being matched to the corresponding expression in > extended statistics, forcing us to use DEFAULT_UNK_SEL(0.005). > > It seems that we need to strip out the nullingrels bits from > expressions before matching them to extended statistics or > expressional index columns in more places. I think Tomas Vondra may have a decisive opinion in this place: we have already discussed some approaches to calculate NULLs generated by RHS of Left Join shortly. Maybe we can commit a quick cure like the one provided in your patch, but we should remember this example - it is not apparent to me how to estimate a group of clauses in the case when part of Vars has varnullingrels and part of them - doesn't. Also, I think this is a good example that an explain analyse summary could have some sort of extended statistics usage report. It can help to clearly identify cases when extended statistics don't work, but should. - something like already implemented in SQL Server. -- regards, Andrei Lepikhov