Re: PoC/WIP: Extended statistics on expressions - Mailing list pgsql-hackers

From Dean Rasheed
Subject Re: PoC/WIP: Extended statistics on expressions
Date
Msg-id CAEZATCW2WHfsrbDLzwMvD+kOsBemLhvKwoHGhzR7STtac74XTg@mail.gmail.com
Whole thread Raw
In response to Re: PoC/WIP: Extended statistics on expressions  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses Re: PoC/WIP: Extended statistics on expressions  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
List pgsql-hackers
On Wed, 24 Mar 2021 at 14:48, Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
>
> AFAIK the primary issue here is that the two places disagree. While
> estimate_num_groups does this
>
>     varnos = pull_varnos(root, (Node *) varshere);
>     if (bms_membership(varnos) == BMS_SINGLETON)
>     { ... }
>
> the add_unique_group_expr does this
>
>     varnos = pull_varnos(root, (Node *) groupexpr);
>
> That is, one looks at the group expression, while the other look at vars
> extracted from it by pull_var_clause(). Apparently for PlaceHolderVar
> this can differ, causing the crash.
>
> So we need to change one of those places - my fix tweaked the second
> place to also look at the vars, but maybe we should change the other
> place? Or maybe it's not the right fix for PlaceHolderVars ...
>

I think that it doesn't make any difference which place is changed.

This is a case of an expression with no stats. With your change,
you'll get a single GroupExprInfo containing a list of
VariableStatData's for each of it's Var's, whereas if you changed it
the other way, you'd get a separate GroupExprInfo for each Var. But I
think they'd both end up being treated the same by
estimate_multivariate_ndistinct(), since there wouldn't be any stats
matching the expression, only the individual Var's. Maybe changing the
first place would be the more bulletproof fix though.

Regards,
Dean



pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: [HACKERS] Custom compression methods
Next
From: Bernd Helmle
Date:
Subject: Re: postgres_fdw: IMPORT FOREIGN SCHEMA ... LIMIT TO (partition)