Re: [HACKERS] PATCH: multivariate histograms and MCV lists - Mailing list pgsql-hackers

From David Rowley
Subject Re: [HACKERS] PATCH: multivariate histograms and MCV lists
Date
Msg-id CAKJS1f-05u7vQY=aCYv2Ffs9a23wt2hwUPvqnofT6kxVLJAAfw@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] PATCH: multivariate histograms and MCV lists  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List pgsql-hackers
On Thu, 17 Jan 2019 at 14:19, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>  > 12.  Should we be referencing the source from the docs?
>  >
>  > See <function>mcv_clauselist_selectivity</function>
>  >      in <filename>src/backend/statistics/mcv.c</filename> for details.
>  >
>  > hmm. I see it's not the first going by: git grep -E "\w+\.c\<"
>  > gt
> Hmm, that does not return anything to me - do you actually see any
> references to .c files in the sgml docs? I agree that probably is not a
> good idea, so I'll remove that.

Yeah, I see quite a few. I shouldn't have escaped the <

>  > 18. In dependencies_clauselist_selectivity() there seem to be a new
>  > bug introduced.  We do:
>  >
>  > /* mark this one as done, so we don't touch it again. */
>  > *estimatedclauses = bms_add_member(*estimatedclauses, listidx);
>  >
>  > but the bms_is_member() check that skipped these has been removed.
>  >
>  > It might be easier to document if we just always do:
>  >
>  >    if (bms_is_member(listidx, *estimatedclauses))
>  > continue;
>  >
>  > at the start of both loops. list_attnums can just be left unset for
>  > the originally already estimatedclauses.
>  >
> It's probably not as clear as it should be, but if the clause is already
> estimated (or incompatible), then the list_attnums[] entry will be
> InvalidAttrNumber. Which is what we check in the second loop.

hmm. what about the items that should be skipped when you do the
*estimatedclauses = bms_add_member(*estimatedclauses, listidx); in the
2nd loop. You'll need to either also do list_attnums[listidx] =
InvalidAttrNumber; for them, or put back the bms_is_member() check,
no?  I admit to not having debugged it to find an actual bug, it just
looks suspicious.

>  > 25. Does statext_is_compatible_clause_internal)_ need to skip over
> RelabelTypes?
>  >
> I believe it does, based on what I've observed during development. Why
> do you think it's not necessary?

The other way around. I thought it was necessary, but the code does not do it.

>  > 26. In statext_is_compatible_clause_internal() you mention: /* We only
>  > support plain Vars for now */, but I see nothing that ensures that
>  > only Vars are allowed in the is_opclause() condition.
>  >
>  > /* see if it actually has the right */
>  > ok = (NumRelids((Node *) expr) == 1) &&
>  > (is_pseudo_constant_clause(lsecond(expr->args)) ||
>  > (varonleft = false,
>  >    is_pseudo_constant_clause(linitial(expr->args))));
>  >
>  > the above would allow var+var == const through.
>  >
> But then we call statext_is_compatible_clause_internal on it again, and
> that only allows Vars and "Var op Const" expressions. Maybe there's a
> way around that?

True, I missed that. Drop that one.

>  > 33. "ndimentions"? There's no field in the struct by that name. I'd
>  > assume it's the same size as the isnull array above it?
>  >
>  > Datum    *values; /* variable-length (ndimensions) */
>  >
> Yes, that's the case.

If it relates to the ndimensions field from the struct below, maybe
it's worth crafting that into the comment somehow.


-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


pgsql-hackers by date:

Previous
From: Tomas Vondra
Date:
Subject: Re: [HACKERS] PATCH: multivariate histograms and MCV lists
Next
From: Andrew Gierth
Date:
Subject: Re: draft patch for strtof()