Re: [HACKERS] PATCH: multivariate histograms and MCV lists - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: [HACKERS] PATCH: multivariate histograms and MCV lists
Date
Msg-id fb20a5d8-032c-0678-10c8-522b695af7dd@2ndquadrant.com
Whole thread Raw
In response to Re: [HACKERS] PATCH: multivariate histograms and MCV lists  (Dean Rasheed <dean.a.rasheed@gmail.com>)
List pgsql-hackers

On 03/27/2018 04:58 PM, Dean Rasheed wrote:
> On 27 March 2018 at 01:36, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>> BTW I think there's a bug in handling the fullmatch flag - it should not
>> be passed to AND/OR subclauses the way it is, because then
>>
>>     WHERE a=1 OR (a=2 AND b=2)
>>
>> will probably set it to 'true' because of (a=2 AND b=2). Which will
>> short-circuit the statext_clauselist_selectivity, forcing it to ignore
>> the non-MCV part.
>>
> 
> I'm not sure that's true. Won't the outer call to
> mcv_update_match_bitmap() overwrite the value of fullmatch returned by
> the nested call, and set fullmatch to false because it has only seen 1
> attribute equality match? I think that's the correct result, but I
> think that's just luck.
> 
> The dubious part is the way fullmatch is calculated for OR clauses --
> I think for an OR clause we want to know the attributes matched in
> *every* subclause, rather than in *any* subclause, as we do for AND.
> So I think the only way an OR clause at the top-level should return a
> full match is if every sub-clause was a full match, for example:
> 
>   WHERE (a=1 AND b=2) OR (a=2 AND b=1)
> 

Yes, that seems like the right behavior.

> But then consider this:
> 
>   WHERE a=1 AND (b=1 OR b=2)
> 
> That should also potentially be a full match, but that can only work
> if mcv_update_match_bitmap() returned the set of matching attributes
> (eqmatches), rather than fullmatch, so that it can be merged
> appropriately in the caller. So for an OR clause, it needs to return
> eqmatches containing the list of attributes for which every sub-clause
> matched with equality against the MCV list, and in an outer AND clause
> that can be added to the outer eqmatches list, which is the list of
> attributes for which any sub-clause matched with equality.
> 

I think it's useful to see it transformed from:

    WHERE a=1 AND (b=1 OR b=2)

to

    WHERE (a=1 AND b=1) OR (a=1 AND b=2)

which is the case already handled above. And yes, tracking columns with
an equality seems reasonable.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


pgsql-hackers by date:

Previous
From: Tomas Vondra
Date:
Subject: Re: [HACKERS] PATCH: multivariate histograms and MCV lists
Next
From: Jesper Pedersen
Date:
Subject: Re: [HACKERS] path toward faster partition pruning