On Tue, Mar 17, 2020 at 12:42:52PM +0000, Dean Rasheed wrote:
>On Sat, 14 Mar 2020 at 18:45, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>>
>> I realized there's one more thing that probably needs discussing.
>> Essentially, these two clause types are the same:
>>
>> a IN (1, 2, 3)
>>
>> (a = 1 OR a = 2 OR a = 3)
>>
>> but with 8f321bd1 we only recognize the first one as compatible with
>> functional dependencies. It was always the case that we estimated those
>> two clauses a bit differently, but the differences were usually small.
>> But now that we recognize IN as compatible with dependencies, the
>> difference may be much larger, which bugs me a bit ...
>>
>> So I wonder if we should recognize the special form of an OR clause,
>> with all Vars referencing to the same attribute etc. and treat this as
>> supported by functional dependencies - the attached patch does that.
>> MCV lists there's already no difference because OR clauses are
>> supported.
>>
>
>Makes sense, and the patch looks straightforward enough.
>
>> The question is whether we want to do this, and whether we should also
>> teach the per-column estimates to recognize this special case of IN
>> clause.
>
>I'm not convinced about that second part though. I'd say that
>recognising the OR clause for functional dependencies is sufficient to
>prevent the large differences in estimates relative to the equivalent
>IN clauses. The small differences between the way that OR and IN
>clauses are handled have always been there, and I think that changing
>that is out of scope for this work.
>
Not sure. I think the inconsistency between plan and extended stats may
be a bit surprising, but I agree that issue may be negligible.
>The other thing that I'm still concerned about is the possibility of
>returning estimates with P(a,b) > P(a) or P(b). I think that such a
>thing becomes much more likely with the new types of clause supported
>here, because they now allow multiple values from each column, where
>before we only allowed one. I took another look at the patch I posted
>on the other thread, and I've convinced myself that it's correct.
>Attached is an updated version, with some cosmetic tidying up and now
>with some additional regression tests.
>
Yeah, I agree that's something we need to fix. Do you plan to push the
fix, or do you want me to do it?
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services