Re: [HACKERS] <> join selectivity estimate question - Mailing list pgsql-hackers

From Tom Lane
Subject Re: [HACKERS] <> join selectivity estimate question
Date
Msg-id 25906.1489777890@sss.pgh.pa.us
Whole thread Raw
In response to Re: [HACKERS] <> join selectivity estimate question  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Robert Haas <robertmhaas@gmail.com> writes:
> On Fri, Mar 17, 2017 at 1:14 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> It would not be too hard to convince me that neqjoinsel should
>> simply return 1.0 for any semijoin/antijoin case, perhaps with
>> some kind of discount for nullfrac.  Whether or not there's an
>> equal row, there's almost always going to be non-equal row(s).
>> Maybe we can think of a better implementation but that seems
>> like the zero-order approximation.

> Yeah, it's not obvious how to do better than that considering only one
> clause at a time.  Of course, what we really want to know is
> P(x<>y|z=t), but don't ask me how to compute that.

Yeah.  Another hole in this solution is that it means that the
estimate for x <> y will be quite different from the estimate
for NOT(x = y).  You wouldn't notice it in the field unless
somebody forgot to put a negator link on their equality operator,
but it seems like ideally we'd think of a solution that made sense
for generic NOT in this context.

No, I have no idea how to do that.
        regards, tom lane



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: [HACKERS] [COMMITTERS] pgsql: Use asynchronous connect API in libpqwalreceiver
Next
From: Peter Eisentraut
Date:
Subject: Re: [HACKERS] pageinspect and hash indexes