Re: New hashed IN code ignores distinctiveness of subquery - Mailing list pgsql-bugs

From Tom Lane
Subject Re: New hashed IN code ignores distinctiveness of subquery
Date
Msg-id 19407.1043641111@sss.pgh.pa.us
Whole thread Raw
In response to Re: New hashed IN code ignores distinctiveness of subquery  (Bradley Baetz <bbaetz@acm.org>)
Responses Re: New hashed IN code ignores distinctiveness of subquery  (Bradley Baetz <bbaetz@acm.org>)
List pgsql-bugs
Bradley Baetz <bbaetz@acm.org> writes:
> On Sun, Jan 26, 2003 at 09:43:18PM -0500, Tom Lane wrote:
>> We're already checking that as a separate plan alternative.  The
>> implementation could be improved a little, though --- we could combine
>> the uniq-ification into the Hash node.

> Right, or skip it entirely when selecting stuff with unique constraints.

I'm hesitant to do that until we have some scheme in place for
invalidating cached plans.  Right now, if you drop an index that is used
by some cached plan, you'll hear about it next time the plan is used.
But if the plan doesn't directly use the index, yet depends on its
existence to be correct, you'll get no error and subtly(?) wrong
answers.  I don't mind depending on such assumptions in estimating
costs, but I do mind depending on them for correct answers.

> I don't think it is. The number of rows is correct if you do product_id
> IN (1) vs product_id IN (1,2) vs product_id IN (1,2,3) and so on.

But that's a completely different code path; it doesn't even enter the
routines we're concerned about here.

> cost_hashjoin
> probably needs to be taught about the short circuiting done for _IN,

Yeah, I think so.  Stepping through cost_hashjoin shows that the major
component of the inflated cost is coming from the per-tuple CPU costs,
which are inflated because we're not allowing for the IN shortcircuit.

> What is the point of JOIN_UNIQUE_{INNER,OUTER}, though? What does it do
> that JOIN_IN doesn't?

Uniqify the inner/outer path and then do a normal inner join.  See
joinpath.c.

> executor/* doesn't appear to use it.

No; the executor never sees JOIN_REVERSE_IN, either.

            regards, tom lane

pgsql-bugs by date:

Previous
From: Bradley Baetz
Date:
Subject: Re: New hashed IN code ignores distinctiveness of subquery
Next
From: Bradley Baetz
Date:
Subject: Re: New hashed IN code ignores distinctiveness of subquery