Re: [HACKERS] Performance improvement for joins where outer side is unique - Mailing list pgsql-hackers

From Tom Lane
Subject Re: [HACKERS] Performance improvement for joins where outer side is unique
Date
Msg-id 18580.1491522470@sss.pgh.pa.us
Whole thread Raw
In response to Re: [HACKERS] Performance improvement for joins where outer side is unique  (David Rowley <david.rowley@2ndquadrant.com>)
Responses Re: [HACKERS] Performance improvement for joins where outer side is unique  (David Rowley <david.rowley@2ndquadrant.com>)
List pgsql-hackers
David Rowley <david.rowley@2ndquadrant.com> writes:
> On 7 April 2017 at 07:26, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I'm looking through this, and I'm failing to see where it deals with
>> the problem we discussed last time, namely that you can't apply the
>> optimization unless all clauses that were used in the uniqueness
>> proof are included in the join's merge/hash conditions + joinquals.

> The code in question is:
> mergestate->mj_SkipMarkRestore = !mergestate->js.joinqual &&
> mergestate->js.first_inner_tuple_only;

Uh, AFAICS that only protects the skip-mark-and-restore logic.
What I'm on about is that you can't do the early advance to the
next outer tuple unless you're sure that all the quals that were
relevant to the uniqueness proof have been checked for the current
inner tuple.  That affects all three join types not only merge.

The case that would be relevant to this is, eg,
   create table t1 (f1 int, f2 int, primary key(f1,f2));
   select * from t_outer left join t1 on (t_outer.f1 = t1.f1)   where t_outer.f2 = t2.f2;

Your existing patch would think t1 is unique-inner, but the qual pushed
down from WHERE would not be a joinqual so the wrong thing would happen
at runtime.

(Hm ... actually, this example wouldn't fail as written because
the WHERE qual is probably strict, so the left join would get
reduced to an inner join and then pushed-down-ness no longer
matters.  But hopefully you get my drift.)

> I don't really think the List idea would be nearly as efficient.

No, what I'm saying is that each RelOptInfo would contain a single List of
Relids of proven-unique-for outer rels (and another one for the negative
cache).  No array, no more searching than you have now, just removal of an
uncertainly-safe array fetch.
        regards, tom lane



pgsql-hackers by date:

Previous
From: Beena Emerson
Date:
Subject: Re: [HACKERS] increasing the default WAL segment size
Next
From: David Rowley
Date:
Subject: Re: [HACKERS] Making clausesel.c Smarter