Home > mailing lists

Re: [HACKERS] Performance improvement for joins where outer side is unique - Mailing list pgsql-hackers

From	Tom Lane
Subject	Re: [HACKERS] Performance improvement for joins where outer side is unique
Date	January 27, 2017 19:44:20
Msg-id	1545.1485535460@sss.pgh.pa.us Whole thread Raw
In response to	Re: [HACKERS] Performance improvement for joins where outer side is unique (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: [HACKERS] Performance improvement for joins where outer side is unique Re: [HACKERS] Performance improvement for joins where outer side is unique
List	pgsql-hackers

Tree view

I wrote:
> David Rowley <david.rowley@2ndquadrant.com> writes:
>> hmm. I'm having trouble understanding why this is a problem for Unique
>> joins, but not for join removal?

> Ah, you know what, that's just mistaken.  I was thinking that we
> short-circuited the join on the strength of the hash (or merge) quals
> only, but actually we check all the joinquals first.  As long as the
> uniqueness proof uses only joinquals and not conditions that will end up
> as otherquals, it's fine.

Actually, after thinking about that some more, it seems to me that there
is a performance (not correctness) issue here: suppose that we have
something like
select ... from t1 left join t2 on t1.x = t2.x and t1.y < t2.y

If there's a unique index on t2.x, we'll be able to mark the join
inner-unique.  However, short-circuiting would only occur after
finding a row that passes both joinquals.  If the y condition is
true for only a few rows, this would pretty nearly disable the
optimization.  Ideally we would short-circuit after testing the x
condition only, but there's no provision for that.

This might not be a huge problem for outer joins.  My sense of typical
SQL style is that the joinquals (ON conditions) are likely to be
exactly what you need to prove inner uniqueness, while random other
conditions will be pushed-down from WHERE and hence will be otherquals.
But I'm afraid it is quite a big deal for inner joins, where we dump
all available conditions into the joinquals.  We might need to rethink
that choice.

At least for merge and hash joins, it's tempting to think about a
short-circuit test being made after testing just the merge/hash quals.
But we'd have to prove uniqueness using only the merge/hash quals,
so the planning cost might be unacceptably high --- particularly for
merge joins which often don't use all available mergeable quals.
In the end I think we probably want to keep the short-circuit in the
same place where it is for SEMI/ANTI cases (which have to have it
exactly there for semantic correctness).

I'm afraid though that we may have to do something about the
irrelevant-joinquals issue in order for this to be of much real-world
use for inner joins.
        regards, tom lane

pgsql-hackers by date:

From: Stephen Frost
Date: 27 January 2017, 19:34:54
Subject: Re: [HACKERS] pg_ls_dir & friends still have a hard-coded superusercheck

From: Stephen Frost
Date: 27 January 2017, 19:49:33
Subject: Re: [HACKERS] pg_ls_dir & friends still have a hard-coded superusercheck

Re: [HACKERS] Performance improvement for joins where outer side is unique - Mailing list pgsql-hackers

Previous

Next