Re: Removing unneeded self joins - Mailing list pgsql-hackers

From Alexander Kuzmenkov
Subject Re: Removing unneeded self joins
Date
Msg-id 552e481b-2feb-75fd-4e9f-4199bfd1c1f3@postgrespro.ru
Whole thread Raw
In response to Re: Removing unneeded self joins  (David Rowley <david.rowley@2ndquadrant.com>)
Responses Re: Removing unneeded self joins
List pgsql-hackers
On 3/14/19 14:21, David Rowley wrote:

> What do you think?


Let's recap the conditions when we can remove a self-join. It is when 
for each outer row, 1) at most one inner row matches the join clauses, 
and 2) it is the same row as the outer one. I'm not sure what (2) means 
precisely in a general case, but for a plain table, we can identify 
these rows by ctid. So when both sides have the same unique index with 
the same clauses, we conclude that we are always dealing with the same 
row (as identified by ctid) on both sides, hence the join can be 
replaced with a scan.

The code I wrote just checks for the above conditions. The data we need 
for these checks is a byproduct of checking the relations for 
uniqueness, which we do anyway, so we just cache it for a negligible cost.

I didn't write it in a more generic way because I don't understand the 
conditions for generic case. In your DISTINCT example, the join can be 
removed indeed. But if we select some columns from the inner side apart 
from the join ones, we can't remove the join anymore:

select * from t1, (select distinct on (a) a, b from t1) tt where t1.a = 
tt.a;

I think this might be a different kind of optimization, where we remove 
the self-join if the inner side is unique, and no inner columns are 
selected besides the join ones.


Also, reading your letter I realized that I don't commute the index 
clauses correctly before comparing them in is_unique_self_join, so I 
fixed this in the new version of the patch.


-- 
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Attachment

pgsql-hackers by date:

Previous
From: David Rowley
Date:
Subject: Re: Should we add GUCs to allow partition pruning to be disabled?
Next
From: David Steele
Date:
Subject: Re: Add exclusive backup deprecation notes to documentation