Home > mailing lists

Re: Removing unneeded self joins - Mailing list pgsql-hackers

From	Alexander Kuzmenkov
Subject	Re: Removing unneeded self joins
Date	March 25, 2019 16:45:02
Msg-id	89c8a17d-01f8-a03d-7fb8-0a8a070be47c@postgrespro.ru Whole thread Raw
In response to	Re: Removing unneeded self joins (David Rowley <david.rowley@2ndquadrant.com>)
Responses	Re: Removing unneeded self joins
List	pgsql-hackers

Tree view

On 3/25/19 07:07, David Rowley wrote:
> You had commented the test with:
>
> -- If index conditions are different for each side, we won't select the same
> -- row on both sides, so the join can't be removed.
>
> but I don't quite understand why we can't remove the join in this
> situation.

My rationale was that we're not exactly removing the join, but replacing 
it with a scan. So it is not enough to just have a single row on each 
side, it must be the same physical row. In this example, before the 
transformation, t1.b is not equal to t2.b, but they become equal 
afterwards. This looks somewhat wrong. On the other hand, if the 
conditions are different, the resulting condition is going to evaluate 
to constant false and we won't get any rows, so maybe it's OK.

This brings me again to the question of what are the conditions for join 
removal. If the formulation with indexes is not general enough, what do 
we use instead? I guess it could be something like this:

1. Given the (btree equality) join and restriction clauses, both sides 
are unique on the same set of columns. That is, if we fix the values of 
these columns, both sides have at most one matching row.

     a. For each of these columns, we have either

         i) a join clause that equates some expression referencing the 
outer column to the same expression referencing the same inner column.

         ii) a clause for each relation that equates the same expression 
referencing the outer and inner column to some other arbitrary 
expression, possibly a different one for each side. This expression may 
be a Const or some expression that references a Var of some third relation.

2. All the resulting columns can be calculated using either side of the 
join. For now, just require that both sides are base relations that 
refer to the same physical relation.

Two points are not clear to me here:

1. We don't handle join clauses to third relations, but can they be 
treated the same way we treat Consts?

2. Can we simplify the join when we don't have any join clauses and only 
have Consts? Or should we have at least one join clause that equates the 
same inner and outer column? Why is one join clause enough?

-- 
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

pgsql-hackers by date:

From: Robert Haas
Date: 25 March 2019, 16:28:57
Subject: Re: Protect syscache from bloating with negative cache entries

From: Tom Lane
Date: 25 March 2019, 16:53:05
Subject: Re: Feature: Add Greek language fulltext search

Re: Removing unneeded self joins - Mailing list pgsql-hackers

Previous

Next