Re: Removing unneeded self joins - Mailing list pgsql-hackers

From Andrei Lepikhov
Subject Re: Removing unneeded self joins
Date
Msg-id b2f522c6-c150-4f1f-a722-f70e9a75fe13@postgrespro.ru
Whole thread Raw
In response to Re: Removing unneeded self joins  (Alena Rybakina <lena.ribackina@yandex.ru>)
List pgsql-hackers
On 11/10/2023 02:29, Alena Rybakina wrote:
> I have reviewed your patch and I noticed a few things.

Thanks for your efforts,

> I have looked at the latest version of the code, I assume that the error 
> lies in the replace_varno_walker function, especially in the place where 
> we check the node by type Var, and does not form any NullTest node.

It's not a bug, it's an optimization we discussed with Alexander above.

> Secondly, I added some code in some places to catch erroneous cases and 
> added a condition when we should not try to apply the self-join-removal 
> transformation due to the absence of an empty self-join list after 
> searching for it and in general if there are no joins in the query. 
> Besides, I added a query for testing and wrote about it above. I have 
> attached my diff file.
Ok, I will look at this
> In addition, I found a comment for myself that was not clear to me. I 
> would be glad if you could explain it to me.
> 
> You mentioned superior outer join in the comment, unfortunately, I 
> didn't find anything about it in the PostgreSQL code, and this 
> explanation remained unclear to me. Could you explain in more detail 
> what you meant?
I meant here that only clauses pushed by reconsider_outer_join_clauses() 
can be found in the joininfo list, and they are not relevant, as you can 
understand.
Having written that, I realized that it was a false statement. ;) - 
joininfo can also contain results of previous SJE iterations, look:

CREATE TABLE test (oid int PRIMARY KEY);
CREATE UNIQUE INDEX ON test((oid*oid));
explain
SELECT count(*)
FROM test c1, test c2, test c3
WHERE c1.oid=c2.oid AND c1.oid*c2.oid=c3.oid*c3.oid;
explain
SELECT count(*)
FROM test c1, test c2, test c3
WHERE c1.oid=c3.oid AND c1.oid*c3.oid=c2.oid*c2.oid;
explain
SELECT count(*)
FROM test c1, test c2, test c3
WHERE c3.oid=c2.oid AND c3.oid*c2.oid=c1.oid*c1.oid;

Having executed this SQL code, you could see that in the last query, the 
SJE feature didn't delete one of the JOINs because of the reason I had 
written above.
It's not an one-minute fix - I will try to propose solution a bit later.

-- 
regards,
Andrey Lepikhov
Postgres Professional




pgsql-hackers by date:

Previous
From: Peter Geoghegan
Date:
Subject: Re: interval_ops shall stop using btequalimage (deduplication)
Next
From: Michael Paquier
Date:
Subject: Re: stopgap fix for signal handling during restore_command