Re: Removing unneeded self joins - Mailing list pgsql-hackers

From Greg Stark
Subject Re: Removing unneeded self joins
Date
Msg-id CAM-w4HP9piqhdTLzmPGBLVxKvkRxdFYNchRK3+4Tkn_Rx9p1hQ@mail.gmail.com
Whole thread Raw
In response to Re: Removing unneeded self joins  (Ronan Dunklau <ronan.dunklau@aiven.io>)
Responses Re: Removing unneeded self joins  (Andrey Lepikhov <a.lepikhov@postgrespro.ru>)
List pgsql-hackers
On Thu, 1 Jul 2021 at 02:38, Ronan Dunklau <ronan.dunklau@aiven.io> wrote:
>
> Well in some cases they can't, when the query is not emitting redundant
> predicates by itself but they are added by something else like a view or a RLS
> policy.
> Maybe it would be worth it to allow spending a bit more time planning for
> those cases ?

Yeah, I'm generally in favour of doing more work in the optimizer to
save query authors work writing queries.

My question is whether it handles cases like:

select b.x,c.y
  from t
   join t2 as b on (b.id = t.id)
      join t2 as c on (c.id = t.id)

That is, if you join against the same table twice on the same qual.
Does the EC mechanism turn this into a qual on b.id = c.id and then
turn this into a self-join that can be removed?

That's the usual pattern I've seen this arise. Not so much that people
write self joins explicitly but that they add a join to check some
column but that is happening in some isolated piece of code that
doesn't know that that join is already in the query. You can easily
end up with a lot of joins against the same table this way.

It's not far different from the old chestnut

select (select x from t2 where id = t.id) as x,
       (select y from t2 where id = t.id) as y
  from t

which is actually pretty hard to avoid sometimes.

-- 
greg



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Patch: Code comments: why some text-handling functions are leakproof
Next
From: Chapman Flack
Date:
Subject: Re: real/float example for testlibpq3