Re: Join optimization - Mailing list pgsql-general

From David Rowley
Subject Re: Join optimization
Date
Msg-id CAApHDvrT+ziY_Gec3=_9jnQi2xYC9XPbaRZkUw-nagiosbVtdA@mail.gmail.com
Whole thread Raw
In response to Re: Join optimization  (luis.roberto@siscobra.com.br)
Responses Re: Join optimization
List pgsql-general
On Sun, 12 Jul 2020 at 06:59, <luis.roberto@siscobra.com.br> wrote:
>
> I'm sorry for the bad example.
>
> Here is another, with some data on PG:
https://dbfiddle.uk/?rdbms=postgres_13&fiddle=ccfd1c4fa291e74a6db9db1772e2b5ac and Oracle:
https://dbfiddle.uk/?rdbms=oracle_18&fiddle=21a98f499065ad4e2c35ff4bd1487e14.

I believe what you're talking about is join removals.  It appears as
though Oracle is able to remove the inner join to the users table as
the join only serves to check the user record exists. No columns are
being selected.  The record must exist due to the foreign key
referencing users.

PostgreSQL currently can only remove left joins. Likely what you could
do here is just change your inner join into a left join. If you're
happy enough that the user record will always exist then that should
allow it to work.

The reason PostgreSQL does not currently support inner join is that by
default, foreign key constraints are only triggered at the end of the
query, (or if deferred, at the end of the transaction). WIth
PostgreSQL, it's possible for a SELECT query to see a violated foreign
key constraint.  This can happen if your SELECT query calls a function
which updates a referenced record.  The cascade of the foreign key
won't occur until the end of the statement, so the select may stumble
upon a violated foreign key.

Here's a quick example of this case:
drop table t1,t2;
create table t1 (a int primary key);
create table t2 (a int references t1 on update cascade);

insert into t1 values(1),(2);
insert into t2 values(2),(2);
create or replace function update_t1 (p_a int) returns int as $$ begin
update t1 set a = a + 1 where a = p_a; return p_a; end; $$ language
plpgsql volatile;

-- in theory, this should never return any rows as we're asking for
all rows that
-- don't exist in the referenced table. However, we do get a result
since the function
-- call updates t1 setting the row with a=2 to a=3. The cascade to t2
does not occur
-- until the end of the statement.
select update_t1(a+1),t1.a from t1 where not exists(select 1 from t2
where t1.a=t2.a);
 update_t1 | a
-----------+---
         2 | 1
(1 row)

If you're happy that you don't have any functions like that which
could momentarily cause the foreign key to appear violated, then there
shouldn't be any harm in changing the INNER JOIN on users to a LEFT
JOIN.  PostgreSQL will be able to remove the join in that case.

David



pgsql-general by date:

Previous
From: luis.roberto@siscobra.com.br
Date:
Subject: Re: Join optimization
Next
From: Rita
Date:
Subject: Re: Listen/Notify feedback