Re: Add semi-join pushdown to postgres_fdw - Mailing list pgsql-hackers

From Alexander Korotkov
Subject Re: Add semi-join pushdown to postgres_fdw
Date
Msg-id CAPpHfdtnBO87ryw3cZ148y3XAjjTYmS+a0Nin9qTY9guRiU_Uw@mail.gmail.com
Whole thread Raw
In response to Re: Add semi-join pushdown to postgres_fdw  (Pavel Luzanov <p.luzanov@postgrespro.ru>)
List pgsql-hackers
On Fri, Feb 9, 2024 at 10:08 PM Pavel Luzanov <p.luzanov@postgrespro.ru> wrote:
> While playing with this feature I found the following.
>
> Two foreign tables:
> postgres@demo_postgres_fdw(17.0)=# \det aircrafts|seats
>       List of foreign tables
>  Schema |   Table   |   Server
> --------+-----------+-------------
>  public | aircrafts | demo_server
>  public | seats     | demo_server
> (2 rows)
>
>
> This query uses optimization:
>
> postgres@demo_postgres_fdw(17.0)=# EXPLAIN (costs off, verbose) SELECT *
> FROM aircrafts a
> WHERE a.aircraft_code = '320' AND EXISTS (
>   SELECT * FROM seats s WHERE s.aircraft_code = a.aircraft_code
> );
>                                                                                                  QUERY PLAN
                                                                           > 
>
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------->
>  Foreign Scan
>    Output: a.aircraft_code, a.model, a.range
>    Relations: (public.aircrafts a) SEMI JOIN (public.seats s)
>    Remote SQL: SELECT r1.aircraft_code, r1.model, r1.range FROM bookings.aircrafts r1 WHERE ((r1.aircraft_code =
'320'))AND EXISTS (SELECT NULL FROM bookings.seats r2 WHERE ((r2.aircraft_code => 
> (4 rows)
>
>
> But optimization not used for NOT EXISTS:
>
> postgres@demo_postgres_fdw(17.0)=# EXPLAIN (costs off, verbose) SELECT *
> FROM aircrafts a
> WHERE a.aircraft_code = '320' AND NOT EXISTS (
>   SELECT * FROM seats s WHERE s.aircraft_code = a.aircraft_code
> );
>                                                    QUERY PLAN
> ----------------------------------------------------------------------------------------------------------------
>  Nested Loop Anti Join
>    Output: a.aircraft_code, a.model, a.range
>    ->  Foreign Scan on public.aircrafts a
>          Output: a.aircraft_code, a.model, a.range
>          Remote SQL: SELECT aircraft_code, model, range FROM bookings.aircrafts WHERE ((aircraft_code = '320'))
>    ->  Materialize
>          Output: s.aircraft_code
>          ->  Foreign Scan on public.seats s
>                Output: s.aircraft_code
>                Remote SQL: SELECT aircraft_code FROM bookings.seats WHERE ((aircraft_code = '320'))
> (10 rows)
>
> Also, optimization not used after deleting first condition (a.aircraft_code = '320'):
>
> postgres@demo_postgres_fdw(17.0)=# EXPLAIN (costs off, verbose) SELECT *
> FROM aircrafts a
> WHERE EXISTS (
>   SELECT * FROM seats s WHERE s.aircraft_code = a.aircraft_code
> );
>                                    QUERY PLAN
> --------------------------------------------------------------------------------
>  Hash Join
>    Output: a.aircraft_code, a.model, a.range
>    Inner Unique: true
>    Hash Cond: (a.aircraft_code = s.aircraft_code)
>    ->  Foreign Scan on public.aircrafts a
>          Output: a.aircraft_code, a.model, a.range
>          Remote SQL: SELECT aircraft_code, model, range FROM bookings.aircrafts
>    ->  Hash
>          Output: s.aircraft_code
>          ->  HashAggregate
>                Output: s.aircraft_code
>                Group Key: s.aircraft_code
>                ->  Foreign Scan on public.seats s
>                      Output: s.aircraft_code
>                      Remote SQL: SELECT aircraft_code FROM bookings.seats
> (15 rows)
>
>
> But the worst thing is that replacing AND with OR causes breaking session and server restart:
>
> postgres@demo_postgres_fdw(17.0)=# EXPLAIN (costs off, verbose) SELECT *
> FROM aircrafts a
> WHERE a.aircraft_code = '320' OR EXISTS (
>   SELECT * FROM seats s WHERE s.aircraft_code = a.aircraft_code
> );
> server closed the connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.
> The connection to the server was lost. Attempting reset: Failed.
> The connection to the server was lost. Attempting reset: Failed.

Thank you, Pavel. I'm looking into this.

------
Regards,
Alexander Korotkov



pgsql-hackers by date:

Previous
From: Nathan Bossart
Date:
Subject: Re: glibc qsort() vulnerability
Next
From: Dave Cramer
Date:
Subject: Re: [PATCH] Add native windows on arm64 support