Re: Optimization for updating foreign tables in Postgres FDW - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Optimization for updating foreign tables in Postgres FDW
Date
Msg-id CA+TgmoZko+cWGuyjxKt6rfGTtGnEYUOzkng4O0dzpEeANFgqLg@mail.gmail.com
Whole thread Raw
In response to Optimization for updating foreign tables in Postgres FDW  (Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp>)
List pgsql-hackers
On Tue, Jul 8, 2014 at 3:07 AM, Etsuro Fujita
<fujita.etsuro@lab.ntt.co.jp> wrote:
> Attached is a WIP patch for the following:
>
> /*
>  * postgresPlanForeignModify
>  *      Plan an insert/update/delete operation on a foreign table
>  *
>  * Note: currently, the plan tree generated for UPDATE/DELETE will always
>  * include a ForeignScan that retrieves ctids (using SELECT FOR UPDATE)
>  * and then the ModifyTable node will have to execute individual remote
>  * UPDATE/DELETE commands.  If there are no local conditions or joins
>  * needed, it'd be better to let the scan node do UPDATE/DELETE RETURNING
>  * and then do nothing at ModifyTable.  Room for future optimization ...
>  */
>
> In the patch postgresPlanForeignModify has been modified so that if, in
> addition to the above condition, the followings are satisfied, then the
> ForeignScan and ModifyTable node will work that way.
>
>  - There are no local BEFORE/AFTER triggers.
>  - In UPDATE it's safe to evaluate expressions to assign to the target
> columns on the remote server.
>
> Here is a simple performance test.
>
> On remote side:
> postgres=# create table t (id serial primary key, inserted timestamp
> default clock_timestamp(), data text);
> CREATE TABLE
> postgres=# insert into t(data) select random() from generate_series(0,
> 99999);
> INSERT 0 100000
> postgres=# vacuum t;
> VACUUM
>
> On local side:
> postgres=# create foreign table ft (id integer, inserted timestamp, data
> text) server myserver options (table_name 't');
> CREATE FOREIGN TABLE
>
> Unpatched:
> postgres=# explain analyze verbose delete from ft where id < 10000;
>                                                       QUERY PLAN
>
-----------------------------------------------------------------------------------------------------------------------
>  Delete on public.ft  (cost=100.00..162.32 rows=910 width=6) (actual
> time=1275.255..1275.255 rows=0 loops=1)
>    Remote SQL: DELETE FROM public.t WHERE ctid = $1
>    ->  Foreign Scan on public.ft  (cost=100.00..162.32 rows=910 width=6)
> (actual time=1.180..52.095 rows=9999 loops=1)
>          Output: ctid
>          Remote SQL: SELECT ctid FROM public.t WHERE ((id < 10000)) FOR
> UPDATE
>  Planning time: 0.112 ms
>  Execution time: 1275.733 ms
> (7 rows)
>
> Patched (Note that the DELETE command has been pushed down.):
> postgres=# explain analyze verbose delete from ft where id < 10000;
>                                                     QUERY PLAN
> -------------------------------------------------------------------------------------------------------------------
>  Delete on public.ft  (cost=100.00..162.32 rows=910 width=6) (actual
> time=0.006..0.006 rows=0 loops=1)
>    ->  Foreign Scan on public.ft  (cost=100.00..162.32 rows=910 width=6)
> (actual time=0.001..0.001 rows=0 loops=1)
>          Output: ctid
>          Remote SQL: DELETE FROM public.t WHERE ((id < 10000))
>  Planning time: 0.101 ms
>  Execution time: 8.808 ms
> (6 rows)
>
> I'll add this to the next CF.  Comments are welcome.

I haven't looked at the code, but +1 for the general idea.  The
concept seems good to me, and that's a very large performance
improvement.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: better atomics - v0.5
Next
From: Stephen Frost
Date:
Subject: Re: RLS Design