Re: Optimization for updating foreign tables in Postgres FDW - Mailing list pgsql-hackers

From Rushabh Lathia
Subject Re: Optimization for updating foreign tables in Postgres FDW
Date
Msg-id CAGPqQf0GLpwFnCGKrxzJ3OKZdUqF+T9qi_-XHyN2uy2Htou97w@mail.gmail.com
Whole thread Raw
In response to Re: Optimization for updating foreign tables in Postgres FDW  (Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp>)
Responses Re: Optimization for updating foreign tables in Postgres FDW  (Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp>)
List pgsql-hackers
I started looking at updated patch and its definitely iked the new approach.

Patch passing the mandatory checks:

1) Patch applied cleanly using patch -p1 command
2) Source got compiled cleanly
3) make check, running cleanly


Explain output for the remote table and inheritance relations looks better
then earlier version of patch.

-- Inheritance foreign relation
postgres=# explain (ANALYZE, VERBOSE) update fp set a = 20;
                                                      QUERY PLAN                                                      
-----------------------------------------------------------------------------------------------------------------------
 Update on public.fp  (cost=100.00..767.60 rows=10920 width=6) (actual time=1.000..1.000 rows=0 loops=1)
   Foreign Update on public.fp
   Foreign Update on public.fc1
   Foreign Update on public.fc2
   Foreign Update on public.fc3
   ->  Foreign Update on public.fp  (cost=100.00..191.90 rows=2730 width=6) (actual time=0.493..0.493 rows=0 loops=1)
         Remote SQL: UPDATE public.p SET a = 20
   ->  Foreign Update on public.fc1  (cost=100.00..191.90 rows=2730 width=6) (actual time=0.177..0.177 rows=0 loops=1)
         Remote SQL: UPDATE public.c1 SET a = 20
   ->  Foreign Update on public.fc2  (cost=100.00..191.90 rows=2730 width=6) (actual time=0.163..0.163 rows=0 loops=1)
         Remote SQL: UPDATE public.c2 SET a = 20
   ->  Foreign Update on public.fc3  (cost=100.00..191.90 rows=2730 width=6) (actual time=0.158..0.158 rows=0 loops=1)
         Remote SQL: UPDATE public.c3 SET a = 20
 Planning time: 0.228 ms
 Execution time: 1.359 ms
(15 rows)

-- Foreign table update
postgres=# explain (ANALYZE, VERBOSE) update ft1 set c8 = 'updated'  where c1 = '200';
                                                      QUERY PLAN                                                     
----------------------------------------------------------------------------------------------------------------------
 Update on public.ft1  (cost=100.00..116.13 rows=2 width=144) (actual time=0.485..0.485 rows=0 loops=1)
   ->  Foreign Update on public.ft1  (cost=100.00..116.13 rows=2 width=144) (actual time=0.483..0.483 rows=0 loops=1)
         Remote SQL: UPDATE public.t1 SET c8 = 'updated   '::character(10) WHERE ((c1 = 200))
 Planning time: 0.158 ms
 Execution time: 0.786 ms
(5 rows)


-- Explain output for the returning clause:
postgres=# explain (ANALYZE, VERBOSE) update ft1 set c8 = 'updated'  where c1 = '200' returning c2;
                                                      QUERY PLAN                                                     
----------------------------------------------------------------------------------------------------------------------
 Update on public.ft1  (cost=100.00..116.13 rows=2 width=144) (actual time=0.516..0.516 rows=0 loops=1)
   Output: c2
   ->  Foreign Update on public.ft1  (cost=100.00..116.13 rows=2 width=144) (actual time=0.514..0.514 rows=0 loops=1)
         Remote SQL: UPDATE public.t1 SET c8 = 'updated   '::character(10) WHERE ((c1 = 200)) RETURNING c2
 Planning time: 0.172 ms
 Execution time: 0.938 ms
(6 rows)


-- Explain output when returning clause is not pushdown safe:
postgres=# explain (ANALYZE, VERBOSE) update ft1 set c8 = 'updated'  where c1 = '200' returning local_func(20);
                                                      QUERY PLAN                                                     
----------------------------------------------------------------------------------------------------------------------
 Update on public.ft1  (cost=100.00..116.13 rows=2 width=144) (actual time=0.364..0.364 rows=0 loops=1)
   Output: local_func(20)
   ->  Foreign Update on public.ft1  (cost=100.00..116.13 rows=2 width=144) (actual time=0.363..0.363 rows=0 loops=1)
         Remote SQL: UPDATE public.t1 SET c8 = 'updated   '::character(10) WHERE ((c1 = 200))
 Planning time: 0.142 ms
 Execution time: 0.623 ms
(6 rows)

-- Explain output with PREPARE:
postgres=# explain (ANALYZE, VERBOSE) execute ftupdate(20);
                                                      QUERY PLAN                                                     
----------------------------------------------------------------------------------------------------------------------
 Update on public.ft1  (cost=100.00..116.13 rows=2 width=144) (actual time=0.712..0.712 rows=0 loops=1)
   ->  Foreign Update on public.ft1  (cost=100.00..116.13 rows=2 width=144) (actual time=0.711..0.711 rows=1 loops=1)
         Remote SQL: UPDATE public.t1 SET c8 = 'updated   '::character(10) WHERE ((c1 = 20))
 Execution time: 1.001 ms
(4 rows)


With the initial look and test overall things looking great, I am still
reviewing the code changes but here are few early doubts/questions:

.) What the need of following change ?

@@ -833,9 +833,6 @@ appendWhereClause(StringInfo buf,
    int         nestlevel;
    ListCell   *lc;
 
-   if (params)
-       *params = NIL;          /* initialize result list to empty */
-
    /* Set up context struct for recursion */
    context.root = root;
    context.foreignrel = baserel;
@@ -971,6 +968,63 @@ deparseUpdateSql(StringInfo buf, PlannerInfo *root,
 }


.) When Tom Lane and Stephen Frost suggested getting the core code involved,
I thought that we can do the mandatory checks into core it self and making
completely out of dml_is_pushdown_safe(). Please correct me

.) Documentation for the new API is missing (fdw-callbacks).



Regards,


On Fri, Dec 25, 2015 at 3:30 PM, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote:
On 2015/12/24 4:34, Robert Haas wrote:
On Wed, Dec 23, 2015 at 5:50 AM, Rushabh Lathia
<rushabh.lathia@gmail.com> wrote:
+1.

I like idea of separate FDW API for the DML Pushdown. Was thinking can't we
can re-use the  IterateForeignScan(ForeignScanState *node) rather then
introducing IterateDMLPushdown(ForeignScanState *node) new API ?

Yeah, I think we need to ask ourselves what advantage we're getting
out of adding any new core APIs.  Marking the scan as a pushed-down
update or delete has some benefit in terms of making the information
visible via EXPLAIN, but even that's a pretty thin benefit.  The
iterate method seems to just complicate the core code without any
benefit at all.  More generally, there is very, very little code in
this patch that accomplishes anything that could not be done just as
well with the existing methods.  So why are we even doing these core
changes?

>From the FDWs' point of view, ISTM that what FDWs have to do for IterateDMLPushdown is quite different from what FDWs have to do for IterateForeignScan; eg, IterateDMLPushdown must work in accordance with presence/absence of a RETURNING list.  (In addition to that, IterateDMLPushdown has been designed so that it must make the scan tuple available to later RETURNING projection in nodeModifyTable.c.)  So, I think that it's better to FDWs to add separate APIs for the DML pushdown, making the FDW code much simpler.  So based on that idea, I added the postgres_fdw changes to the patch.  Attached is an updated version of the patch, which is still WIP, but I'd be happy if I could get any feedback.

Tom seemed to think that we could centralize some checks in the core
code, say, related to triggers, or to LIMIT.  But there's nothing like
that in this patch, so I'm not really understanding the point.

For the trigger check, I added relation_has_row_level_triggers.  I placed that function in postgres_fdw.c in the updated patch, but I think that by placing that function in the core, FDWs can share that function.  As for the LIMIT, I'm not sure we can do something about that.

I think the current design allows us to handle a pushed-down update on a join, ie, "UPDATE foo ... FROM bar ..." where both foo and bar are remote, which was Tom's concern, but I'll leave that for another patch.  Also, I think the current design could also extend to push down INSERT .. RETURNING .., but I'd like to leave that for future work.

I'll add this to the next CF.

Best regards,
Etsuro Fujita




Rushabh Lathia

pgsql-hackers by date:

Previous
From: Amit Langote
Date:
Subject: Re: Regression caused by recent change to initdb?
Next
From: "Shulgin, Oleksandr"
Date:
Subject: Re: Inconsistent error handling in START_REPLICATION command