Thread: Push down more UPDATEs/DELETEs in postgres_fdw

Push down more UPDATEs/DELETEs in postgres_fdw

From
Etsuro Fujita
Date:
Hi,

Attached is a WIP patch extending the postgres_fdw DML pushdown in 9.6
so that it can perform an update/delete on a join remotely.  An example
is shown below:

* without the patch:
postgres=# explain verbose delete from ft1 using ft2 where ft1.a = ft2.a;

              QUERY PLAN


--------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------
  Delete on public.ft1  (cost=100.00..102.04 rows=1 width=38)
    Remote SQL: DELETE FROM public.t1 WHERE ctid = $1
    ->  Foreign Scan  (cost=100.00..102.04 rows=1 width=38)
          Output: ft1.ctid, ft2.*
          Relations: (public.ft1) INNER JOIN (public.ft2)
          Remote SQL: SELECT r1.ctid, CASE WHEN (r2.*)::text IS NOT NULL
THEN ROW(r2.a, r2.b) END FROM (public.t1 r1 INNER JOIN public.t2 r2 ON
(((r1.a =
  r2.a)))) FOR UPDATE OF r1
          ->  Nested Loop  (cost=200.00..202.07 rows=1 width=38)
                Output: ft1.ctid, ft2.*
                Join Filter: (ft1.a = ft2.a)
                ->  Foreign Scan on public.ft1  (cost=100.00..101.03
rows=1 width=10)
                      Output: ft1.ctid, ft1.a
                      Remote SQL: SELECT a, ctid FROM public.t1 FOR UPDATE
                ->  Foreign Scan on public.ft2  (cost=100.00..101.03
rows=1 width=36)
                      Output: ft2.*, ft2.a
                      Remote SQL: SELECT a, b FROM public.t2
(15 rows)

* with the patch:
postgres=# explain verbose delete from ft1 using ft2 where ft1.a = ft2.a;
                                                          QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------
  Delete on public.ft1  (cost=100.00..102.04 rows=1 width=38)
    ->  Foreign Delete  (cost=100.00..102.04 rows=1 width=38)
          Remote SQL: DELETE FROM public.t1 r1 USING (SELECT ROW(a, b),
a FROM public.t2) ss1(c1, c2) WHERE ((r1.a = ss1.c2))
(3 rows)

The WIP patch has been created on top of the join pushdown patch [1].
So, for testing, please apply the patch in [1] first.

I'll add this to the the November commitfest.

Best regards,
Etsuro Fujita

[1]
https://www.postgresql.org/message-id/1688885b-5fb1-8bfa-b1b8-c2758dbe0b38@lab.ntt.co.jp

Attachment

Re: Push down more UPDATEs/DELETEs in postgres_fdw

From
Ashutosh Bapat
Date:
Thanks Fujita-san for working on this.


* with the patch:
postgres=# explain verbose delete from ft1 using ft2 where ft1.a = ft2.a;
                                                         QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
 Delete on public.ft1  (cost=100.00..102.04 rows=1 width=38)
   ->  Foreign Delete  (cost=100.00..102.04 rows=1 width=38)
         Remote SQL: DELETE FROM public.t1 r1 USING (SELECT ROW(a, b), a FROM public.t2) ss1(c1, c2) WHERE ((r1.a = ss1.c2))
(3 rows)

The WIP patch has been created on top of the join pushdown patch [1]. So, for testing, please apply the patch in [1] first.


The underlying scan on t2 requires ROW(a,b) for locking the row for update/share. But clearly it's not required if the full query is being pushed down. Is there a way we can detect that ROW(a,b) is useless column (not used anywhere in the other parts of the query like RETURNING, DELETE clause etc.) and eliminate it? Similarly for a, it's part of the targetlist of the underlying scan so that the WHERE clause can be applied on it. But it's not needed if we are pushing down the query. If we eliminate the targetlist of the query, we could construct a remote query without having subquery in it, making it more readable.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: Push down more UPDATEs/DELETEs in postgres_fdw

From
Etsuro Fujita
Date:
On 2016/09/07 13:21, Ashutosh Bapat wrote:
>     * with the patch:
>     postgres=# explain verbose delete from ft1 using ft2 where ft1.a =
>     ft2.a;
>                                                              QUERY PLAN
>
-----------------------------------------------------------------------------------------------------------------------------
>      Delete on public.ft1  (cost=100.00..102.04 rows=1 width=38)
>        ->  Foreign Delete  (cost=100.00..102.04 rows=1 width=38)
>              Remote SQL: DELETE FROM public.t1 r1 USING (SELECT ROW(a,
>     b), a FROM public.t2) ss1(c1, c2) WHERE ((r1.a = ss1.c2))
>     (3 rows)

> The underlying scan on t2 requires ROW(a,b) for locking the row for
> update/share. But clearly it's not required if the full query is being
> pushed down.

Exactly.

> Is there a way we can detect that ROW(a,b) is useless
> column (not used anywhere in the other parts of the query like
> RETURNING, DELETE clause etc.) and eliminate it?

I don't have a clear solution for that yet, but I'll try to remove that 
in the next version.

> Similarly for a, it's
> part of the targetlist of the underlying scan so that the WHERE clause
> can be applied on it. But it's not needed if we are pushing down the
> query. If we eliminate the targetlist of the query, we could construct a
> remote query without having subquery in it, making it more readable.

Will try to do so also.

Thanks for the comments!

Best regards,
Etsuro Fujita





Re: Push down more UPDATEs/DELETEs in postgres_fdw

From
Etsuro Fujita
Date:
On 2016/09/08 19:55, Etsuro Fujita wrote:
> On 2016/09/07 13:21, Ashutosh Bapat wrote:
>>     * with the patch:
>>     postgres=# explain verbose delete from ft1 using ft2 where ft1.a =
>>     ft2.a;
>>                                                              QUERY PLAN
>>
>>
-----------------------------------------------------------------------------------------------------------------------------
>>
>>      Delete on public.ft1  (cost=100.00..102.04 rows=1 width=38)
>>        ->  Foreign Delete  (cost=100.00..102.04 rows=1 width=38)
>>              Remote SQL: DELETE FROM public.t1 r1 USING (SELECT ROW(a,
>>     b), a FROM public.t2) ss1(c1, c2) WHERE ((r1.a = ss1.c2))
>>     (3 rows)
>
>> The underlying scan on t2 requires ROW(a,b) for locking the row for
>> update/share. But clearly it's not required if the full query is being
>> pushed down.

>> Is there a way we can detect that ROW(a,b) is useless
>> column (not used anywhere in the other parts of the query like
>> RETURNING, DELETE clause etc.) and eliminate it?

> I don't have a clear solution for that yet, but I'll try to remove that
> in the next version.

>> Similarly for a, it's
>> part of the targetlist of the underlying scan so that the WHERE clause
>> can be applied on it. But it's not needed if we are pushing down the
>> query. If we eliminate the targetlist of the query, we could construct a
>> remote query without having subquery in it, making it more readable.

> Will try to do so also.

I addressed this by improving the deparse logic so that a remote query
for performing an UPDATE/DELETE on a join directly on the remote can be
created as proposed if possible.  Attached is an updated version of the
patch, which is created on top of the patch set [1].  The patch is still
WIP (ie, needs more comments and regression tests, at least), but any
comments would be gratefully appreciated.

Best regards,
Etsuro Fujita

[1]
https://www.postgresql.org/message-id/11eafd10-d3f8-ac8a-b642-b0e65037c76b%40lab.ntt.co.jp

Attachment

Re: Push down more UPDATEs/DELETEs in postgres_fdw

From
Rushabh Lathia
Date:
Thanks Fujita-san for working on this. I've signed up to review this patch.

Your latest patch doesn't not get apply cleanly apply on master branch.

patching file contrib/postgres_fdw/deparse.c
6 out of 17 hunks FAILED -- saving rejects to file contrib/postgres_fdw/deparse.c.rej
patching file contrib/postgres_fdw/expected/postgres_fdw.out
2 out of 2 hunks FAILED -- saving rejects to file contrib/postgres_fdw/expected/postgres_fdw.out.rej
patching file contrib/postgres_fdw/postgres_fdw.c
2 out of 22 hunks FAILED -- saving rejects to file contrib/postgres_fdw/postgres_fdw.c.rej
patching file contrib/postgres_fdw/postgres_fdw.h
1 out of 3 hunks FAILED -- saving rejects to file contrib/postgres_fdw/postgres_fdw.h.rej

Please share the patch which get apply clean on master branch.



On Fri, Nov 11, 2016 at 5:00 PM, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote:
On 2016/09/08 19:55, Etsuro Fujita wrote:
On 2016/09/07 13:21, Ashutosh Bapat wrote:
    * with the patch:
    postgres=# explain verbose delete from ft1 using ft2 where ft1.a =
    ft2.a;
                                                             QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------

     Delete on public.ft1  (cost=100.00..102.04 rows=1 width=38)
       ->  Foreign Delete  (cost=100.00..102.04 rows=1 width=38)
             Remote SQL: DELETE FROM public.t1 r1 USING (SELECT ROW(a,
    b), a FROM public.t2) ss1(c1, c2) WHERE ((r1.a = ss1.c2))
    (3 rows)

The underlying scan on t2 requires ROW(a,b) for locking the row for
update/share. But clearly it's not required if the full query is being
pushed down.

Is there a way we can detect that ROW(a,b) is useless
column (not used anywhere in the other parts of the query like
RETURNING, DELETE clause etc.) and eliminate it?

I don't have a clear solution for that yet, but I'll try to remove that
in the next version.

Similarly for a, it's
part of the targetlist of the underlying scan so that the WHERE clause
can be applied on it. But it's not needed if we are pushing down the
query. If we eliminate the targetlist of the query, we could construct a
remote query without having subquery in it, making it more readable.

Will try to do so also.

I addressed this by improving the deparse logic so that a remote query for performing an UPDATE/DELETE on a join directly on the remote can be created as proposed if possible.  Attached is an updated version of the patch, which is created on top of the patch set [1].  The patch is still WIP (ie, needs more comments and regression tests, at least), but any comments would be gratefully appreciated.

Best regards,
Etsuro Fujita

[1] https://www.postgresql.org/message-id/11eafd10-d3f8-ac8a-b642-b0e65037c76b%40lab.ntt.co.jp


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers




--
Rushabh Lathia

Re: Push down more UPDATEs/DELETEs in postgres_fdw

From
Etsuro Fujita
Date:
On 2016/11/15 19:04, Rushabh Lathia wrote:
> Thanks Fujita-san for working on this. I've signed up to review this patch.

Thank you for reviewing the patch!

> Your latest patch doesn't not get apply cleanly apply on master branch.

Did you apply the patch set in [1] 
(postgres-fdw-subquery-support-v4.patch and 
postgres-fdw-phv-pushdown-v4.patch in this order) before applying the 
latest patch?

Best regards,
Etsuro Fujita

[1] 
https://www.postgresql.org/message-id/11eafd10-d3f8-ac8a-b642-b0e65037c76b%40lab.ntt.co.jp





Re: Push down more UPDATEs/DELETEs in postgres_fdw

From
Ashutosh Bapat
Date:
On Wed, Nov 16, 2016 at 8:25 AM, Etsuro Fujita
<fujita.etsuro@lab.ntt.co.jp> wrote:
> On 2016/11/15 19:04, Rushabh Lathia wrote:
>>
>> Thanks Fujita-san for working on this. I've signed up to review this
>> patch.
>
>
> Thank you for reviewing the patch!
>
>> Your latest patch doesn't not get apply cleanly apply on master branch.
>
>
> Did you apply the patch set in [1] (postgres-fdw-subquery-support-v4.patch
> and postgres-fdw-phv-pushdown-v4.patch in this order) before applying the
> latest patch?
>

I don't see any reason why DML/UPDATE pushdown should depend upon
subquery deparsing or least PHV patch. Combined together they can help
in more cases, but without those patches, we will be able to push-down
more stuff. Probably, we should just restrict push-down only for the
cases when above patches are not needed. That makes reviews easy. Once
those patches get committed, we may add more functionality depending
upon the status of this patch. Does that make sense?

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company



Re: Push down more UPDATEs/DELETEs in postgres_fdw

From
Etsuro Fujita
Date:
On 2016/11/16 13:10, Ashutosh Bapat wrote:
> On Wed, Nov 16, 2016 at 8:25 AM, Etsuro Fujita
> <fujita.etsuro@lab.ntt.co.jp> wrote:
>> On 2016/11/15 19:04, Rushabh Lathia wrote:
>>> Your latest patch doesn't not get apply cleanly apply on master branch.

>> Did you apply the patch set in [1] (postgres-fdw-subquery-support-v4.patch
>> and postgres-fdw-phv-pushdown-v4.patch in this order) before applying the
>> latest patch?

> I don't see any reason why DML/UPDATE pushdown should depend upon
> subquery deparsing or least PHV patch. Combined together they can help
> in more cases, but without those patches, we will be able to push-down
> more stuff. Probably, we should just restrict push-down only for the
> cases when above patches are not needed. That makes reviews easy. Once
> those patches get committed, we may add more functionality depending
> upon the status of this patch. Does that make sense?

OK, I'll extract from the patch the minimal part that wouldn't depend on 
the two patches.

Best regards,
Etsuro Fujita





Re: Push down more UPDATEs/DELETEs in postgres_fdw

From
Etsuro Fujita
Date:
On 2016/11/16 16:38, Etsuro Fujita wrote:
> On 2016/11/16 13:10, Ashutosh Bapat wrote:
>> I don't see any reason why DML/UPDATE pushdown should depend upon
>> subquery deparsing or least PHV patch. Combined together they can help
>> in more cases, but without those patches, we will be able to push-down
>> more stuff. Probably, we should just restrict push-down only for the
>> cases when above patches are not needed. That makes reviews easy. Once
>> those patches get committed, we may add more functionality depending
>> upon the status of this patch. Does that make sense?

> OK, I'll extract from the patch the minimal part that wouldn't depend on
> the two patches.

Here is a patch for that.  Todo items are: (1) add more comments and (2)
add more regression tests.  I'll do that in the next version.

Best regards,
Etsuro Fujita

Attachment

Re: Push down more UPDATEs/DELETEs in postgres_fdw

From
Rushabh Lathia
Date:
<div dir="ltr">I started reviewing the patch and here are few initial review points and questions for you.<br /><br
/>1)<br/>-static void deparseExplicitTargetList(List *tlist, List **retrieved_attrs,<br />+static void
deparseExplicitTargetList(boolis_returning,<br />+                          List *tlist,<br />+                       
 List **retrieved_attrs,<br />                           deparse_expr_cxt *context);<br /><br />Any particular reason
ofinserting new argument as 1st argunment? In general<br />we add new flags at the end or before the out param for the
existingfunction.<br /><br />2) <br />-static void deparseFromExprForRel(StringInfo buf, PlannerInfo *root,<br />-   
               RelOptInfo *joinrel, bool use_alias, List **params_list);<br />+static void
deparseFromExprForRel(StringInfobuf, PlannerInfo *root, RelOptInfo *foreignrel,<br />+                      bool
use_alias,Index target_rel, List **params_list);<br /><br /><br />Going beyond 80 character length<br /><br />3)<br
/> staticvoid<br />-deparseExplicitTargetList(List *tlist, List **retrieved_attrs,<br />+deparseExplicitTargetList(bool
is_returning,<br/>+                          List *tlist,<br />+                          List **retrieved_attrs,<br
/>                          deparse_expr_cxt *context)<br /><br />This looks bit wired to me - as comment for the
deparseExplicitTargetList()<br/>function name and comments says different then what its trying to do when<br
/>is_returningflag is passed. Either function comment need to be change or<br />may be separate function fo deparse
returninglist for join relation?<br /><br />Is it possible to teach deparseReturningList() to deparse the returning<br
/>listfor the joinrel?<br /><br />4)<br /><br />+    if (foreignrel->reloptkind == RELOPT_JOINREL)<br />+    {<br
/>+       List       *rlist = NIL;<br />+<br />+        /* Create a RETURNING list */<br />+        rlist =
make_explicit_returning_list(rtindex,rel,<br />+                                             returningList,<br />+   
                                        fdw_scan_tlist);<br />+<br />+        deparseExplicitReturningList(rlist,
retrieved_attrs,&context);<br />+    }<br />+    else<br />+        deparseReturningList(buf, root, rtindex, rel,
false,<br/>+                             returningList, retrieved_attrs);<br /><br />The code will be more centralized
ifwe add the logic of extracting returninglist<br />for JOINREL inside deparseReturningList() only, isn't it?<br /><br
/>5)make_explicit_returning_list() pull the var list from the returningList and<br />build the targetentry for the
returninglist and at the end rewrites the<br />fdw_scan_tlist. <br /><br />AFAIK, in case of DML - which is going to
pushdownto the remote server <br />ideally fdw_scan_tlist should be same as returning list, as final output<br />for
thequery is query will be RETUNING list only. isn't that true?<br /><br />If yes, then why can't we directly replace
thefdw_scan_tlist with the returning<br />list, rather then make_explicit_returning_list()?<br /><br />Also I think
rewritingthe fdw_scan_tlist should happen into postgres_fdw.c -<br />rather then deparse.c.<br /><br />6) Test coverage
forthe returning list is missing.<br /><br /><br /></div><div class="gmail_extra"><br /><div class="gmail_quote">On
Fri,Nov 18, 2016 at 8:56 AM, Etsuro Fujita <span dir="ltr"><<a href="mailto:fujita.etsuro@lab.ntt.co.jp"
target="_blank">fujita.etsuro@lab.ntt.co.jp</a>></span>wrote:<br /><blockquote class="gmail_quote" style="margin:0 0
0.8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">On 2016/11/16 16:38, Etsuro Fujita wrote:<br
/></span><blockquoteclass="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span
class="">On 2016/11/16 13:10, Ashutosh Bapat wrote:<br /></span><span class=""><blockquote class="gmail_quote"
style="margin:00 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> I don't see any reason why DML/UPDATE pushdown
shoulddepend upon<br /> subquery deparsing or least PHV patch. Combined together they can help<br /> in more cases, but
withoutthose patches, we will be able to push-down<br /> more stuff. Probably, we should just restrict push-down only
forthe<br /> cases when above patches are not needed. That makes reviews easy. Once<br /> those patches get committed,
wemay add more functionality depending<br /> upon the status of this patch. Does that make sense?<br
/></blockquote></span></blockquote><spanclass=""><br /><blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px#ccc solid;padding-left:1ex"> OK, I'll extract from the patch the minimal part that wouldn't depend
on<br/> the two patches.<br /></blockquote><br /></span> Here is a patch for that.  Todo items are: (1) add more
commentsand (2) add more regression tests.  I'll do that in the next version.<br /><br /> Best regards,<br /> Etsuro
Fujita<br/></blockquote></div><br /><br clear="all" /><br />-- <br /><div class="gmail_signature"
data-smartmail="gmail_signature">RushabhLathia</div></div> 

Re: Push down more UPDATEs/DELETEs in postgres_fdw

From
Etsuro Fujita
Date:
Hi Rushabh,

On 2016/11/22 19:05, Rushabh Lathia wrote:
> I started reviewing the patch and here are few initial review points and
> questions for you.

Thanks for the review!

> 1)
> -static void deparseExplicitTargetList(List *tlist, List **retrieved_attrs,
> +static void deparseExplicitTargetList(bool is_returning,
> +                          List *tlist,
> +                          List **retrieved_attrs,
>                            deparse_expr_cxt *context);
>
> Any particular reason of inserting new argument as 1st argunment? In general
> we add new flags at the end or before the out param for the existing
> function.

No, I don't.  I think the *context should be the last argument as in 
other functions in deparse.c.  How about inserting that before the out 
param **retrieved_attrs, like this?

static void
deparseExplicitTargetList(List *tlist,                          bool is_returning,                          List
**retrieved_attrs,                         deparse_expr_cxt *context);
 

> 2)
> -static void deparseFromExprForRel(StringInfo buf, PlannerInfo *root,
> -                    RelOptInfo *joinrel, bool use_alias, List
> **params_list);
> +static void deparseFromExprForRel(StringInfo buf, PlannerInfo *root,
> RelOptInfo *foreignrel,
> +                      bool use_alias, Index target_rel, List
> **params_list);

> Going beyond 80 character length

Will fix.

> 3)
>  static void
> -deparseExplicitTargetList(List *tlist, List **retrieved_attrs,
> +deparseExplicitTargetList(bool is_returning,
> +                          List *tlist,
> +                          List **retrieved_attrs,
>                            deparse_expr_cxt *context)
>
> This looks bit wired to me - as comment for the deparseExplicitTargetList()
> function name and comments says different then what its trying to do when
> is_returning flag is passed. Either function comment need to be change or
> may be separate function fo deparse returning list for join relation?
>
> Is it possible to teach deparseReturningList() to deparse the returning
> list for the joinrel?

I don't think it's possible to do that because deparseReturningList uses 
deparseTargetList internally, which only allows us to emit a target list 
for a baserel.  For example, deparseReturningList can't handle a 
returning list that contains join outputs like this:

postgres=# explain verbose delete from ft1 using ft2 where ft1.a = ft2.a 
returning ft1.*, ft2.*;                                                       QUERY PLAN

------------------------------------------------------------------------------------------------------------------------
Deleteon public.ft1  (cost=100.00..102.06 rows=1 width=46)   Output: ft1.a, ft1.b, ft2.a, ft2.b   ->  Foreign Delete
(cost=100.00..102.06rows=1 width=46)         Remote SQL: DELETE FROM public.t1 r1 USING public.t2 r2 WHERE 
 
((r1.a = r2.a)) RETURNING r1.a, r1.b, r2.a, r2.b
(4 rows)

I'd like to change the comment for deparseExplicitTargetList.

> 4)
>
> +    if (foreignrel->reloptkind == RELOPT_JOINREL)
> +    {
> +        List       *rlist = NIL;
> +
> +        /* Create a RETURNING list */
> +        rlist = make_explicit_returning_list(rtindex, rel,
> +                                             returningList,
> +                                             fdw_scan_tlist);
> +
> +        deparseExplicitReturningList(rlist, retrieved_attrs, &context);
> +    }
> +    else
> +        deparseReturningList(buf, root, rtindex, rel, false,
> +                             returningList, retrieved_attrs);
>
> The code will be more centralized if we add the logic of extracting
> returninglist
> for JOINREL inside deparseReturningList() only, isn't it?

You are right.  Will do.

> 5) make_explicit_returning_list() pull the var list from the
> returningList and
> build the targetentry for the returning list and at the end rewrites the
> fdw_scan_tlist.
>
> AFAIK, in case of DML - which is going to pushdown to the remote server
> ideally fdw_scan_tlist should be same as returning list, as final output
> for the query is query will be RETUNING list only. isn't that true?

That would be true.  But the fdw_scan_tlist passed from the core would 
contain junk columns inserted by the rewriter and planner work, such as 
CTID for the target table and whole-row Vars for other tables specified 
in the FROM clause of an UPDATE or the USING clause of a DELETE.  So, I 
created the patch so that the pushed-down UPDATE/DELETE retrieves only 
the data needed for the RETURNING calculation from the remote server, 
not the whole fdw_scan_tlist data.

> If yes, then why can't we directly replace the fdw_scan_tlist with the
> returning
> list, rather then make_explicit_returning_list()?

I think we could do that if we modified the core (maybe the executor 
part).  I'm not sure that's a good idea, though.

> Also I think rewriting the fdw_scan_tlist should happen into
> postgres_fdw.c -
> rather then deparse.c.

Will try.

> 6) Test coverage for the returning list is missing.

Will add.

Best regards,
Etsuro Fujita





Re: Push down more UPDATEs/DELETEs in postgres_fdw

From
Rushabh Lathia
Date:


On Tue, Nov 22, 2016 at 6:24 PM, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote:
Hi Rushabh,

On 2016/11/22 19:05, Rushabh Lathia wrote:
I started reviewing the patch and here are few initial review points and
questions for you.

Thanks for the review!

1)
-static void deparseExplicitTargetList(List *tlist, List **retrieved_attrs,
+static void deparseExplicitTargetList(bool is_returning,
+                          List *tlist,
+                          List **retrieved_attrs,
                           deparse_expr_cxt *context);

Any particular reason of inserting new argument as 1st argunment? In general
we add new flags at the end or before the out param for the existing
function.

No, I don't.  I think the *context should be the last argument as in other functions in deparse.c.  How about inserting that before the out param **retrieved_attrs, like this?

static void
deparseExplicitTargetList(List *tlist,
                          bool is_returning,
                          List **retrieved_attrs,
                          deparse_expr_cxt *context);


Yes, adding it before retrieved_attrs would be good.
 
2)
-static void deparseFromExprForRel(StringInfo buf, PlannerInfo *root,
-                    RelOptInfo *joinrel, bool use_alias, List
**params_list);
+static void deparseFromExprForRel(StringInfo buf, PlannerInfo *root,
RelOptInfo *foreignrel,
+                      bool use_alias, Index target_rel, List
**params_list);

Going beyond 80 character length

Will fix.

3)
 static void
-deparseExplicitTargetList(List *tlist, List **retrieved_attrs,
+deparseExplicitTargetList(bool is_returning,
+                          List *tlist,
+                          List **retrieved_attrs,
                           deparse_expr_cxt *context)

This looks bit wired to me - as comment for the deparseExplicitTargetList()
function name and comments says different then what its trying to do when
is_returning flag is passed. Either function comment need to be change or
may be separate function fo deparse returning list for join relation?

Is it possible to teach deparseReturningList() to deparse the returning
list for the joinrel?

I don't think it's possible to do that because deparseReturningList uses deparseTargetList internally, which only allows us to emit a target list for a baserel.  For example, deparseReturningList can't handle a returning list that contains join outputs like this:

postgres=# explain verbose delete from ft1 using ft2 where ft1.a = ft2.a returning ft1.*, ft2.*;
                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Delete on public.ft1  (cost=100.00..102.06 rows=1 width=46)
   Output: ft1.a, ft1.b, ft2.a, ft2.b
   ->  Foreign Delete  (cost=100.00..102.06 rows=1 width=46)
         Remote SQL: DELETE FROM public.t1 r1 USING public.t2 r2 WHERE ((r1.a = r2.a)) RETURNING r1.a, r1.b, r2.a, r2.b
(4 rows)

I'd like to change the comment for deparseExplicitTargetList.

4)

+    if (foreignrel->reloptkind == RELOPT_JOINREL)
+    {
+        List       *rlist = NIL;
+
+        /* Create a RETURNING list */
+        rlist = make_explicit_returning_list(rtindex, rel,
+                                             returningList,
+                                             fdw_scan_tlist);
+
+        deparseExplicitReturningList(rlist, retrieved_attrs, &context);
+    }
+    else
+        deparseReturningList(buf, root, rtindex, rel, false,
+                             returningList, retrieved_attrs);

The code will be more centralized if we add the logic of extracting
returninglist
for JOINREL inside deparseReturningList() only, isn't it?

You are right.  Will do.

5) make_explicit_returning_list() pull the var list from the
returningList and
build the targetentry for the returning list and at the end rewrites the
fdw_scan_tlist.

AFAIK, in case of DML - which is going to pushdown to the remote server
ideally fdw_scan_tlist should be same as returning list, as final output
for the query is query will be RETUNING list only. isn't that true?

That would be true.  But the fdw_scan_tlist passed from the core would contain junk columns inserted by the rewriter and planner work, such as CTID for the target table and whole-row Vars for other tables specified in the FROM clause of an UPDATE or the USING clause of a DELETE.  So, I created the patch so that the pushed-down UPDATE/DELETE retrieves only the data needed for the RETURNING calculation from the remote server, not the whole fdw_scan_tlist data.

Yes, I noticed that fdw_scan_tlist have CTID for the target table and whole-raw vars for
other tables specified in the FROM clause of the DML. But I was thinking isn't it possible
to create new fdw_scan_tlist once we found that DML is direct pushable to foreign server?
I tried quickly doing that - but later its was throwing "variable not found in subplan target list"
from set_foreignscan_references(). 



If yes, then why can't we directly replace the fdw_scan_tlist with the
returning
list, rather then make_explicit_returning_list()?

I think we could do that if we modified the core (maybe the executor part).  I'm not sure that's a good idea, though.

Yes modifying core is not good idea. But just want to understand why you
think that this need need to modify core? 
 


Also I think rewriting the fdw_scan_tlist should happen into
postgres_fdw.c -
rather then deparse.c.

Will try.

That would be good.
 


6) Test coverage for the returning list is missing.

Will add.


Thanks.
 
Best regards,
Etsuro Fujita





--
Rushabh Lathia

Re: Push down more UPDATEs/DELETEs in postgres_fdw

From
Etsuro Fujita
Date:
On 2016/11/23 20:28, Rushabh Lathia wrote:
> On Tue, Nov 22, 2016 at 6:24 PM, Etsuro Fujita
> <fujita.etsuro@lab.ntt.co.jp <mailto:fujita.etsuro@lab.ntt.co.jp>> wrote:

>         1)
>         -static void deparseExplicitTargetList(List *tlist, List
>         **retrieved_attrs,
>         +static void deparseExplicitTargetList(bool is_returning,
>         +                          List *tlist,
>         +                          List **retrieved_attrs,
>                                    deparse_expr_cxt *context);
>
>         Any particular reason of inserting new argument as 1st
>         argunment? In general
>         we add new flags at the end or before the out param for the existing
>         function.

>     No, I don't.  I think the *context should be the last argument as in
>     other functions in deparse.c.  How about inserting that before the
>     out param **retrieved_attrs, like this?
>
>     static void
>     deparseExplicitTargetList(List *tlist,
>                               bool is_returning,
>                               List **retrieved_attrs,
>                               deparse_expr_cxt *context);

> Yes, adding it before retrieved_attrs would be good.

OK, will do.

>         5) make_explicit_returning_list() pull the var list from the
>         returningList and
>         build the targetentry for the returning list and at the end
>         rewrites the
>         fdw_scan_tlist.
>
>         AFAIK, in case of DML - which is going to pushdown to the remote
>         server
>         ideally fdw_scan_tlist should be same as returning list, as
>         final output
>         for the query is query will be RETUNING list only. isn't that true?

>     That would be true.  But the fdw_scan_tlist passed from the core
>     would contain junk columns inserted by the rewriter and planner
>     work, such as CTID for the target table and whole-row Vars for other
>     tables specified in the FROM clause of an UPDATE or the USING clause
>     of a DELETE.  So, I created the patch so that the pushed-down
>     UPDATE/DELETE retrieves only the data needed for the RETURNING
>     calculation from the remote server, not the whole fdw_scan_tlist data.

> Yes, I noticed that fdw_scan_tlist have CTID for the target table and
> whole-raw vars for
> other tables specified in the FROM clause of the DML. But I was thinking
> isn't it possible
> to create new fdw_scan_tlist once we found that DML is direct pushable
> to foreign server?
> I tried quickly doing that - but later its was throwing "variable not
> found in subplan target list"
> from set_foreignscan_references().

>         If yes, then why can't we directly replace the fdw_scan_tlist
>         with the
>         returning
>         list, rather then make_explicit_returning_list()?

>     I think we could do that if we modified the core (maybe the executor
>     part).  I'm not sure that's a good idea, though.

> Yes modifying core is not good idea. But just want to understand why you
> think that this need need to modify core?

Sorry, I don't remember that.  Will check.

I'd like to move this to the next CF.

Thank you for the comments!

Best regards,
Etsuro Fujita





Re: [HACKERS] Push down more UPDATEs/DELETEs in postgres_fdw

From
Etsuro Fujita
Date:
On 2016/11/30 17:29, Etsuro Fujita wrote:
> On 2016/11/23 20:28, Rushabh Lathia wrote:

I wrote:
>>     How about inserting that before the
>>     out param **retrieved_attrs, like this?
>>
>>     static void
>>     deparseExplicitTargetList(List *tlist,
>>                               bool is_returning,
>>                               List **retrieved_attrs,
>>                               deparse_expr_cxt *context);

>> Yes, adding it before retrieved_attrs would be good.

> OK, will do.

Done.

You wrote:
>>         5) make_explicit_returning_list() pull the var list from the
>>         returningList and
>>         build the targetentry for the returning list and at the end
>>         rewrites the
>>         fdw_scan_tlist.
>>
>>         AFAIK, in case of DML - which is going to pushdown to the remote
>>         server
>>         ideally fdw_scan_tlist should be same as returning list, as
>>         final output
>>         for the query is query will be RETUNING list only. isn't that
>> true?

I wrote:
>>     That would be true.  But the fdw_scan_tlist passed from the core
>>     would contain junk columns inserted by the rewriter and planner
>>     work, such as CTID for the target table and whole-row Vars for other
>>     tables specified in the FROM clause of an UPDATE or the USING clause
>>     of a DELETE.  So, I created the patch so that the pushed-down
>>     UPDATE/DELETE retrieves only the data needed for the RETURNING
>>     calculation from the remote server, not the whole fdw_scan_tlist
>> data.

>> Yes, I noticed that fdw_scan_tlist have CTID for the target table and
>> whole-raw vars for
>> other tables specified in the FROM clause of the DML. But I was thinking
>> isn't it possible
>> to create new fdw_scan_tlist once we found that DML is direct pushable
>> to foreign server?
>> I tried quickly doing that - but later its was throwing "variable not
>> found in subplan target list"
>> from set_foreignscan_references().

We could probably avoid that error by replacing the targetlist of the 
subplan with fdw_scan_tlist, but that wouldn't be enough ...

You wrote:
>>         If yes, then why can't we directly replace the fdw_scan_tlist
>>         with the
>>         returning
>>         list, rather then make_explicit_returning_list()?

I wrote:
>>     I think we could do that if we modified the core (maybe the executor
>>     part).  I'm not sure that's a good idea, though.

>> Yes modifying core is not good idea. But just want to understand why you
>> think that this need need to modify core?

> Sorry, I don't remember that.  Will check.

The reason why I think so is that the ModifyTable node on top of the 
ForeignScan node requires that the targetlist of the ForeignScan has (1) 
junk whole-row Vars for secondary relations in UPDATE/DELETE and (2) all 
attributes of the target relation to produce the new tuple for UPDATE. 
(So, it wouldn't be enough to just replace the ForeignScan's targetlist 
with fdw_scan_tlist!)  For #1, see this (and the following code) in 
ExecInitModifyTable:

     /*
      * If we have any secondary relations in an UPDATE or DELETE, they 
need to
      * be treated like non-locked relations in SELECT FOR UPDATE, ie, the
      * EvalPlanQual mechanism needs to be told about them.  Locate the
      * relevant ExecRowMarks.
      */

And for #2, see this (and the following code, especially where calling 
ExecCheckPlanOutput) in the same function:

      * This section of code is also a convenient place to verify that the
      * output of an INSERT or UPDATE matches the target table(s).

What you proposed would be a good idea because the FDW could calculate 
the user-query RETURNING list more efficiently in some cases, but I'd 
like to leave that for future work.

Attached is the new version of the patch.  I also addressed other 
comments from you: moved rewriting the fdw_scan_tlist to postgres_fdw.c, 
added/revised comments, and added regression tests for the case where a 
pushed down UPDATE/DELETE on a join has RETURNING.

My apologies for having been late to work on this.

Best regards,
Etsuro Fujita

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

Re: [HACKERS] Push down more UPDATEs/DELETEs in postgres_fdw

From
Michael Paquier
Date:
On Wed, Jan 25, 2017 at 7:20 PM, Etsuro Fujita
<fujita.etsuro@lab.ntt.co.jp> wrote:
> Attached is the new version of the patch.  I also addressed other comments
> from you: moved rewriting the fdw_scan_tlist to postgres_fdw.c,
> added/revised comments, and added regression tests for the case where a
> pushed down UPDATE/DELETE on a join has RETURNING.
>
> My apologies for having been late to work on this.

Moved to CF 2017-03.
-- 
Michael



Re: [HACKERS] Push down more UPDATEs/DELETEs in postgres_fdw

From
Rushabh Lathia
Date:
Sorry for delay in the review.

I started reviewing the patch again. Patch applied cleanly on latest source
as well as regression pass through with the patch. I also performed
few manual testing and haven't found any regression. Patch look
much cleaner the earlier version, and don't have any major concern as
such. Here are few comments:

1)

@@ -211,6 +211,12 @@ typedef struct PgFdwDirectModifyState
     PGresult   *result;            /* result for query */
     int            num_tuples;        /* # of result tuples */
     int            next_tuple;        /* index of next one to return */
+    Relation    resultRel;        /* relcache entry for the target table */


Why we need resultRel? Can't we directly use dmstate->rel ?


2) In the patch somewhere scanrelid condition being used as
fscan->scan.scanrelid == 0 where as some place its been used as
fsplan->scan.scanrelid > 0. Infact in the same function its been used
differently example postgresBeginDirectModify. Can make this consistent.

3)

+     * If UPDATE/DELETE on a join, create a RETURINING list used in the remote
+     * query.
+     */
+    if (fscan->scan.scanrelid == 0)
+        returningList = make_explicit_returning_list(resultRelation, rel,
+                                                     returningList);
+

Above block can be moved inside the if (plan->returningLists) condition above
the block. Like this:

    /*
     * Extract the relevant RETURNING list if any.
     */
    if (plan->returningLists)
    {
        returningList = (List *) list_nth(plan->returningLists, subplan_index);

        /*
         * If UPDATE/DELETE on a join, create a RETURINING list used in the remote
         * query.
         */
        if (fscan->scan.scanrelid == 0)
            returningList = make_explicit_returning_list(resultRelation,
                                                         rel,
                                                         returningList);
    }


I am still doing few more testing with the patch, if I will found anything apart from
this I will raise that into another mail.

Thanks,



On Wed, Feb 1, 2017 at 11:50 AM, Michael Paquier <michael.paquier@gmail.com> wrote:
On Wed, Jan 25, 2017 at 7:20 PM, Etsuro Fujita
<fujita.etsuro@lab.ntt.co.jp> wrote:
> Attached is the new version of the patch.  I also addressed other comments
> from you: moved rewriting the fdw_scan_tlist to postgres_fdw.c,
> added/revised comments, and added regression tests for the case where a
> pushed down UPDATE/DELETE on a join has RETURNING.
>
> My apologies for having been late to work on this.

Moved to CF 2017-03.
--
Michael



--
Rushabh Lathia

Re: [HACKERS] Push down more UPDATEs/DELETEs in postgres_fdw

From
Etsuro Fujita
Date:
On 2017/02/13 18:24, Rushabh Lathia wrote:
> I started reviewing the patch again. Patch applied cleanly on latest source
> as well as regression pass through with the patch. I also performed
> few manual testing and haven't found any regression. Patch look
> much cleaner the earlier version, and don't have any major concern as
> such.

Thanks for the review!

> Here are few comments:
>
> 1)
>
> @@ -211,6 +211,12 @@ typedef struct PgFdwDirectModifyState
>      PGresult   *result;            /* result for query */
>      int            num_tuples;        /* # of result tuples */
>      int            next_tuple;        /* index of next one to return */
> +    Relation    resultRel;        /* relcache entry for the target table */

> Why we need resultRel? Can't we directly use dmstate->rel ?

The reason why we need that is because in get_returning_data, we pass 
dmstate->rel to make_tuple_from_result_row, which requires that 
dmstate->rel be NULL when the scan tuple is described by fdw_scan_tlist. 
  So in that case we set dmstate->rel to NULL and have 
dmstate->resultRel that is the relcache entry for the target relation in 
postgresBeginDirectModify.

> 2) In the patch somewhere scanrelid condition being used as
> fscan->scan.scanrelid == 0 where as some place its been used as
> fsplan->scan.scanrelid > 0. Infact in the same function its been used
> differently example postgresBeginDirectModify. Can make this consistent.

Ok, done.

> 3)
>
> +     * If UPDATE/DELETE on a join, create a RETURINING list used in the
> remote
> +     * query.
> +     */
> +    if (fscan->scan.scanrelid == 0)
> +        returningList = make_explicit_returning_list(resultRelation, rel,
> +                                                     returningList);
> +
>
> Above block can be moved inside the if (plan->returningLists) condition
> above
> the block. Like this:
>
>     /*
>      * Extract the relevant RETURNING list if any.
>      */
>     if (plan->returningLists)
>     {
>         returningList = (List *) list_nth(plan->returningLists,
> subplan_index);
>
>         /*
>          * If UPDATE/DELETE on a join, create a RETURINING list used in
> the remote
>          * query.
>          */
>         if (fscan->scan.scanrelid == 0)
>             returningList = make_explicit_returning_list(resultRelation,
>                                                          rel,
>                                                          returningList);
>     }

Done that way.

Another thing I noticed is duplicate work in apply_returning_filter; it 
initializes tableoid of an updated/deleted tuple if needed, but the core 
will do that (see ExecProcessReturning).  I removed that work from 
apply_returning_filter.

> I am still doing few more testing with the patch, if I will found
> anything apart from
> this I will raise that into another mail.

Thanks again!

Attached is an updated version of the patch.

Best regards,
Etsuro Fujita

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

Re: [HACKERS] Push down more UPDATEs/DELETEs in postgres_fdw

From
Rushabh Lathia
Date:


On Mon, Feb 20, 2017 at 1:41 PM, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote:
On 2017/02/13 18:24, Rushabh Lathia wrote:
I started reviewing the patch again. Patch applied cleanly on latest source
as well as regression pass through with the patch. I also performed
few manual testing and haven't found any regression. Patch look
much cleaner the earlier version, and don't have any major concern as
such.

Thanks for the review!

Here are few comments:

1)

@@ -211,6 +211,12 @@ typedef struct PgFdwDirectModifyState
     PGresult   *result;            /* result for query */
     int            num_tuples;        /* # of result tuples */
     int            next_tuple;        /* index of next one to return */
+    Relation    resultRel;        /* relcache entry for the target table */

Why we need resultRel? Can't we directly use dmstate->rel ?

The reason why we need that is because in get_returning_data, we pass dmstate->rel to make_tuple_from_result_row, which requires that dmstate->rel be NULL when the scan tuple is described by fdw_scan_tlist.  So in that case we set dmstate->rel to NULL and have dmstate->resultRel that is the relcache entry for the target relation in postgresBeginDirectModify.


Thanks for the explanation. We might do something here by using fdw_scan_tlist or changing the assumption of make_tuple_from_result_row(), and that way we can avoid two similar variable pointer in the PgFdwDirectModifyState.

I am okay with currently also, but it adding a note somewhere about this would be great. Also let keep this point open for the committer, if committer feel this is good then lets go ahead with this.

Here are few other cosmetic changes:

1)

+ *
+ * 'target_rel' is either zero or the rangetable index of a target relation.
+ * In the latter case this construncts FROM clause of UPDATE or USING clause
+ * of DELETE by simply ignoring the target relation while deparsing the given

Spell correction: - construncts

2)

+        /*
+         * If either input is the target relation, get all the joinclauses.
+         * Otherwise extract conditions mentioning the target relation from
+         * the joinclauses.
+         */


space between joinclauses needed.

3)

+        /*
+         * If UPDATE/DELETE on a join, create a RETURINING list used in the
+         * remote query.
+         */
+        if (fscan->scan.scanrelid == 0)
+            returningList = make_explicit_returning_list(resultRelation,
+                                                         rel,
+                                                         returningList);


Spell correction: RETURINING

I did above changes in the attached patch. Please have  a look once and then I feel like this patch is ready for committer.

Thanks,
Rushabh Lathia
Attachment

Re: [HACKERS] Push down more UPDATEs/DELETEs in postgres_fdw

From
Etsuro Fujita
Date:
On 2017/02/21 19:31, Rushabh Lathia wrote:
> On Mon, Feb 20, 2017 at 1:41 PM, Etsuro Fujita
> <fujita.etsuro@lab.ntt.co.jp <mailto:fujita.etsuro@lab.ntt.co.jp>> wrote:

>     On 2017/02/13 18:24, Rushabh Lathia wrote:
>         Here are few comments:
>
>         1)
>
>         @@ -211,6 +211,12 @@ typedef struct PgFdwDirectModifyState
>              PGresult   *result;            /* result for query */
>              int            num_tuples;        /* # of result tuples */
>              int            next_tuple;        /* index of next one to
>         return */
>         +    Relation    resultRel;        /* relcache entry for the
>         target table */
>
>
>         Why we need resultRel? Can't we directly use dmstate->rel ?
>
>
>     The reason why we need that is because in get_returning_data, we
>     pass dmstate->rel to make_tuple_from_result_row, which requires that
>     dmstate->rel be NULL when the scan tuple is described by
>     fdw_scan_tlist.  So in that case we set dmstate->rel to NULL and
>     have dmstate->resultRel that is the relcache entry for the target
>     relation in postgresBeginDirectModify.

> Thanks for the explanation. We might do something here by using
> fdw_scan_tlist or changing the assumption of
> make_tuple_from_result_row(), and that way we can avoid two similar
> variable pointer in the PgFdwDirectModifyState.

I agree that the two similar variables are annoying, to some extent, but 
ISTM that is not that bad because that makes the handling of 
dmstate->rel consistent with that of PgFdwScanState's rel.  As you know, 
PgFdwDirectModifyState is defined in a similar way as PgFdwScanState, in 
which the rel is a relcache entry for the foreign table for a simple 
foreign table scan and NULL for a foreign join scan (see comments for 
the definition of PgFdwScanState).

> I am okay with currently also, but it adding a note somewhere about this
> would be great. Also let keep this point open for the committer, if
> committer feel this is good then lets go ahead with this.

Agreed.

> Here are few other cosmetic changes:
>
> 1)
>
> + *
> + * 'target_rel' is either zero or the rangetable index of a target
> relation.
> + * In the latter case this construncts FROM clause of UPDATE or USING
> clause
> + * of DELETE by simply ignoring the target relation while deparsing the
> given
>
> Spell correction: - construncts
>
> 2)
>
> +        /*
> +         * If either input is the target relation, get all the joinclauses.
> +         * Otherwise extract conditions mentioning the target relation from
> +         * the joinclauses.
> +         */
>
>
> space between joinclauses needed.
>
> 3)
>
> +        /*
> +         * If UPDATE/DELETE on a join, create a RETURINING list used in the
> +         * remote query.
> +         */
> +        if (fscan->scan.scanrelid == 0)
> +            returningList = make_explicit_returning_list(resultRelation,
> +                                                         rel,
> +                                                         returningList);
>
>
> Spell correction: RETURINING

Good catch!

> I did above changes in the attached patch. Please have  a look once and

I'm fine with that.  Thanks for the patch!

> then I feel like this patch is ready for committer.

Thanks for reviewing!

Best regards,
Etsuro Fujita





Re: [HACKERS] Push down more UPDATEs/DELETEs in postgres_fdw

From
Rushabh Lathia
Date:


On Wed, Feb 22, 2017 at 12:15 PM, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote:
On 2017/02/21 19:31, Rushabh Lathia wrote:
On Mon, Feb 20, 2017 at 1:41 PM, Etsuro Fujita
<fujita.etsuro@lab.ntt.co.jp <mailto:fujita.etsuro@lab.ntt.co.jp>> wrote:

    On 2017/02/13 18:24, Rushabh Lathia wrote:
        Here are few comments:

        1)

        @@ -211,6 +211,12 @@ typedef struct PgFdwDirectModifyState
             PGresult   *result;            /* result for query */
             int            num_tuples;        /* # of result tuples */
             int            next_tuple;        /* index of next one to
        return */
        +    Relation    resultRel;        /* relcache entry for the
        target table */


        Why we need resultRel? Can't we directly use dmstate->rel ?


    The reason why we need that is because in get_returning_data, we
    pass dmstate->rel to make_tuple_from_result_row, which requires that
    dmstate->rel be NULL when the scan tuple is described by
    fdw_scan_tlist.  So in that case we set dmstate->rel to NULL and
    have dmstate->resultRel that is the relcache entry for the target
    relation in postgresBeginDirectModify.

Thanks for the explanation. We might do something here by using
fdw_scan_tlist or changing the assumption of
make_tuple_from_result_row(), and that way we can avoid two similar
variable pointer in the PgFdwDirectModifyState.

I agree that the two similar variables are annoying, to some extent, but ISTM that is not that bad because that makes the handling of dmstate->rel consistent with that of PgFdwScanState's rel.  As you know, PgFdwDirectModifyState is defined in a similar way as PgFdwScanState, in which the rel is a relcache entry for the foreign table for a simple foreign table scan and NULL for a foreign join scan (see comments for the definition of PgFdwScanState).

I am okay with currently also, but it adding a note somewhere about this
would be great. Also let keep this point open for the committer, if
committer feel this is good then lets go ahead with this.

Agreed.


Thanks.

Marked this as Ready for Committer.


Here are few other cosmetic changes:

1)

+ *
+ * 'target_rel' is either zero or the rangetable index of a target
relation.
+ * In the latter case this construncts FROM clause of UPDATE or USING
clause
+ * of DELETE by simply ignoring the target relation while deparsing the
given

Spell correction: - construncts

2)

+        /*
+         * If either input is the target relation, get all the joinclauses.
+         * Otherwise extract conditions mentioning the target relation from
+         * the joinclauses.
+         */


space between joinclauses needed.

3)

+        /*
+         * If UPDATE/DELETE on a join, create a RETURINING list used in the
+         * remote query.
+         */
+        if (fscan->scan.scanrelid == 0)
+            returningList = make_explicit_returning_list(resultRelation,
+                                                         rel,
+                                                         returningList);


Spell correction: RETURINING

Good catch!

I did above changes in the attached patch. Please have  a look once and

I'm fine with that.  Thanks for the patch!

then I feel like this patch is ready for committer.

Thanks for reviewing!

Best regards,
Etsuro Fujita





--
Rushabh Lathia

Re: [HACKERS] Push down more UPDATEs/DELETEs in postgres_fdw

From
Etsuro Fujita
Date:
On 2017/02/22 19:57, Rushabh Lathia wrote:
> Marked this as Ready for Committer.

I noticed that this item in the CF app was incorrectly marked as 
Committed.  This patch isn't committed, so I returned it to the previous 
status.  I also rebased the patch.  Attached is a new version of the patch.

Best regards,
Etsuro Fujita

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

Re: [HACKERS] Push down more UPDATEs/DELETEs in postgres_fdw

From
Robert Haas
Date:
On Wed, Mar 22, 2017 at 6:20 AM, Etsuro Fujita
<fujita.etsuro@lab.ntt.co.jp> wrote:
> On 2017/02/22 19:57, Rushabh Lathia wrote:
>> Marked this as Ready for Committer.
>
> I noticed that this item in the CF app was incorrectly marked as Committed.
> This patch isn't committed, so I returned it to the previous status.  I also
> rebased the patch.  Attached is a new version of the patch.

Sorry, I marked the wrong patch as committed.  Apologies for that.

This doesn't apply any more because of recent changes.

git diff --check complains:
contrib/postgres_fdw/postgres_fdw.c:3653: space before tab in indent.

+        /* Shouldn't contain the target relation. */
+        Assert(target_rel == 0);

This comment should give a reason.
voiddeparseDirectUpdateSql(StringInfo buf, PlannerInfo *root,                       Index rtindex, Relation rel,
+                       RelOptInfo *foreignrel,                       List *targetlist,                       List
*targetAttrs,                      List *remote_conds,
 

Could you add a comment explaining the meaning of these various
arguments?  It takes rtindex, rel, and foreignrel, which apparently
are all different things, but the meaning is not explained.
/*
+ * Add a RETURNING clause, if needed, to an UPDATE/DELETE on a join.
+ */
+static void
+deparseExplicitReturningList(List *rlist,
+                             List **retrieved_attrs,
+                             deparse_expr_cxt *context)
+{
+    deparseExplicitTargetList(rlist, true, retrieved_attrs, context);
+}

Do we really want to add a function for one line of code?

+/*
+ * Look for conditions mentioning the target relation in the given join tree,
+ * which will be pulled up into the WHERE clause.  Note that this is safe due
+ * to the same reason stated in comments in deparseFromExprForRel.
+ */

The comments for deparseFromExprForRel do not seem to address the
topic of why this is safe.  Also, the answer to the question "safe
from what?" is not clear.

-    deparseReturningList(buf, root, rtindex, rel, false,
-                         returningList, retrieved_attrs);
+    if (foreignrel->reloptkind == RELOPT_JOINREL)
+        deparseExplicitReturningList(returningList, retrieved_attrs, &context);
+    else
+        deparseReturningList(buf, root, rtindex, rel, false,
+                             returningList, retrieved_attrs);

Why do these cases need to be handled differently?  Maybe add a brief comment?

+        if ((outerrel->reloptkind == RELOPT_BASEREL &&
+             outerrel->relid == target_rel) ||
+            (innerrel->reloptkind == RELOPT_BASEREL &&
+             innerrel->relid == target_rel))

1. Surely it's redundant to check the RelOptKind if the RTI matches?

2. Generally, the tests in this patch against various RelOptKind
values should be adapted to use the new macros introduced in
7a39b5e4d11229ece930a51fd7cb29e535db4494.

The regression tests remove every remaining case where an update or
delete gets fails to get pushed to the remote side.  I think we should
still test that path, because we've still got that code.  Maybe use a
non-pushable function in the join clause, or something.

The new test cases could use some brief comments explaining their purpose.
    if (plan->returningLists)
+    {        returningList = (List *) list_nth(plan->returningLists, subplan_index);

+        /*
+         * If UPDATE/DELETE on a join, create a RETURNING list used in the
+         * remote query.
+         */
+        if (fscan->scan.scanrelid == 0)
+            returningList = make_explicit_returning_list(resultRelation,
+                                                         rel,
+                                                         returningList);
+    }

Again, the comment doesn't really explain why we're doing this.  And
initializing returningList twice in a row seems strange, too.

I am unfortunately too tired to finish properly reviewing this tonight.  :-(

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Push down more UPDATEs/DELETEs in postgres_fdw

From
David Steele
Date:
On 3/22/17 6:20 AM, Etsuro Fujita wrote:
> On 2017/02/22 19:57, Rushabh Lathia wrote:
>> Marked this as Ready for Committer.
> 
> I noticed that this item in the CF app was incorrectly marked as
> Committed.  This patch isn't committed, so I returned it to the previous
> status.  I also rebased the patch.  Attached is a new version of the patch.

This submission has been moved to CF 2017-07.

-- 
-David
david@pgmasters.net



Re: [HACKERS] Push down more UPDATEs/DELETEs in postgres_fdw

From
Daniel Gustafsson
Date:
> On 08 Apr 2017, at 16:14, David Steele <david@pgmasters.net> wrote:
>
> On 3/22/17 6:20 AM, Etsuro Fujita wrote:
>> On 2017/02/22 19:57, Rushabh Lathia wrote:
>>> Marked this as Ready for Committer.
>>
>> I noticed that this item in the CF app was incorrectly marked as
>> Committed.  This patch isn't committed, so I returned it to the previous
>> status.  I also rebased the patch.  Attached is a new version of the patch.
>
> This submission has been moved to CF 2017-07.

This patch has been marked Ready for Committer in the current commitfest
without being committed or rejected.  Moving to the next commitfest, but since
it has bitrotted I’m moving it to Waiting for author.

cheers ./daniel

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Push down more UPDATEs/DELETEs in postgres_fdw

From
Michael Paquier
Date:
On Mon, Oct 2, 2017 at 9:08 AM, Daniel Gustafsson <daniel@yesql.se> wrote:
> This patch has been marked Ready for Committer in the current commitfest
> without being committed or rejected.  Moving to the next commitfest, but since
> it has bitrotted I’m moving it to Waiting for author.

No updates have showed up since, so returned with feedback. Please
feel free to send a new version if you can hack it.
--
Michael