postgres_fdw: perform UPDATE/DELETE .. RETURNING on a join directly - Mailing list pgsql-hackers

From Etsuro Fujita
Subject postgres_fdw: perform UPDATE/DELETE .. RETURNING on a join directly
Date
Msg-id 5A438A45.2000909@lab.ntt.co.jp
Whole thread Raw
Responses Re: postgres_fdw: perform UPDATE/DELETE .. RETURNING on a join directly  (Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp>)
List pgsql-hackers
(2017/04/08 10:28), Robert Haas wrote:
> On Wed, Mar 22, 2017 at 6:20 AM, Etsuro Fujita
> <fujita.etsuro@lab.ntt.co.jp>  wrote:
>> On 2017/02/22 19:57, Rushabh Lathia wrote:
>>> Marked this as Ready for Committer.
>>
>> I noticed that this item in the CF app was incorrectly marked as
Committed.
>> This patch isn't committed, so I returned it to the previous status.
 I also
>> rebased the patch.  Attached is a new version of the patch.
>
> Sorry, I marked the wrong patch as committed.  Apologies for that.

No problem.  My apologies for the long long delay.

> This doesn't apply any more because of recent changes.
>
> git diff --check complains:
> contrib/postgres_fdw/postgres_fdw.c:3653: space before tab in indent.

I rebased the patch.

> +        /* Shouldn't contain the target relation. */
> +        Assert(target_rel == 0);
>
> This comment should give a reason.

Done.

>   void
>   deparseDirectUpdateSql(StringInfo buf, PlannerInfo *root,
>                          Index rtindex, Relation rel,
> +                       RelOptInfo *foreignrel,
>                          List *targetlist,
>                          List *targetAttrs,
>                          List *remote_conds,
>
> Could you add a comment explaining the meaning of these various
> arguments?  It takes rtindex, rel, and foreignrel, which apparently
> are all different things, but the meaning is not explained.

Done.

>   /*
> + * Add a RETURNING clause, if needed, to an UPDATE/DELETE on a join.
> + */
> +static void
> +deparseExplicitReturningList(List *rlist,
> +                             List **retrieved_attrs,
> +                             deparse_expr_cxt *context)
> +{
> +    deparseExplicitTargetList(rlist, true, retrieved_attrs, context);
> +}
>
> Do we really want to add a function for one line of code?

I don't have any strong opinion about that, so I removed that function
and modified deparseDirectUpdateSql/deparseDirectDeleteSql so it calls
deparseExplicitTargetList directly.

> +/*
> + * Look for conditions mentioning the target relation in the given
join tree,
> + * which will be pulled up into the WHERE clause.  Note that this is
safe due
> + * to the same reason stated in comments in deparseFromExprForRel.
> + */
>
> The comments for deparseFromExprForRel do not seem to address the
> topic of why this is safe.  Also, the answer to the question "safe
> from what?" is not clear.

Maybe my explanation would not be enough, but I think the reason why
this is safe would be derived from the comments for
deparseFromExprForRel.  BUT: on reflection, I think I made the deparsing
logic complex beyond necessity.  So I simplified the logic, which
doesn't pull_up_target_conditions any more, and added comments about that.

> -    deparseReturningList(buf, root, rtindex, rel, false,
> -                         returningList, retrieved_attrs);
> +    if (foreignrel->reloptkind == RELOPT_JOINREL)
> +        deparseExplicitReturningList(returningList,
retrieved_attrs,&context);
> +    else
> +        deparseReturningList(buf, root, rtindex, rel, false,
> +                             returningList, retrieved_attrs);
>
> Why do these cases need to be handled differently?  Maybe add a brief
comment?

The reason for that is 1)

+       /*
+        * When performing an UPDATE/DELETE .. RETURNING on a join directly,
+        * we fetch from the foreign server any Vars specified in RETURNING
+        * that refer not only to the target relation but to non-target
+        * relations.  So we'll deparse them into the RETURNING clause
of the
+        * remote query;

and 2) deparseReturningList can retrieve Vars of the target relation,
but can't retrieve Vars of non-target relations.

> +        if ((outerrel->reloptkind == RELOPT_BASEREL&&
> +             outerrel->relid == target_rel) ||
> +            (innerrel->reloptkind == RELOPT_BASEREL&&
> +             innerrel->relid == target_rel))
>
> 1. Surely it's redundant to check the RelOptKind if the RTI matches?

Good catch!  Revised.

> 2. Generally, the tests in this patch against various RelOptKind
> values should be adapted to use the new macros introduced in
> 7a39b5e4d11229ece930a51fd7cb29e535db4494.

I left some of the tests alone because I think that's more strict.

> The regression tests remove every remaining case where an update or
> delete gets fails to get pushed to the remote side.  I think we should
> still test that path, because we've still got that code.  Maybe use a
> non-pushable function in the join clause, or something.

Done.

> The new test cases could use some brief comments explaining their purpose.

Done.  Also, I think some of the tests in the previous version are
redundant, so I simplified the tests a bit.

>       if (plan->returningLists)
> +    {
>           returningList = (List *) list_nth(plan->returningLists,
subplan_index);
>
> +        /*
> +         * If UPDATE/DELETE on a join, create a RETURNING list used
in the
> +         * remote query.
> +         */
> +        if (fscan->scan.scanrelid == 0)
> +            returningList = make_explicit_returning_list(resultRelation,
> +                                                         rel,
> +                                                         returningList);
> +    }
>
> Again, the comment doesn't really explain why we're doing this.  And
> initializing returningList twice in a row seems strange, too.

I don't think that's strange because the second one is actually
re-computation of that list.  Note that make_explicit_returning_list
takes that list as the 3rd argument.  I added more comments explaining
why.  (I also changed the name of make_explicit_returning_list.)

> I am unfortunately too tired to finish properly reviewing this
tonight.  :-(

Thanks for the review!

Attached is an updated version of the patch.  I'll add this to the next CF.

Sorry for creating a new thread.  I changed my mail client.

Best regards,
Etsuro Fujita

Attachment

pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: Add hint about replication slots when nearing wraparound
Next
From: Edson Carlos Ericksson Richter
Date:
Subject: Re: Does PostgreSQL check database integrity at startup?