Re: MERGE ... RETURNING - Mailing list pgsql-hackers
From | Gurjeet Singh |
---|---|
Subject | Re: MERGE ... RETURNING |
Date | |
Msg-id | CABwTF4V2H9enRQQu5qCxhSOF49Z8ODMOcU2P6e5uSWGWTE-N5A@mail.gmail.com Whole thread Raw |
In response to | Re: MERGE ... RETURNING (Dean Rasheed <dean.a.rasheed@gmail.com>) |
Responses |
Re: MERGE ... RETURNING
Re: MERGE ... RETURNING |
List | pgsql-hackers |
On Fri, Jul 7, 2023 at 3:48 PM Dean Rasheed <dean.a.rasheed@gmail.com> wrote: > > On Thu, 6 Jul 2023 at 06:13, Gurjeet Singh <gurjeet@singh.im> wrote: > > > > > One minor annoyance is that, due to the way that pg_merge_action() and > > > pg_merge_when_clause() require access to the MergeActionState, they > > > only work if they appear directly in the RETURNING list. They can't, > > > for example, appear in a subquery in the RETURNING list, and I don't > > > see an easy way round that limitation. > > > > I believe that's a serious limitation, and would be a blocker for the feature. > > Yes, that was bugging me for quite a while. > > Attached is a new version that removes that restriction, allowing the > merge support functions to appear anywhere. This requires a bit of > planner support, to deal with merge support functions in subqueries, > in a similar way to how aggregates and GROUPING() expressions are > handled. But a number of the changes from the previous patch are no > longer necessary, so overall, this version of the patch is slightly > smaller. +1 > > I think the name of function pg_merge_when_clause() can be improved. > > How about pg_merge_when_clause_ordinal(). > > > > That's a bit of a mouthful, but I don't have a better idea right now. > I've left the names alone for now, in case something better occurs to > anyone. +1. How do we make sure we don't forget that it needs to be named better. Perhaps a TODO item within the patch will help. > > In the doc page of MERGE, you've moved the 'with_query' from the > > bottom of Parameters section to the top of that section. Any reason > > for this? Since the Parameters section is quite large, for a moment I > > thought the patch was _adding_ the description of 'with_query'. > > > > Ah yes, I was just making the order consistent with the > INSERT/UPDATE/DELETE pages. That could probably be committed > separately. I don't think that's necessary, if it improves consistency with related docs. > > + /* > > + * Merge support functions should only be called directly from a MERGE > > + * command, and need access to the parent ModifyTableState. The parser > > + * should have checked that such functions only appear in the RETURNING > > + * list of a MERGE, so this should never fail. > > + */ > > + if (IsMergeSupportFunction(funcid)) > > + { > > + if (!state->parent || > > + !IsA(state->parent, ModifyTableState) || > > + ((ModifyTableState *) state->parent)->operation != CMD_MERGE) > > + elog(ERROR, "merge support function called in non-merge context"); > > > > As the comment says, this is an unexpected condition, and should have > > been caught and reported by the parser. So I think it'd be better to > > use an Assert() here. Moreover, there's ERROR being raised by the > > pg_merge_action() and pg_merge_when_clause() functions, when they > > detect the function context is not appropriate. > > > > I found it very innovative to allow these functions to be called only > > in a certain part of certain SQL command. I don't think there's a > > precedence for this in Postgres; I'd be glad to learn if there are > > other functions that Postgres exposes this way. > > > > - EXPR_KIND_RETURNING, /* RETURNING */ > > + EXPR_KIND_RETURNING, /* RETURNING in INSERT/UPDATE/DELETE */ > > + EXPR_KIND_MERGE_RETURNING, /* RETURNING in MERGE */ > > > > Having to invent a whole new ParseExprKind enum, feels like a bit of > > an overkill. My only objection is that this is exactly like > > EXPR_KIND_RETURNING, hence EXPR_KIND_MERGE_RETURNING needs to be dealt > > with exactly in as many places. But I don't have any alternative > > proposals. > > > > That's all gone now from the new patch, since there is no longer any > restriction on where these functions can appear. I believe this elog can be safely turned into an Assert. + switch (mergeActionCmdType) { - CmdType commandType = relaction->mas_action->commandType; .... + case CMD_INSERT: .... + default: + elog(ERROR, "unrecognized commandType: %d", (int) mergeActionCmdType); The same treatment can be applied to the elog(ERROR) in pg_merge_action(). > > +CREATE FUNCTION merge_into_sq_target(sid int, balance int, delta int, > > + OUT action text, OUT tid int, > > OUT new_balance int) > > +LANGUAGE sql AS > > +$$ > > + MERGE INTO sq_target t > > + USING (VALUES ($1, $2, $3)) AS v(sid, balance, delta) > > + ON tid = v.sid > > + WHEN MATCHED AND tid > 2 THEN > > > > Again, for consistency, the comparison operator should be >=. There > > are a few more occurrences of this comparison in the rest of the file, > > that need the same treatment. > > > > I changed the new tests to use ">= 2" (and the COPY test now returns 3 > rows, with an action of each type, which is easier to read), but I > don't think it's really necessary to change all the existing tests > from "> 2". There's nothing wrong with the "= 2" case having no > action, as long as the tests give decent coverage. I was just trying to drive these tests towards a consistent pattern. As a reader, if I see these differences, the first and the conservative thought I have is that these differences must be there for a reason, and then I have to work to find out what those reasons might be. And that's a lot of wasted effort, just in case someone intends to change something in these tests. I performed this round of review by comparing the diff between the v7 and v8 patches (after applying to commit 4f4d73466d) -ExecProcessReturning(ResultRelInfo *resultRelInfo, +ExecProcessReturning(ModifyTableContext *context, + ResultRelInfo *resultRelInfo, ... + TupleTableSlot *rslot; ... + if (context->relaction) + { ... + PG_TRY(); + { + rslot = ExecProject(projectReturning); + } + PG_FINALLY(); + { + mergeActionCmdType = saved_mergeActionCmdType; + mergeActionIdx = saved_mergeActionIdx; + } + PG_END_TRY(); + } + else + rslot = ExecProject(projectReturning); + + return rslot; In the above hunk, if there's an exception/ERROR, I believe we should PG_RE_THROW(). If there's a reason to continue, we should at least set rslot = NULL, otherwise we may be returning an uninitialized value to the caller. { oid => '9499', descr => 'command type of current MERGE action', - proname => 'pg_merge_action', provolatile => 'v', + proname => 'pg_merge_action', provolatile => 'v', proparallel => 'r', .... { oid => '9500', descr => 'index of current MERGE WHEN clause', - proname => 'pg_merge_when_clause', provolatile => 'v', + proname => 'pg_merge_when_clause', provolatile => 'v', proparallel => 'r', .... I see that you've now set proparallel of these functions to 'r'. I'd just like to understand how you got to that conclusion. --- error when using MERGE support functions outside MERGE -SELECT pg_merge_action() FROM sq_target; I believe it would be worthwhile to keep a record of the expected outputs of these invocations in the tests, just in case they change over time. Best regards, Gurjeet http://Gurje.et
pgsql-hackers by date: