Re: [HACKERS] MERGE SQL Statement for PG11 - Mailing list pgsql-hackers

From Pavan Deolasee
Subject Re: [HACKERS] MERGE SQL Statement for PG11
Date
Msg-id CABOikdNj+8HEJ5D8tu56mrPkjHVRrBb2_cdKWwpiYNcjXgDw8g@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] MERGE SQL Statement for PG11  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: [HACKERS] MERGE SQL Statement for PG11  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
Re: [HACKERS] MERGE SQL Statement for PG11  (Peter Geoghegan <pg@bowt.ie>)
Re: [HACKERS] MERGE SQL Statement for PG11  (Stephen Frost <sfrost@snowman.net>)
List pgsql-hackers


On 9 March 2018 at 08:29, Peter Geoghegan <pg@bowt.ie> wrote:
On Thu, Mar 8, 2018 at 4:54 AM, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
> Thanks for the feedback. I've greatly refactored the code based on your
> comments and I too like the resultant code. What we have now have
> essentially is: two functions:

I really like how ExecMerge() now mostly just consists of this code
(plus a lot of explanatory comments):

Thanks!
 
 My #1 concern has become RLS, and
perhaps only because I haven't studied it in enough detail.

Sure. I've done what I thought is the right thing to do, but please check. Stephen also wanted to review RLS related code; I don't know if he had chance to do so.
 
It seems
like getting this patch committable is now a matter of verifying
details. Of course, there are a lot of details to verify.  :-)

Let's work through them together and hopefully we shall get there.
 



* MERGE code needs to do cardinality violations like ON CONFLICT.
Specifically, you need the TransactionIdIsCurrentTransactionId()
check, and a second error fallback "attempted to lock invisible tuple"
error (though this should say "attempted to update or delete invisible
tuple" instead). The extra check is redundant, but better safe than
sorry. I like defensive errors like this because it makes the code
easier to read -- I don't get distracted by their absence.

Ok. Fixed. I also added the missing case for HeapTupleInvisible, though I don't see how we can reach there. Since ExecUpdate()/ExecDelete() always wait for any concurrent transaction to finish, I don't see how we can reach HeapTupleBeingUpdated. So I skipped that test (or may be we should just add an error for completeness).
 

* The last item on cardinality violations also implies this new merge
comment should be changed:

> +                   /*
> +                    * The target tuple was already updated or deleted by the
> +                    * current command, or by a later command in the current
> +                    * transaction.
> +                    */

It should be changed because it can't have been a later command in
this xact. We handled that case when we called ExecUpdate or
ExecDelete() (plus we now have the extra defensive "can't happen"
elog() error).

Removed.
 

* This related comment shouldn't even talk about update/historic
behavior, now that it isn't in ExecUpdate() -- just MERGE:

> +                   /*
> +                    * The former case is possible in a join UPDATE where
> +                    * multiple tuples join to the same target tuple. This is
> +                    * pretty questionable, but Postgres has always allowed
> +                    * it: we just execute the first update action and ignore
> +                    * additional update attempts.  SQLStandard disallows this
> +                    * for MERGE, so allow the caller to select how to handle
> +                    * this.
> +                    */

Removed.
 

* This wording should be tweaked:

> +                    * If the current tuple is that last tuple in the update
> +                    * chain, then we know that the tuple was concurrently
> +                    * deleted. We just switch to NOT MATCHED case and let the
> +                    * caller retry the NOT MATCHED actions.

This should say something like "caller can move on to NOT MATCHED
actions". They can never go back from there, of course, which I want
us to be clear on.

Fixed, but please check if the new wording is OK.
 

* This check of whether whenqual is set is unnecessary, and doesn't
match MATCHED code, or the accompanying comments:

> +       /*
> +        * Test condition, if any
> +        *
> +        * In the absence of a condition we perform the action unconditionally
> +        * (no need to check separately since ExecQual() will return true if
> +        * there are no conditions to evaluate).
> +        */
> +       if (action->whenqual && !ExecQual(action->whenqual, econtext))
> +           continue;

Fixed.
 

* I think that this ExecMerge() assertion is not helpful, since you go
on to dereference the pointer in all cases anyway:

> +   Assert(junkfilter != NULL);

Removed.
 

* Executor README changes, particularly about projecting twice, really
should be ExecMerge() comments. Maybe just get rid of these?

Fixed, with addition to TABLEOID column that we now fetch along with CTID column.
 

* Why are we using CMD_NOTHING at all? That constant has something to
do with user-visible rules, and there is no need to reuse it (make a
new CMD_* if you have to). More importantly, why do we even have the
corresponding DO NOTHING stuff in the grammar? Why would users want
that?

For quite a while, I thought that patch must have been support for ON
CONFLICT DO NOTHING within MERGE INSERTs (the docs don't even say what
DO NOTHING is). But that's not what it is at all. It seems like this
is a way of having an action that terminates early, so you don't have
to go on to evaluate other action quals.

Hmm. I was under the impression that the SQL Standards support DO NOTHING. But now that I read the standards, I can't find any mention of DO NOTHING. It might still be useful for users to skip all (matched/not-matched/both) actions for some specific conditions. For example,

WHEN MATCHED AND balance > min_balance THEN
   DO NOTHING
WHEN MATCHED AND custcat = 'priority' THEN
   DO NOTHING
....

But I agree that if we remove DO NOTHING, we can simplify the code. Or we can probably re-arrange the code to check DO NOTHING early and exit the loop. That simplifies it a lot, as in attached. In passing, I also realised that the target tuple may also be needed for DO NOTHING actions since they may have quals attached to them. Also, the tuple should really be fetched just once, not once per action. Those changes are done.
 

* The docs say that we use a LEFT OUTER JOIN for MERGE, while the
implementation uses a RIGHT OUTER JOIN because it's convenient for
parse analysis finding the target. This difference is a bit confusing.
Why say that we use any kind of join at all, though?

Hmm, right. In fact, we revert to INNER JOIN when there are no NOT MATCHED actions, so probably we should just not mention any kind of joins, definitely not in the user documentation.
 

The SQL standard doesn't say outer join at all. Neither do the MERGE
docs of the other database systems that I checked. Taking a position
on this seems to add nothing at all. Let's make the MERGE docs refer
to it as a join, without further elaboration.

* I don't find this comment from analyze.c very helpful:

> +        * We don't have a separate plan for each action, so the when
> +        * condition must be executed as a per-row check, making it very
> +        * similar to a CHECK constraint and so we adopt the same semantics
> +        * for that.

Why explain it that way at all? There are two rels, unlike a check constraint.

Agreed. I think it was left-over from the time when sub-selects were not allowed and we were using EXPR_KIND_CHECK_CONSTRAINT to run those checks. Now we have a separate expression kind and we do support sub-selects. 
 

* The first time I read this comment, it made me laugh:

> +       /*
> +        * First rule of MERGE club is we don't talk about rules
> +        */

The joke has become significantly less funny since then, though. I'd
just say that MERGE doesn't support rules, as it's unclear how that
could work.

Changed that way.
 

* This comment seems redundant, since pstate is always allocated with palloc0():

+ * Note: we assume that the pstate's p_rtable, p_joinlist, and p_namespace
+ * lists were initialized to NIL when the pstate was created.

make_parsestate() knows about this. This is widespread, normal
practice during parse analysis.

Ok. Removed.
 

* Is this actually needed at all?:

> +       /* In MERGE when and condition, no system column is allowed */
> +       if (pstate->p_expr_kind == EXPR_KIND_MERGE_WHEN_AND &&
> +           attnum < InvalidAttrNumber &&
> +           !(attnum == TableOidAttributeNumber || attnum == ObjectIdAttributeNumber))
> +           ereport(ERROR,
> +                   (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
> +                    errmsg("system column \"%s\" reference in WHEN AND condition is invalid",
> +                           colname),
> +                    parser_errposition(pstate, location)));

We're talking about the scantuple here. It's not like excluded.*.

We might be able to support them, but do we really care?
 

* Are these checks really needed?:

> +void
> +rewriteTargetListMerge(Query *parsetree, Relation target_relation)
> +{
> +   Var        *var = NULL;
> +   const char *attrname;
> +   TargetEntry *tle;
> +
> +   if (target_relation->rd_rel->relkind == RELKIND_RELATION ||
> +       target_relation->rd_rel->relkind == RELKIND_MATVIEW ||
> +       target_relation->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)


You're right. Those checks are done at the beginning and we should probably just turn this into an Assert, just like ExecMerge(). Changed that way.
 
* I think that you should remove this debug include:

> index 3a02307bd9..ed4e857477 100644
> --- a/src/backend/parser/parse_clause.c
> +++ b/src/backend/parser/parse_clause.c
> @@ -31,6 +31,7 @@
>  #include "commands/defrem.h"
>  #include "nodes/makefuncs.h"
>  #include "nodes/nodeFuncs.h"
> +#include "nodes/print.h"
>  #include "optimizer/tlist.h"
>  #include "optimizer/var.h"
>  #include "parser/analyze.h"

Done.
 

It may be worth considering custom dprintf code within your .gdbinit
to do stuff like this automatically, without code changes that may end
up in the patch you post to the list. Here is one that I sometimes use
for tuplesort.c:

define print_tuplesort_counts
  dprintf tuplesort_sort_memtuples, "memtupsize: %d, memtupcount:
%d\n", state->memtupsize, state->memtupcount
end

This is way easier than adding custom printf style debug code, since
it doesn't require that you rebuild. It would be safe to add a similar
dprintf that called pprint() or similar when some function is reached.

Ok. Thanks for the hint.
 

* find_mergetarget_parents() and find_mergetarget_for_rel() could both
use at least a one-liner header comment.

Done. It was indeed confusing, so comments should help.
 

* This analyze.c comment, which is in transformMergeStmt(), seems
pretty questionable:

> +   /*
> +    * Simplify the MERGE query as much as possible
> +    *
> +    * These seem like things that could go into Optimizer, but they are
> +    * semantic simplications rather than optimizations, per se.
> +    *
> +    * If there are no INSERT actions we won't be using the non-matching
> +    * candidate rows for anything, so no need for an outer join. We do still
> +    * need an inner join for UPDATE and DELETE actions.

This is talking about an ad-hoc form of join strength reduction. Yeah,
it's a semantic simplification, but that's generally true of join
strength reduction, which mostly exists because of bad ORMs that
sprinkle OUTER on top of queries indifferently. MERGE is hardly an
ideal candidate for join strength reduction, and I would just lose
this code entirely for Postgres v11.

I am not sure if there could be additional query optimisation possibilities when a RIGHT OUTER JOIN is changed to INNER JOIN. Apart from existing optimisations, I am also thinking about some of the work that David and Amit are doing for partition pruning and I wonder if the choice of join might have a non-trivial effect. Having said that, I am yet to explore those things. But when we definitely know that a different kind of JOIN is all we need (because there are no NOT MATCHED actions), why not use that? Or do you see a problem there?
 

* This sounds really brittle, and so doesn't make sense even as an aspiration:

> +    * XXX if we were really keen we could look through the actionList and
> +    * pull out common conditions, if there were no terminal clauses and put
> +    * them into the main query as an early row filter but that seems like an
> +    * atypical case and so checking for it would be likely to just be wasted
> +    * effort.
> +    */

We might actually want to do this, if not for v11 then later. I am not sure if we should keep it in the final commit though.
 .

* You should say *when* this happens (what later point):

> +    *
> +    * Track the RTE index of the target table used in the join query. This is
> +    * later used to add required junk attributes to the targetlist.
> +    */

Done.
 

* This seems unnecessary, as we don't say anything like it for ON
CONFLICT DO UPDATE:

> +    * As top-level statements INSERT, UPDATE and DELETE have a Query, whereas
> +    * with MERGE the individual actions do not require separate planning,
> +    * only different handling in the executor. See nodeModifyTable handling
> +    * of commandType CMD_MERGE.


I don't see anything wrong with keeping it. May be it needs rewording?
 
* What does this mean?:

> +    * A sub-query can include the Target, but otherwise the sub-query cannot
> +    * reference the outermost Target table at all.

Don't know :-) I think what Simon probably meant to say that you can't reference the Target table in the sub-query used in the source relation. But you can add the Target as a separate RTE
 

* I don't see the point in this:

> +        * XXX Perhaps we require Parallel Safety since that is a superset of
> +        * the restriction and enforcing that makes it easier to consider
> +        * running MERGE plans in parallel in future.


Yeah, removed.
 
* This references the now-removed executor WAL check thing, so you'll
need to remove it too:

> +        * SQL Standard says we should not allow anything that possibly
> +        * modifies SQL-data. We enforce that with an executor check that we
> +        * have not written any WAL.


Yes, removed too.
 
* This should be reworded:

> +        * Note that we don't add this to the MERGE Query's quals because
> +        * that's not the logic MERGE uses.
> +        */
> +       action->qual = transformWhereClause(pstate, action->condition,
> +                                           EXPR_KIND_MERGE_WHEN_AND, "WHEN");

Perhaps say "WHEN ... AND quals are evaluated separately from the
MERGE statement's join quals" instead. Or just lose it altogether.


Reworded slightly.
 
* This comment seems inaccurate:

> +                       /*
> +                        * Process INSERT ... VALUES with a single VALUES
> +                        * sublist.  We treat this case separately for
> +                        * efficiency.  The sublist is just computed directly
> +                        * as the Query's targetlist, with no VALUES RTE.  So
> +                        * it works just like a SELECT without any FROM.
> +                        */

Wouldn't it be more accurate to say that this it totally different to
what transformInsertStmt() does for a SELECT without any FROM? Also,
do you think that that's good?

Actually it's pretty much same as what transformInsertStmt(). Even the comments are verbatim copied from there. I don't see anything wrong with the comment or the code. Can you please explain what problem you see?
 

>> * Tests for transition table behavior (mixed INSERTs, UPDATEs, and
>> DELETEs) within triggers.sql seems like a good idea.
>
> Ok, I will add. But not done in this version.

Note that it's implied in at least one place that we don't support
transition tables at all:

> +               /*
> +                * XXX if we support transition tables this would need to move
> +                * earlier before ExecSetupTransitionCaptureState()
> +                */
> +               switch (action->commandType)
> +               {

You'll want to get to this as part of that transition table effort.

I actually didn't even know what transition tables are until today. But today I studied them and the new version now supports transition tables with MERGE. We might consider it to WIP given the amount of time I've spent coding that, though I am fairly happy with the result so far. The comment above turned out to be bogus.

I decided to create new tuplestores and mark them as new_update, old_update, new_insert and old_delete. This is necessary because MERGE can run all three kinds of commands i.e. UPDATE, DELETE and INSERT, and we would like to track their transition tables separately.

(Hmm.. I just noticed though INSERT ON CONFLICT does not do this and still able to track transition tables for INSERT and UPDATE correctly. So may be what I did wasn't necessary after all, though it's also likely that we can get IOC to use this new mechanism)
 

Any plan to fix this/support identity columns? I see that you don't
support them here:

> +                   /*
> +                    * Handle INSERT much like in transformInsertStmt
> +                    *
> +                    * XXX currently ignore stmt->override, if present
> +                    */


I have already added support for OVERRIDING. The comment needed adjustments which I have done now.
 
I think that this is a blocker, unfortunately.

You mean OVERRIDING or ruleutils?
 

>> * I still think we probably need to add something to ruleutils.c, so
>> that MERGE Query structs can be deparsed -- see get_query_def().
>
> Ok. I will look at it. Not done in this version though.

I also wonder what it would take to support referencing CTEs. Other
implementations do have this. Why can't we?

Hmm. I will look at them. But TBH I would like to postpone them to v12 if they turn out to be tricky. We have already covered a very large ground in the last few weeks, but we're approaching the feature freeze date.
 

Phew! Thanks for your patience and perseverance. I do have more
feedback on the docs lined up, but that isn't so important right now.

Thanks! Those were really useful review comments. In passing, I made some updates to the doc, but I really should make a complete pass over the patch.

Thanks,
Pavan

-- 
Pavan Deolasee                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment

pgsql-hackers by date:

Previous
From: Sergei Kornilov
Date:
Subject: Re: using index or check in ALTER TABLE SET NOT NULL
Next
From: Alvaro Herrera
Date:
Subject: Re: [HACKERS] MERGE SQL Statement for PG11