Re: [HACKERS] MERGE SQL Statement for PG11 - Mailing list pgsql-hackers

From Pavan Deolasee
Subject Re: [HACKERS] MERGE SQL Statement for PG11
Date
Msg-id CABOikdPhnYSyn18Rw_g37f5HXsVtn0HpjRCJnOzvOmydrDmF_A@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] MERGE SQL Statement for PG11  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: [HACKERS] MERGE SQL Statement for PG11  (Peter Geoghegan <pg@bowt.ie>)
Re: [HACKERS] MERGE SQL Statement for PG11  (Simon Riggs <simon@2ndquadrant.com>)
List pgsql-hackers
Hi Peter,

Thanks for the additional comments.

On Wed, Mar 21, 2018 at 2:17 AM, Peter Geoghegan <pg@bowt.ie> wrote:


* Why is terminating semi-colon on its own line in MERGE sql examples
from the docs?

I checked a few other places and didn't find another example which puts semi-colon on its own new line. So fixed per your suggestion.
 

* ON CONFLICT references in MERGE docs should be a "Tip" box, IMV.

Changed.
 

* Docs don't actually say what DO NOTHING does.

Added a para about that. In passing, I noticed that even though the grammar and the docs agree that DO NOTHING is only supported for WHEN NOT MATCHED, ExecMergeMatched() was unnecessarily checking for DO NOTHING. So fixed the code.
 

* The docs say "A condition cannot contain subqueries, set returning
functions, nor can it contain window or aggregate functions".  Thought
it can now?

Yes, it now supports sub-queries. What about set-returning, aggregates etc? I assume they are not supported in other places such as WHERE conditions and JOIN quals. So they will continue to remain blocked even in WHEN conditions. Do you think it's worth mentioning or we should not mention anything at all?
 

* The docs say "INSERT actions cannot contain sub-selects".  Didn't that change?

No, it did not. We only support VALUES clause with INSERT action.
 

* The docs on merge_insert grammar element say "Do not include the
table name", which seems like it was already covered.  It's obvious.
Suggest removing it.

Ok, fixed.
 

* The docs say "The number of rows inserted, updated and deleted can
be seen in the output of EXPLAIN ANALYZE". This seems unnecessary. We
don't see that for ON CONFLICT.

You mean the doc changes are unnecessary or the EXPLAIN ANALYZE output is unnecessary? I assume the doc changes, but let me know if that's wrong assumption.
 

* The docs say "it should be noted that no error is raised if that
occurs". But should it? I don't think that this addresses a perception
that users are likely to have.

Again, do you mean we should raise error or just that the docs should not mention anything about it? I don't think raising an error because the candidate row did not meet any specified action is a good idea. May be some day we add another option to store such rows in a separate temporary table.
 

*  "See MERGE" hyperlink within MERGE docs themselves seem odd.
Suggest changing this.

Removed.
 

* Trigger behavior doesn't belong in main MERGE doc page, IMV.  Just
like with ON CONFLICT, it should be documented where triggers are
already documented. The specifics of MERGE can be discussed there.


Ok. I added couple of paras to trigger.sgml, but left merge.sgml untouched. Suggestions for better wording are welcome.
 
* It might make sense to point out in the docs that join_condition
should not filter the target table too much. Like SQL server docs say,
don't put things in the join that filter the target that actually
belong in the WHEN .. AND quals. In a way, this should be obvious,
because it's an outer join. But I don't think it is, and ISTM that the
sensible thing to do is to warn against it.


Hmm, ok. Not sure how exactly to put that in words without confusing users. Do you want to suggest something?
 
* We never actually get around to saying that MERGE is good with bulk
loading, ETL, and so on. I think that we should remark on that in
passing.

Suggestion?
 

* I think that the mvcc.sgml changes can go. Perhaps a passing
reference to MERGE can be left behind, that makes it clear that it's
really rather like UPDATE FROM and so on. The fact that it's like
UPDATE FROM now seems crystal clear.


It seems useful to me. Should we move it to merge.sgml instead?
 
* insert.sgml need not mention MERGE IMV.


Hmm. I am not sure. It seems worth leaving a reference there since MERGE provides a new way to handle INSERTs.
 


* Most of the field comments here should be improved:

> /* ----------------
>  *   MergeActionState information
>  * ----------------
>  */
> typedef struct MergeActionState
> {
>     NodeTag     type;
>     bool        matched;        /* MATCHED or NOT MATCHED */
>     ExprState  *whenqual;       /* WHEN quals */
>     CmdType     commandType;    /* type of action */
>     TupleTableSlot *slot;       /* instead of ResultRelInfo */
>     ProjectionInfo *proj;       /* instead of ResultRelInfo */
> } MergeActionState;


Done.
 
* I wouldn't talk about a precedent like this:

> >  /*
> >   * Process BEFORE EACH STATEMENT triggers
> > + *
> > + * The precedent set by ON CONFLICT is that we fire INSERT then UPDATE.
> > + * MERGE follows the same logic, firing INSERT, then UPDATE, then DELETE.
> >   */

The comments should read as one voice. Ideally, it will look like ON
CONFLICT could have just as easily been added after MERGE, unless
there is a very compelling reason to mention a precedent. I mean, is
there really any reason to think that the precedent set by ON CONFLICT
actually made any difference? Is the suggestion here that there is a
better behavior, that we cannot go with for historical reasons?


I agree. I removed those comments.

 
* This ExecModifyTable() code seems a bit odd:

>
>         if (operation == CMD_MERGE)
>         {
>             ExecMerge(node, estate, slot, junkfilter, resultRelInfo);
>             continue;
>         }
>
>         tupleid = NULL;
>         oldtuple = NULL;
>         if (junkfilter != NULL)
>         {
>             /*
>              * extract the 'ctid' or 'wholerow' junk attribute.
>              */
>             if (operation == CMD_UPDATE ||
>                 operation == CMD_DELETE ||
>                 operation == CMD_MERGE)
>             {

Why is there a test for CMD_MERGE if control flow cannot even reach
there? What's going on here?

Just leftover bits from the previous code. Removed.
 

* No need to say "not planned" here, I think:

> +typedef struct MergeAction
> +{
> +   NodeTag     type;
> +   bool        matched;        /* MATCHED or NOT MATCHED */
> +   Node       *condition;      /* conditional expr (raw parser) */
> +   Node       *qual;           /* conditional expr (transformWhereClause) */
> +   CmdType     commandType;    /* type of action */
> +   Node       *stmt;           /* T_UpdateStmt etc - not planned */
> +   List       *targetList;     /* the target list (of ResTarget) */
> +} MergeAction;


Fixed and also improved other comments for the struct.

 
* Do tables with rules reject MERGE commands sensibly? We should have
a test for that.

That check was indeed missing. Added the check and the test.
 

* Logical decoding tests (test_decoding regression tests) seem like a
good idea. This is much less important than with ON CONFLICT, since
you don't have the significant complication of speculative insertions,
but it seems like something to add on general principle.

Good point, added a test.

Thanks,
Pavan

--
 Pavan Deolasee                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Attachment

pgsql-hackers by date:

Previous
From: Michael Banck
Date:
Subject: pg_basebackup: Missing newlines in some error messages
Next
From: Daniel Gustafsson
Date:
Subject: Re: pg_basebackup: Missing newlines in some error messages