Home > mailing lists
Re: [HACKERS] MERGE SQL Statement for PG11 - Mailing list pgsql-hackers

From	Simon Riggs
Subject	Re: [HACKERS] MERGE SQL Statement for PG11
Date	January 24, 2018 10:12:16
Msg-id	CANP8+jKTuZ5NDCLkqXBm9kMeaT-jNxxhkMg+FOpLni=pwOQ=BA@mail.gmail.com Whole thread Raw
In response to	Re: [HACKERS] MERGE SQL Statement for PG11 (Peter Geoghegan <pg@bowt.ie>)
Responses	Re: [HACKERS] MERGE SQL Statement for PG11
List	pgsql-hackers
Tree view
On 24 January 2018 at 01:35, Peter Geoghegan <pg@bowt.ie> wrote:
> On Tue, Jan 23, 2018 at 5:51 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>>> Not yet working
>>> * Partitioning
>>> * RLS
>>>
>>> Based on this successful progress I imagine I'll be looking to commit
>>> this by the end of the CF, allowing us 2 further months to bugfix.
>>
>> This is complete and pretty clean now. 1200 lines of code, plus docs and tests.
>
> That timeline seems aggressive to me.

Thank you for the review.

> Also, the patch appears to have
> bitrot. Please rebase, and post a new version.

Will do, though I'm sure that's only minor since we rebased only a few days ago.

> Some feedback based on a short read-through of your v11:
>
> * What's the postgres_fdw status?

MERGE currently works with normal relations and materialized views only.

> * Relatedly, when/where does applying the junkfilter happen, if not here?:

A few lines later in the same file. Junkfilters are still used.

> * Isn't "consider INSERT ... ON CONFLICT DO UPDATE" obsolete in these
> doc changes?:
>
>> -F312   MERGE statement         NO  consider INSERT ... ON CONFLICT DO UPDATE
>> -F313   Enhanced MERGE statement            NO
>> -F314   MERGE statement with DELETE branch          NO
>> +F312   MERGE statement         YES consider INSERT ... ON CONFLICT DO UPDATE
>> +F313   Enhanced MERGE statement            YES
>> +F314   MERGE statement with DELETE branch          YES

I think that it is still useful to highlight the existence of a
non-standard feature which people might not be otherwise unaware.

If you wish me to remove a reference to your work, I will bow to your wish.

> * What's the deal with transition tables? Your docs say this:
>
>> +   <varlistentry>
>> +    <term><replaceable class="parameter">source_table_name</replaceable></term>
>> +    <listitem>
>> +     <para>
>> +      The name (optionally schema-qualified) of the source table, view or
>> +      transition table.
>> +     </para>
>> +    </listitem>
>> +   </varlistentry>
>
> But the code says this:
>
>> +           /*
>> +            * XXX if we support transition tables this would need to move earlier
>> +            * before ExecSetupTransitionCaptureState()
>> +            */
>> +           switch (action->commandType)

Changes made by MERGE can theoretically be visible in a transition
table. When I tried to implement that there were problems caused by
the way INSERT ON CONFLICT has been implemented, so it will take a
fair amount of study and lifting to make that work. For now, they are
not supported and generate an error message. That behaviour was not
documented, but I've done that now.

SQL Std says "Transition tables are not generally updatable (and
therefore not simply updatable) and their columns are not updatable."

So MERGE cannot be a target of a transition table, but it can be the
source of one. So the docs you cite are correct, as is the comment.

> * Can UPDATEs themselves accept an UPDATE-style WHERE clause, in
> addition to the WHEN quals, and the main ON join quals?
>
> I don't see any examples of this in your regression tests.

No, they can't. Oracle allows it but it isn't SQL Standard.

> * You added a new MERGE command tag. Shouldn't there be changes to
> libpq's fe-exec.c, to go along with that?

Nice catch. One liner patch and docs updated in repo for next patch.

> * I noticed this:
>
>> + else if (node->operation == CMD_MERGE)
>> + {
>> +     /*
>> +     * XXX Add more detailed instrumentation for MERGE changes
>> +     * when running EXPLAIN ANALYZE?
>> +     */
>> + }
>
> I think that doing better here is necessary.

Please define better. It may be possible.

> * I'm not a fan of stored rules, but I still think that this needs to
> be discussed:
>
>> -       product_queries = fireRules(parsetree,
>> +       /*
>> +        * First rule of MERGE club is we don't talk about rules
>> +        */
>> +       if (event == CMD_MERGE)
>> +           product_queries = NIL;
>> +       else
>> +           product_queries = fireRules(parsetree,
>>                                     result_relation,
>>                                     event,
>>                                     locks,

It has been discussed, in my recent proposal, mentioned in the doc
page for clarity.

It has also been discussed previously in discussions around MERGE.
It's not clear how they would work, but we already have an example of
a statement type that doesn't support rules.

> * This seems totally unnecessary at first blush:
>
>> +                           /*
>> +                            * Lock tuple for update.
>> +                            *
>> +                            * XXX Is this really needed? I put this in
>> +                            * just to get hold of the existing tuple.
>> +                            * But if we do need, then we probably
>> +                            * should be looking at the return value of
>> +                            * heap_lock_tuple() and take appropriate
>> +                            * action.
>> +                            */
>> +                           tuple.t_self = *tupleid;
>> +                           test = heap_lock_tuple(relation, &tuple, estate->es_output_cid,
>> +                                   lockmode, LockWaitBlock, false, &buffer,
>> +                                   &hufd);
>
> * I don't know what the deal is with EPQ and MERGE in general, because
> just after the above heap_lock_tuple(), you do this:
>
>> +                       /*
>> +                        * Test condition, if any
>> +                        *
>> +                        * In the absence of a condition we perform the action
>> +                        * unconditionally (no need to check separately since
>> +                        * ExecQual() will return true if there are no
>> +                        * conditions to evaluate).
>> +                        */
>> +                       if (!ExecQual(action->whenqual, econtext))
>> +                       {
>> +                           if (BufferIsValid(buffer))
>> +                               ReleaseBuffer(buffer);
>> +                           continue;
>> +                       }
>
> Maybe *this* is why you call heap_lock_tuple(), actually, but that's
> not clear, and in any case I don't see any point in it if you don't
> check heap_lock_tuple()'s return value, and do some other extra thing
> on the basis of that return value. No existing heap_lock_tuple()
> ignores its return value, and ignoring the return value simply can't
> make sense.

Agreed, I don't think it is necessary.

> Don't WHEN quals need to participate in EPQ reevaluation, in order to
> preserve the behavior that we see with UPDATE and DELETE? Why, or why
> not?

WHEN qual evaluation occurs to establish which action to take. The
Standard is clear that this happens prior to the action.

> I suppose that the way you evaluate WHEN quals separately makes a
> certain amount of sense, but I think that you need to get things
> straight with WHEN quals and READ COMMITTED conflict handling at a
> high level (the semantics), which may or may not mean that WHEN quals
> participate in EPQ evaluation. If you're going to introduce a special
> case, I think you need to note it under the EvalPlanQual section of
> the executor README.

I wasn't going to introduce a special case.

> This much I'm sure of: you should reevaluate WHEN quals if the UPDATE
> chain is walked in READ COMMITTED mode, one way or another. It might
> end up happening in your new heap_lock_tuple() retry loop, or you
> might do something differently with EPQ, but something should happen
> (I haven't got an opinion on the implementation details just yet,
> though).

An interesting point, thank you for raising it.

WHEN quals, if they exist, may have dependencies on either the source
or target. If there is a dependency on the target there might be an
issue.

If the WHEN has a condition AND the WHEN qual fails a re-check after
we do EPQ, then I think we should just throw an error. If we
re-evaluate everything, I'm sure we'll get into some weird cases that
make MATCHED/NOT MATCHED change and that is a certain error case for
MERGE.

We might do better than that after some thought, but that seems like a
rabbit hole we should avoid in the first release. As we agreed
earlier, we can later extend MERGE to produce less errors in certain
concurrency cases.

> * Your isolation test should be commented. I'd expect you to talk
> about what is different about MERGE as far as concurrency goes, if
> anything. I note that you don't use additional WHEN quals in your
> isolation test at all (just simple WHEN NOT MATCHED + WHEN MATCHED),
> which was the first thing I looked for there. Look at
> insert-conflict-do-update-3.spec for an example of roughly the kind of
> commentary I had hoped to see in your regression test.

I will be happy to add one that exercises some new code resulting from
the above.

>> I'm expecting to commit this and then come back for the Partitioning &
>> RLS later, but will wait a few days for comments and other reviews.
>
> I don't think that it's okay to defer RLS or partitioning support till
> after an initial commit. While it's probably true that MERGE can just
> follow ON CONFLICT's example when it comes to column-level privileges,
> this won't be true with RLS.

I think it is OK to do things in major pieces, as has been done with
many other commands. We have more time in this release to do that,
though we want to find and fix any issues in basic functionality like
concurrency ahead of trying to add fancy stuff and hitting problems
with it.

I've already made two changes your review has raised, thanks. Will
re-post soon. Thanks.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
pgsql-hackers by date:
From: Tom Lane
Date: 24 January 2018, 10:08:38
Subject: Re: [PATCH] fix for C4141 warning on MSVC
From: Tom Lane
Date: 24 January 2018, 10:16:23
Subject: Re: documentation is now XML
Re: [HACKERS] MERGE SQL Statement for PG11 - Mailing list pgsql-hackers

Previous

Next