Re: [HACKERS] MERGE SQL Statement for PG11 - Mailing list pgsql-hackers
From | Simon Riggs |
---|---|
Subject | Re: [HACKERS] MERGE SQL Statement for PG11 |
Date | |
Msg-id | CANP8+jKTuZ5NDCLkqXBm9kMeaT-jNxxhkMg+FOpLni=pwOQ=BA@mail.gmail.com Whole thread Raw |
In response to | Re: [HACKERS] MERGE SQL Statement for PG11 (Peter Geoghegan <pg@bowt.ie>) |
Responses |
Re: [HACKERS] MERGE SQL Statement for PG11
|
List | pgsql-hackers |
On 24 January 2018 at 01:35, Peter Geoghegan <pg@bowt.ie> wrote: > On Tue, Jan 23, 2018 at 5:51 AM, Simon Riggs <simon@2ndquadrant.com> wrote: >>> Not yet working >>> * Partitioning >>> * RLS >>> >>> Based on this successful progress I imagine I'll be looking to commit >>> this by the end of the CF, allowing us 2 further months to bugfix. >> >> This is complete and pretty clean now. 1200 lines of code, plus docs and tests. > > That timeline seems aggressive to me. Thank you for the review. > Also, the patch appears to have > bitrot. Please rebase, and post a new version. Will do, though I'm sure that's only minor since we rebased only a few days ago. > Some feedback based on a short read-through of your v11: > > * What's the postgres_fdw status? MERGE currently works with normal relations and materialized views only. > * Relatedly, when/where does applying the junkfilter happen, if not here?: A few lines later in the same file. Junkfilters are still used. > * Isn't "consider INSERT ... ON CONFLICT DO UPDATE" obsolete in these > doc changes?: > >> -F312 MERGE statement NO consider INSERT ... ON CONFLICT DO UPDATE >> -F313 Enhanced MERGE statement NO >> -F314 MERGE statement with DELETE branch NO >> +F312 MERGE statement YES consider INSERT ... ON CONFLICT DO UPDATE >> +F313 Enhanced MERGE statement YES >> +F314 MERGE statement with DELETE branch YES I think that it is still useful to highlight the existence of a non-standard feature which people might not be otherwise unaware. If you wish me to remove a reference to your work, I will bow to your wish. > * What's the deal with transition tables? Your docs say this: > >> + <varlistentry> >> + <term><replaceable class="parameter">source_table_name</replaceable></term> >> + <listitem> >> + <para> >> + The name (optionally schema-qualified) of the source table, view or >> + transition table. >> + </para> >> + </listitem> >> + </varlistentry> > > But the code says this: > >> + /* >> + * XXX if we support transition tables this would need to move earlier >> + * before ExecSetupTransitionCaptureState() >> + */ >> + switch (action->commandType) Changes made by MERGE can theoretically be visible in a transition table. When I tried to implement that there were problems caused by the way INSERT ON CONFLICT has been implemented, so it will take a fair amount of study and lifting to make that work. For now, they are not supported and generate an error message. That behaviour was not documented, but I've done that now. SQL Std says "Transition tables are not generally updatable (and therefore not simply updatable) and their columns are not updatable." So MERGE cannot be a target of a transition table, but it can be the source of one. So the docs you cite are correct, as is the comment. > * Can UPDATEs themselves accept an UPDATE-style WHERE clause, in > addition to the WHEN quals, and the main ON join quals? > > I don't see any examples of this in your regression tests. No, they can't. Oracle allows it but it isn't SQL Standard. > * You added a new MERGE command tag. Shouldn't there be changes to > libpq's fe-exec.c, to go along with that? Nice catch. One liner patch and docs updated in repo for next patch. > * I noticed this: > >> + else if (node->operation == CMD_MERGE) >> + { >> + /* >> + * XXX Add more detailed instrumentation for MERGE changes >> + * when running EXPLAIN ANALYZE? >> + */ >> + } > > I think that doing better here is necessary. Please define better. It may be possible. > * I'm not a fan of stored rules, but I still think that this needs to > be discussed: > >> - product_queries = fireRules(parsetree, >> + /* >> + * First rule of MERGE club is we don't talk about rules >> + */ >> + if (event == CMD_MERGE) >> + product_queries = NIL; >> + else >> + product_queries = fireRules(parsetree, >> result_relation, >> event, >> locks, It has been discussed, in my recent proposal, mentioned in the doc page for clarity. It has also been discussed previously in discussions around MERGE. It's not clear how they would work, but we already have an example of a statement type that doesn't support rules. > * This seems totally unnecessary at first blush: > >> + /* >> + * Lock tuple for update. >> + * >> + * XXX Is this really needed? I put this in >> + * just to get hold of the existing tuple. >> + * But if we do need, then we probably >> + * should be looking at the return value of >> + * heap_lock_tuple() and take appropriate >> + * action. >> + */ >> + tuple.t_self = *tupleid; >> + test = heap_lock_tuple(relation, &tuple, estate->es_output_cid, >> + lockmode, LockWaitBlock, false, &buffer, >> + &hufd); > > * I don't know what the deal is with EPQ and MERGE in general, because > just after the above heap_lock_tuple(), you do this: > >> + /* >> + * Test condition, if any >> + * >> + * In the absence of a condition we perform the action >> + * unconditionally (no need to check separately since >> + * ExecQual() will return true if there are no >> + * conditions to evaluate). >> + */ >> + if (!ExecQual(action->whenqual, econtext)) >> + { >> + if (BufferIsValid(buffer)) >> + ReleaseBuffer(buffer); >> + continue; >> + } > > Maybe *this* is why you call heap_lock_tuple(), actually, but that's > not clear, and in any case I don't see any point in it if you don't > check heap_lock_tuple()'s return value, and do some other extra thing > on the basis of that return value. No existing heap_lock_tuple() > ignores its return value, and ignoring the return value simply can't > make sense. Agreed, I don't think it is necessary. > Don't WHEN quals need to participate in EPQ reevaluation, in order to > preserve the behavior that we see with UPDATE and DELETE? Why, or why > not? WHEN qual evaluation occurs to establish which action to take. The Standard is clear that this happens prior to the action. > I suppose that the way you evaluate WHEN quals separately makes a > certain amount of sense, but I think that you need to get things > straight with WHEN quals and READ COMMITTED conflict handling at a > high level (the semantics), which may or may not mean that WHEN quals > participate in EPQ evaluation. If you're going to introduce a special > case, I think you need to note it under the EvalPlanQual section of > the executor README. I wasn't going to introduce a special case. > This much I'm sure of: you should reevaluate WHEN quals if the UPDATE > chain is walked in READ COMMITTED mode, one way or another. It might > end up happening in your new heap_lock_tuple() retry loop, or you > might do something differently with EPQ, but something should happen > (I haven't got an opinion on the implementation details just yet, > though). An interesting point, thank you for raising it. WHEN quals, if they exist, may have dependencies on either the source or target. If there is a dependency on the target there might be an issue. If the WHEN has a condition AND the WHEN qual fails a re-check after we do EPQ, then I think we should just throw an error. If we re-evaluate everything, I'm sure we'll get into some weird cases that make MATCHED/NOT MATCHED change and that is a certain error case for MERGE. We might do better than that after some thought, but that seems like a rabbit hole we should avoid in the first release. As we agreed earlier, we can later extend MERGE to produce less errors in certain concurrency cases. > * Your isolation test should be commented. I'd expect you to talk > about what is different about MERGE as far as concurrency goes, if > anything. I note that you don't use additional WHEN quals in your > isolation test at all (just simple WHEN NOT MATCHED + WHEN MATCHED), > which was the first thing I looked for there. Look at > insert-conflict-do-update-3.spec for an example of roughly the kind of > commentary I had hoped to see in your regression test. I will be happy to add one that exercises some new code resulting from the above. >> I'm expecting to commit this and then come back for the Partitioning & >> RLS later, but will wait a few days for comments and other reviews. > > I don't think that it's okay to defer RLS or partitioning support till > after an initial commit. While it's probably true that MERGE can just > follow ON CONFLICT's example when it comes to column-level privileges, > this won't be true with RLS. I think it is OK to do things in major pieces, as has been done with many other commands. We have more time in this release to do that, though we want to find and fix any issues in basic functionality like concurrency ahead of trying to add fancy stuff and hitting problems with it. I've already made two changes your review has raised, thanks. Will re-post soon. Thanks. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
pgsql-hackers by date: