Re: [HACKERS] MERGE SQL Statement for PG11 - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: [HACKERS] MERGE SQL Statement for PG11
Date
Msg-id CAH2-WzmhvUA7Mcv1JuGM3rjmkLQmVC-_VtS6ZiZi2CpHb4F9uQ@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] MERGE SQL Statement for PG11  (Pavan Deolasee <pavan.deolasee@gmail.com>)
Responses Re: [HACKERS] MERGE SQL Statement for PG11
List pgsql-hackers
On Tue, Feb 27, 2018 at 10:19 AM, Pavan Deolasee
<pavan.deolasee@gmail.com> wrote:
>> I attach a rough example of this, that uses plpgsql.
>
> Thanks for writing the sample code. I understand you probably don't mean to
> suggest that we need to mimic the behaviour of the plpgsql code and the
> semantics offered by MERGE would most likely be different than what the
> plpgsql sample does. Because there are several problems with the plpgsql
> code:
>
> - It would never turn a MATCHED case into a NOT MATCHED case because of
> concurrent UPDATE/DELETE
> - The WHERE clauses attached to the UPDATE/DELETE statement should be using
> the quals attached to the WHEN clauses to ensure they are evaluated on the
> new version of the row, if needed.

It's definitely possible to poke holes in my plpgsql example, most (or
all) of which are not fixable within the confines of what plpgsql can
do. I still think that it's really useful to frame the discussion with
examples of the kind of procedural code MERGE replaces, though,
because:

* That's the explicit purpose of MERGE, according to the SQL standard.
Everyone is very clear that the join outputs rows that are separately
inserted, updated, or deleted (additional "WHEN ... AND" quals are
evaluated separately). We should be clear on that, too. We're quite
specifically replacing procedural code that follows a general pattern.
We might even give an example of such procedural code in the docs, as
SQL Server does.

* It shows that in some ways, the INSERT/UPDATE/DELETE parts are
separate "zero or one row" statements. They could do their own
snapshot acquisitions within RC mode, for example. Also, you don't get
the ON CONFLICT behavior with excluded being affected by BEFORE ROW
triggers within UPDATE expression evaluation for ON CONFLICT DO
UPDATE.

* My example is buggy, but seemingly only in a way that is just about
unavoidable -- all the bugs are in RC mode with concurrent writes.
Poking holes in what I came up with is actually useful, and may be
less confusing than discussing the same issues some other way.

There are very few users in the world that would understand these
issues. Moreover, if all affected users that have this kind of code in
the wild were to somehow magically develop a strong understanding of
this stuff overnight, even then I'm not sure that much would change.
They still use RC mode for all the usual reasons, and they mostly
won't have any way of fixing concurrency issues that is actually
acceptable to them. In short, I'm not sure that I can fix the problems
with my plpgsql code, so what chance do they have of fixing theirs?

>> >> Some novel new behavior -- "EPQ with a twist"-- is clearly necessary.
>> >> I feel a bit uneasy about it because anything that anybody suggests is
>> >> likely to be at least a bit arbitrary (EPQ itself is kind of
>> >> arbitrary). We only get to make a decision on how "EPQ with a twist"
>> >> will work once, and that should be a decision that is made following
>> >> careful deliberation. Ambiguity is much more likely to kill a patch
>> >> than a specific technical defect, at least in my experience. Somebody
>> >> can usually just fix a technical defect.
>
>
> TBH that's one reason why I like Simon's proposed behaviour of throwing
> errors in case of corner cases. I am not suggesting that's what we do at the
> end, but it's definitely worth considering.

I now feel like Simon's suggestion of throwing an error in corner
cases isn't so bad. It still seems like we could do better, but the
more I think about it, the less that seems like a cop-out. My reasons
are:

* As I already said, my plpgsql example, while buggy, might actually
be the safest way to get the intended behavior in RC mode today. I can
definitely imagine a way of dealing with concurrency that is both
safer and less prone to throwing weird errors, but the fact remains
that my example is the state of the art here, in a way.

* Simon has already introduced something that looks like "EPQ with a
twist" to me -- the steps that happen before he even raises this
error. IOW, he does something extra that is EPQ-like. He likely does a
lot better than my plpgsql example manages to, I think.

* I suspect that the kind of users that really like the ON CONFLICT DO
UPDATE's simplicity (in terms of what it guarantees them) are unlikely
to care about MERGE at all. The kind of user that cares about MERGE is
likely to have at least heard of the isolation levels.

* I do see an important difference between making likely-unexpected
errors in RC mode very unlikely, and making them *impossible*. This
patch is not ON CONFLICT DO UPDATE, though, and that strong guarantee
simply isn't on the table.

* We can all agree that *not* raising an error in the specific way
Simon proposes is possible, somehow or other. We also all agree that
avoiding the broader category of RC errors can only be taken so far
(e.g. in any event duplicate violations errors are entirely possible,
in RC mode, when a MERGE inserts a row). So this is a question of what
exact middle ground to take. Neither of the two extremes (throwing an
error on the first sign of a RC conflict, and magically preventing
concurrency anomalies) are actually on the table.

I will not block this patch because it merely makes throwing
likely-unexpected errors in RC mode "very unlikely", rather than "very
very unlikely". Not least because I have a hard time imagining a
single user caring about the difference that still exists with Simon's
less ambitious (though not entirely unambitious) version of "EPQ with
a twist".

> Here are some observations from Rahila's analysis so far. I must say,
> Oracle's handling seems quite inconsistent, especially the conditions under
> which it sometimes re-evaluates the join and sometimes don't.

> I am curiously surprised by it's behaviour of re-evaluating join only when
> certain columns are updated. It looks to me irrespective of what we choose,
> our implementation would be much superior to what Oracle offers.

I'm not that surprised that it's generally kind of arbitrary, though
the fact that you can update something while the join qual no longer
passes does seem particularly poor.

BTW, one thing that I remember very clearly from my research on MERGE
years ago is this: all of the major implementations were in some way
or other quite buggy. At least in RC mode, MERGE kind of promises
something that it really can't quite deliver. Implementations are
seemingly left to pick and choose how to paper over those cracks.

The description of what Oracle allows here does make me feel better
about our direction. Having the least-worst semantics on RC conflict
handling certainly seems good enough to me. Especially because we have
a "true UPSERT" already, unlike both Oracle and SQL Server.

> BTW I've sent v17a of the patch, which is very close to being complete from
> my perspective (except some documentation fixes/improvements). The only
> thing pending is the decision to accept or change the currently implemented
> concurrency semantics.

I need to go look at that. I'll try to take a firmer position on this.
I know that I've been saying that for a quite a while now, but my
failure to take a firmer position for so long is not because I didn't
try. It's because there is no really good answer.

-- 
Peter Geoghegan


pgsql-hackers by date:

Previous
From: Nikita Glukhov
Date:
Subject: [PATCH] Opclass parameters
Next
From: Tom Lane
Date:
Subject: Re: postgres_fdw: perform UPDATE/DELETE .. RETURNING on a join directly