Re: SERIALIZABLE and INSERTs with multiple VALUES - Mailing list pgsql-general

From Thomas Munro
Subject Re: SERIALIZABLE and INSERTs with multiple VALUES
Date
Msg-id CAEepm=285cYe-6NE1K0iT4TJQPCDvCxauo0ROO9shrMcRx6o2w@mail.gmail.com
Whole thread Raw
In response to Re: SERIALIZABLE and INSERTs with multiple VALUES  (Kevin Grittner <kgrittn@gmail.com>)
Responses Re: SERIALIZABLE and INSERTs with multiple VALUES  (Peter Geoghegan <pg@bowt.ie>)
Re: SERIALIZABLE and INSERTs with multiple VALUES  (Kevin Grittner <kgrittn@gmail.com>)
List pgsql-general
On Thu, Oct 13, 2016 at 10:06 AM, Kevin Grittner <kgrittn@gmail.com> wrote:
> On Wed, Oct 12, 2016 at 3:02 PM, Peter Geoghegan <pg@bowt.ie> wrote:
>
>> I agree that the multi-value case is a bug.
>
>> I think that it should be pretty obvious to you why the check exists
>> at all, Kevin. It exists because it would be improper to decide to
>> take the DO NOTHING path on the basis of some other committed tuple
>> existing that is not visible to the original MVCC snapshot, at higher
>> isolation levels.
>
> That's only true if it causes a cycle in the apparent order of
> execution.  If we rip out the check what we have is behavior
> completely consistent with the other transaction executing first;
> in other words, it creates a read-write dependency (a/k/a
> rw-conflict) from the current transaction to the concurrent
> transaction which succeeds in its insert.  That may or may not
> cause a cycle, depending on what else is happening.

The "higher isolation levels" probably shouldn't be treated the same way.

I think Peter's right about REPEATABLE READ.  We should definitely
raise the error immediately as we do in that level, because our RR
(SI) doesn't care about write skew and all that stuff, it just
promises that you can only see data in your snapshot.  We can't allow
you to take a different course of action based on data that your
snapshot can't see, so the only reasonable thing to do is abandon
ship.

But yeah, the existing code raises false positive serialization
failures under SERIALIZABLE, and that's visible in the isolation test
I posted: there is actually a serial order of those transactions with
the same result.

When working on commit fcff8a57 I became suspicious of the way ON
CONFLICT interacts with SSI, as I mentioned in passing back then[1],
thinking mainly of false negatives.  I failed to find a
non-serializable schedule involving ON CONFLICT that was allowed to
run, though I didn't spend much time on it.   One thing that worries
me is the final permutation of read-write-unique-4.spec, which
produces an arguably spurious UCV, that is, a transaction that doesn't
commit but raises a UCV instead of the serialization failure you might
expect.  The ON CONFLICT equivalent might be a transaction that takes
the ON CONFLICT path and then commits, even though it should be
considered non-serializable.  I would really like to understand that
case better, and until then I wouldn't bet my boots that it isn't
possible to commit anomalies using ON CONFLICT under SERIALIZABLE
without Peter's check (or even with it), despite the fact that it
reaches predicate locking code via heap_fetch etc.

[1] https://www.postgresql.org/message-id/CAEepm%3D2kYCegxp9qMR5TM1X3oXHj16iYzLPj_go52R2R07EvnA%40mail.gmail.com

--
Thomas Munro
http://www.enterprisedb.com


pgsql-general by date:

Previous
From: "AnandKumar, Karthik"
Date:
Subject: Re: MultiXact member wraparound protections are disabled
Next
From: PHANIKUMAR G
Date:
Subject: Re: postgres failed to start from services manager on windows 2008 r2