Re: SERIALIZABLE and INSERTs with multiple VALUES - Mailing list pgsql-general

From Kevin Grittner
Subject Re: SERIALIZABLE and INSERTs with multiple VALUES
Date
Msg-id CACjxUsOeYLwmnU2sutYE46aSFFv+bdNgUG4YrUTmh0T-BXLJTg@mail.gmail.com
Whole thread Raw
In response to Re: SERIALIZABLE and INSERTs with multiple VALUES  (Thomas Munro <thomas.munro@enterprisedb.com>)
Responses Re: SERIALIZABLE and INSERTs with multiple VALUES
List pgsql-general
On Thu, Oct 13, 2016 at 5:26 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> On Fri, Oct 14, 2016 at 2:04 AM, Kevin Grittner <kgrittn@gmail.com> wrote:

>> Where do you see a problem if REPEATABLE READ handles INSERT/ON
>> CONFLICT without error?

> I think the ON CONFLICT
> equivalent might be something like the following (rather contrived)
> schedule, which happily commits if you comment out Peter's check:
>
> (1)  postgres=# create table bank_account (id int primary key, cash int);
> (1)  CREATE TABLE
> (1)  postgres=# begin transaction isolation level serializable ;
> (1)  BEGIN
>
> (2)      postgres=# begin transaction isolation level serializable ;
> (2)      BEGIN
>
> (1)  postgres=# select * from bank_account where id = 1;
> (1)  ┌────┬──────┐
> (1)  │ id │ cash │
> (1)  ├────┼──────┤
> (1)  └────┴──────┘
> (1)  (0 rows)
>
> (2)      postgres=# insert into bank_account values (1, 100);
> (2)      INSERT 0 1
>
> (1)  postgres=# insert into bank_account values (1, 200) on conflict do nothing;
> (1)  ...waits for tx2...
>
> (2)      postgres=# commit;
> (2)      COMMIT
>
> (1)  INSERT 0 0
> (1)  postgres=# commit;
> (1)  COMMIT
>
> If tx1 ran before tx2, then it would have succeeded in inserting (1,
> 200), and tx2 would have failed with unique_violation.  If tx2 ran
> before tx1, then tx1's SELECT command would have seen (1, 100) and
> possibly taken a different course of action.  So this schedule is
> non-serializable, right?

Right.  This is a case that needs something done if we take out the
rather overzealous check that is there now.  Thanks for finding an
example.  The trick now is to generalize to find the boundaries of
what is a problem and what isn't, so we can know what we are aiming
for as an "ideal" solution, and compare possible solutions for how
close they come.

> If you remove ON CONFLICT DO NOTHING, then tx1 gets a unique_violation
> after tx2 commits, which is similar to the last case in
> read-write-unique-4.spec.  To be able to produce a cycle that SSI can
> detect, perhaps an INSERT containing an implicit uniqueness check
> would need to be modelled as a read followed by a write.  I couldn't
> make that work, but I'm not sure if it's sensible anyway: wouldn't
> overlapping transactions consisting of just a single INSERT with the
> same key then produce a false positive, instead of unique_violation in
> one transaction?

If two transactions simultaneously attempted an INSERT of the same
key, one would block (as it would now) and if the other
successfully committed the blocked transaction would then get a
serialization failure error.  If the transactions did not overlap
you would get a duplicate key error.  That would arguably be nicer
behavior than we have now.  I think that if, within a serializable
transaction, we internally add a predicate lock for each page as we
descend to the point of insertion on a unique index, we might get
exactly that behavior.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


pgsql-general by date:

Previous
From: "t.dalpozzo@gmail.com"
Date:
Subject: journaled FS and and WAL
Next
From: Albe Laurenz
Date:
Subject: Re: journaled FS and and WAL