Re: SERIALIZABLE and INSERTs with multiple VALUES - Mailing list pgsql-general

From Kevin Grittner
Subject Re: SERIALIZABLE and INSERTs with multiple VALUES
Date
Msg-id CACjxUsO+eN+L-wQed3w9U2r+BvLNZUoDe5VY0smMsGQa9=rA4Q@mail.gmail.com
Whole thread Raw
In response to Re: SERIALIZABLE and INSERTs with multiple VALUES  (Thomas Munro <thomas.munro@enterprisedb.com>)
Responses Re: SERIALIZABLE and INSERTs with multiple VALUES
List pgsql-general
On Thu, Oct 13, 2016 at 5:26 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:

> (1)  postgres=# create table bank_account (id int primary key, cash int);
> (1)  CREATE TABLE
> (1)  postgres=# begin transaction isolation level serializable ;
> (1)  BEGIN
>
> (2)      postgres=# begin transaction isolation level serializable ;
> (2)      BEGIN
>
> (1)  postgres=# select * from bank_account where id = 1;
> (1)  ┌────┬──────┐
> (1)  │ id │ cash │
> (1)  ├────┼──────┤
> (1)  └────┴──────┘
> (1)  (0 rows)
>
> (2)      postgres=# insert into bank_account values (1, 100);
> (2)      INSERT 0 1
>
> (1)  postgres=# insert into bank_account values (1, 200) on conflict do nothing;
> (1)  ...waits for tx2...
>
> (2)      postgres=# commit;
> (2)      COMMIT
>
> (1)  INSERT 0 0
> (1)  postgres=# commit;
> (1)  COMMIT

This is a really diabolical example.  (Thanks for that!)

Without use of the ON CONFLICT option, using standard statements in
SERIALIZABLE transactions there would be no serialization anomaly.
If both transactions first tested for the existence of the row with
a SELECT statement tx1 would get a serialization failure; in the
example as shown tx1 would get a duplicate key error (even though
we would like to get to the point of having it get a serialization
failure).

If we took out the existing check and modified INSERT under
SERIALIZABLE transactions to acquire predicate locks during initial
insertion of the index key in the index used by the ON CONFLICT
clause (like the ones which would be acquired by a SELECT which
used the index), we would handle many cases, but not this one
(since the INSERT in tx2 does not use the ON CONFLICT clause).
This example would cause a rw-conflict from tx1 to tx2, but not
vice versa, since tx2 doesn't read.  So, no "dangerous structure"
in the SSI sense, and no serialization failure, even though the
results are not consistent with any serial execution of the
transactions.

I *think* that to generate the correct rw-conflicts to allow
dropping the existing check (recently proposed by Thomas Munro and
implemented by Tom Lane), we would need to acquire predicate locks
during the descent to insert into each unique index during any
INSERT under SERIALIZABLE transaction isolation.  The obvious
question is whether the overhead of doing that would be made up by
the work saved through reduced false positive serialization
failures.  It does not seem obvious which would win on overall
performance; in fact it seems very likely that it would depend on
the workload.

My initial thought is that since reducing the false positive rate
would only help when there was a high rate of conflicts under the
existing patch, and it would add code complexity and cost for the
case where conflict rate is low, that we might want to just leave
the current fix and see whether there are complaints from the field
about the false positive rate.

Reducing the rate of false positive serialization failures is a
worthy goal, but it's gotta make sense from a cost/benefit
perspective.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


pgsql-general by date:

Previous
From: Kevin Grittner
Date:
Subject: Re: out-of-order XID insertion in KnownAssignedXids
Next
From: Edilmar LISTAS
Date:
Subject: Re: Doubts about replication from many servers