BUG #11732: Non-serializable outcomes under serializable isolation - Mailing list pgsql-bugs

From: pbailis@cs.berkeley.edu
Subject: BUG #11732: Non-serializable outcomes under serializable isolation
Date: 2014-10-21 07:14:58
Msg-id: 20141021071458.2678.9080@wrigleys.postgresql.org
List: pgsql-bugs
The following bug has been logged on the website:

Bug reference:      11732
Logged by:          Peter Bailis
Email address:      pbailis@cs.berkeley.edu
PostgreSQL version: 9.3.5
Operating system:   Ubuntu 14.04.1 LTS
Description:

When running a high-contention workload under SERIALIZABLE isolation, I am
able to produce non-serializable outcomes. I can reproduce this behavior
both via a Ruby on Rails deployment and via a standalone Python script that
generates concurrent transactions
(https://gist.github.com/pbailis/503430309104e3f7ab79#file-pg-stress-py).
BASIC WORKLOAD:

The basic workload consists of a series of concurrent transactions
performing read-modify-insert operations on a non-indexed varchar column.
The workload repeatedly issues a set of K transactions in parallel,
incrementing a sequence number for each wave. Each transaction within a
wave checks (via a SELECT ... WHERE query) whether the non-indexed varchar
column already contains the sequence number encoded as a string. If so, the
transaction rolls back. If not, it attempts to insert a new row with the
sequence number. (This is effectively a poor man's version of enforcing a
uniqueness constraint.) Under SERIALIZABLE isolation, at most one
transaction per sequence number should successfully insert an entry into
the varchar column.

(Reasoning: If, within a wave, a transaction T_1 observes that another
transaction T_2 has already inserted the sequence number, it will not
attempt to insert. If, within a wave, a transaction T_1 does not observe
another transaction T_2's successful, committed insertion of the same
sequence number, T_1 should abort to preserve serializability and will not
insert. If all transactions within a wave abort, there are no insertions
for that sequence number.)
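
To make the forbidden interleaving concrete, here is a sketch with two
explicit connections (a hand-rolled illustration, not the original script;
the DSN and the key value '999' are assumptions, and the table is assumed
not to contain that key yet). Under working SSI, at least one of the two
transactions should fail with a serialization error (SQLSTATE 40001),
possibly already at its INSERT:

    import psycopg2

    DSN = "dbname=stress user=stress password=stress"  # assumed; see REPRODUCING

    a = psycopg2.connect(DSN)
    b = psycopg2.connect(DSN)
    for conn in (a, b):
        conn.set_session(isolation_level='SERIALIZABLE', autocommit=False)

    ca, cb = a.cursor(), b.cursor()
    ca.execute('SELECT 1 FROM "stress" WHERE "key" = %s', ('999',))   # T_1: no row
    cb.execute('SELECT 1 FROM "stress" WHERE "key" = %s', ('999',))   # T_2: no row
    ca.execute('INSERT INTO "stress" ("key") VALUES (%s)', ('999',))  # T_1 inserts
    cb.execute('INSERT INTO "stress" ("key") VALUES (%s)', ('999',))  # T_2 inserts
    a.commit()
    b.commit()  # expected: 40001 raised on one of the last three statements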

As a minimal working example, consider the following table, where the "key"
column plays the role of the sequence-number varchar column from above:

    TABLE "stress" ("id" serial primary key, "key" character varying(255))

and the following transaction, executed in parallel and parameterized by K,
a varchar:

    BEGIN TRANSACTION;
    result = 'SELECT 1 AS one FROM "stress" WHERE "stress"."key" = K;'

    if result is NULL:
        'INSERT INTO "stress" ("key") VALUES (K) RETURNING "id";'
        COMMIT TRANSACTION;
    else:
        ROLLBACK TRANSACTION;

(In both implementations I've built, the logic has been written in the
application, and the single-quoted queries run in PostgreSQL.)
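
For concreteness, a minimal Python/psycopg2 sketch of one such transaction,
plus a driver for one wave, might look like the following (the DSN, thread
count, and function names are illustrative assumptions; the linked
pg-stress-py script is the authoritative version):

    import threading
    import psycopg2
    from psycopg2 import errorcodes

    DSN = "dbname=stress user=stress password=stress"  # assumed; see REPRODUCING

    def attempt_insert(seq):
        """One read-then-insert transaction for sequence number seq (a string)."""
        conn = psycopg2.connect(DSN)
        conn.set_session(isolation_level='SERIALIZABLE', autocommit=False)
        try:
            cur = conn.cursor()
            cur.execute('SELECT 1 AS one FROM "stress" WHERE "stress"."key" = %s',
                        (seq,))
            if cur.fetchone() is None:
                # No committed row for this sequence number: try to insert it.
                cur.execute('INSERT INTO "stress" ("key") VALUES (%s) RETURNING "id"',
                            (seq,))
                conn.commit()
            else:
                conn.rollback()
        except psycopg2.Error as e:
            conn.rollback()
            # Serialization failures (SQLSTATE 40001) are expected under SSI and
            # are harmless here: an aborted transaction inserts nothing.
            if e.pgcode != errorcodes.SERIALIZATION_FAILURE:
                raise
        finally:
            conn.close()

    def run_wave(seq, k=16):
        """One wave: k concurrent attempts on the same sequence number."""
        threads = [threading.Thread(target=attempt_insert, args=(seq,))
                   for _ in range(k)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()

Driving many waves (e.g., for i in range(1000): run_wave(str(i))) and then
running the duplicate-count query below should, per this report, eventually
turn up duplicates.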

EXPECTED BEHAVIOR: Under serializable isolation, there should be at most
one record per 'key':

    SELECT key, COUNT(*) from stress GROUP BY key HAVING COUNT(*) > 1;
     key | count
    -----+-------
    (0 rows)

ACTUAL BEHAVIOR: Under serializable isolation, there are duplicates in the
"key" column:

    SELECT key, COUNT(*) from stress GROUP BY key HAVING COUNT(*) > 1;
     key | count
    -----+-------
     745 |     2
     274 |     2
     55  |     2
    (3 rows)

I realize this is a weird workload, but it's derived from a set of queries
in Rails's ORM.

ENVIRONMENT DETAILS:

I have reproduced this behavior in two separate environments:

PostgreSQL version: PostgreSQL 9.3.5 on x86_64-unknown-linux-gnu, compiled
by gcc (Ubuntu 4.8.2-19ubuntu1) 4.8.2, 64-bit
PostgreSQL installation method: vanilla install via apt ('postgresql-9.3')
OS: Ubuntu 14.04 LTS
Hardware: EC2 m2.4xlarge instance with 8 cores, 68.4GB memory, and 840GB
local ephemeral storage

PostgreSQL version:  PostgreSQL 9.3.5 on x86_64-apple-darwin13.3.0, compiled
by Apple LLVM version 5.1 (clang-503.0.40) (based on LLVM 3.4svn), 64-bit
PostgreSQL installation method: vanilla install via brew ('postgresql')
OS: Apple OS X 10.9.4
Hardware: Macbook Pro, 2.8GHz Intel i5, 16GB RAM, SSD

Configuration changes: I've changed default_transaction_isolation to
'serializable' in postgresql.conf and increased the connection and memory
settings to accommodate multiple parallel connections (128 in the script
I've provided):
    default_transaction_isolation = 'serializable'
    max_connections = 200
    shared_buffers = 1GB
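
As a quick sanity check, the effective settings can be confirmed from a
fresh connection (a minimal sketch, assuming the 'stress' credentials set
up in the REPRODUCING section):

    import psycopg2

    conn = psycopg2.connect("dbname=stress user=stress password=stress")
    cur = conn.cursor()
    for name in ("default_transaction_isolation", "max_connections",
                 "shared_buffers"):
        cur.execute("SHOW %s;" % name)
        print(name, "=", cur.fetchone()[0])  # expect: serializable, 200, 1GB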

I originally thought this must be a bug in Rails, but I've now implemented
the transaction logic in both Rails 4 and Python (using psycopg2) and
reproduced this behavior in both environments. Autocommit is disabled, and
the logs (see
https://gist.github.com/pbailis/503430309104e3f7ab79#file-postgres-output-txt)
demonstrate that serializable isolation is in effect (e.g., "Reason code:
Canceled on identification as a pivot, during write"). This leads me to
believe the behavior is a consequence of the SSI implementation.

REPRODUCING:

Set up a new PostgreSQL database. The Python script assumes a database
called 'stress' and a user called 'stress'. The following commands will set
up a new database in '/tmp/pgdata':
    export PGDATA=/tmp/pgdata
    rm -rf $PGDATA
    pg_ctl init
    printf "default_transaction_isolation = 'serializable'\nshared_buffers = 1GB\nmax_connections = 200\n" >> $PGDATA/postgresql.conf
    postgres -D /tmp/pgdata &
    sleep 2
    psql postgres -c "CREATE USER stress WITH PASSWORD 'stress'"
    psql postgres -c "CREATE DATABASE stress"
    psql postgres -c "GRANT ALL PRIVILEGES ON DATABASE stress TO stress"
    fg

Run a series of concurrent sets of transactions as described above, or just
run the Python tool
(https://gist.github.com/pbailis/503430309104e3f7ab79#file-pg-stress-py).

Check whether the 'key' column contains duplicate entries (e.g.,
https://gist.github.com/pbailis/503430309104e3f7ab79#file-state-of-database-after-pg-stress-py).

This may take a few tries. I'd estimate that over half of my runs in both
environments (going through the whole setup -- database setup plus Python
script) reproduce this behavior. The current script is a bit of a hack,
leading to high variance in timing behavior.

OTHER NOTES:

- I haven't had much luck reproducing with a non-varchar key (e.g., (key
  integer)) and without RETURNING "id". Putting a unique index on "key"
  seems to work as expected (see the sketch after these notes).
- The behavior seems to appear more often when run on a new PostgreSQL
  installation.
- If the script hangs, check the contents of the table -- there may be a
  duplicate. However, not all duplicates freeze the script.
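
For reference, the unique-index workaround mentioned above might look like
the following (a sketch; the index name is hypothetical; with the
constraint in place, a racing duplicate insert fails with a unique
violation, SQLSTATE 23505, regardless of isolation level):

    import psycopg2
    from psycopg2 import errorcodes

    conn = psycopg2.connect("dbname=stress user=stress password=stress")
    cur = conn.cursor()
    # Run on a fresh table: index creation fails if duplicates already exist.
    cur.execute('CREATE UNIQUE INDEX stress_key_unique ON "stress" ("key");')
    conn.commit()

    try:
        cur.execute('INSERT INTO "stress" ("key") VALUES (%s)', ('745',))
        conn.commit()
    except psycopg2.Error as e:
        conn.rollback()
        if e.pgcode == errorcodes.UNIQUE_VIOLATION:  # '23505'
            print("duplicate prevented by the index")
        else:
            raise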

I'm happy to debug further, but I'm sending what I have so far.
