Potential G2-item cycles under serializable isolation - Mailing list pgsql-bugs
From | Kyle Kingsbury |
---|---|
Subject | Potential G2-item cycles under serializable isolation |
Date | |
Msg-id | db7b729d-0226-d162-a126-8a8ab2dc4443@jepsen.io Whole thread Raw |
Responses |
Re: Potential G2-item cycles under serializable isolation
Re: Potential G2-item cycles under serializable isolation |
List | pgsql-bugs |
Hello everyone! First off, I'm sorry for *gestures vaguely* all of this. Second, I think I may have found a serializability violation in Postgres 12.3, involving anti-dependency cycles around row insertion. For background, my name is Kyle Kingsbury, and I test distributed database safety properties (https://jepsen.io). I started looking at Stolon + PostgreSQL this week, encountered this behavior, and managed to narrow it down to a single Postgres node without Stolon at all. Normally I test with a variety of faults (network partitions, crashes, etc.), but this behavior occurs in healthy processes without any faults. This test uses the Jepsen testing library (https://github.com/jepsen-io/jepsen) and the Elle isolation checker (https://github.com/jepsen-io/elle). If you're wondering "why would you ever do transactions like this", the Elle paper might provide some helpful background: https://arxiv.org/abs/2003.10554. We install Postgres 12.3-1.pgdg100+1 on a Debian 10 node, using the official Postgres repository at http://apt.postgresql.org/pub/repos/apt/. Each client uses its own JDBC connection, on a single thread, in a single JVM process. We use the JDBC postgres driver (org.postgresql/postgresql 42.2.12). The JDK is 1.8.0_40-b25. Logically, the test performs randomly generated transactions over a set of lists identified by integer keys. Each operation is either a read, which returns the current value of the list for a given key, or an append, which adds a unique element to the end of the list for a given key. In Postgres, we store these objects in tables like so: create table if not exists txn0 (id int not null primary key, sk int not null, val text) id is the key, and text stores comma-separated elements. sk is a secondary key, which is unused here. We create three tables like this (txn0, txn1, txn2). Records are striped across tables by hashing their key. We set the session transaction isolation level immediately after opening every connection: SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL SERIALIZABLE ... and request SERIALIZABLE for each JDBC transaction as well. Our reads are of the form: select (val) from txn0 where id = $1 And our writes are of the form: insert into txn1 as t (id, sk, val) values ($1, $2, $3) on conflict (id) do update set val = CONCAT(t.val, ',', $4) where t.id = $5 where $1 and $5 are the key, and $2, $3, and $4 are the element we'd like to append to the list. You can try these transactions for yourself using Jepsen f47eb25. You'll need a Jepsen environment; see https://github.com/jepsen-io/jepsen#setting-up-a-jepsen-environment for details. cd jepsen/stolon lein run test-all -w append --nemesis none --max-writes-per-key 8 --node n1 --just-postgres --concurrency 50 -r 1000 Which typically produces, after about a minute, anomalies like the following: G2-item #1 Let: T1 = {:type :ok, :f :txn, :value [[:r 7 [1]] [:append 12 1]], :time 95024280, :process 5, :index 50} T2 = {:type :ok, :f :txn, :value [[:append 7 2] [:r 14 nil] [:append 14 1] [:r 12 nil]], :time 98700211, :process 6, :index 70} Then: - T1 < T2, because T1 did not observe T2's append of 2 to 7. - However, T2 < T1, because T2 observed the initial (nil) state of 12, which T1 created by appending 1: a contradiction! A dependency graph of this anomaly is attached to this email: lines marked `rw` indicate read-write anti-dependencies between specific operations in each transaction. Because there are multiple rw edges in this graph, it suggests the presence of G2-item. It is also possible, of course, that worse anomalies happened (e.g. aborted read) which caused us to incorrectly infer this causal graph, but I suspect this is not the case. You can find a full copy of this particular test run, including a history of every transaction, Postgres logs, and a pcap file containing all client-server interactions, at http://jepsen.io.s3.amazonaws.com/analyses/postgresql-12.3/20200531T215019.000-0400.zip If you'd like to look at the test code, see https://github.com/jepsen-io/jepsen/tree/f47eb25ab32529a7b66f1dfdd3b5ef2fc84ed778/stolon/src/jepsen. Specifically, setup code is here: https://github.com/jepsen-io/jepsen/blob/f47eb25ab32529a7b66f1dfdd3b5ef2fc84ed778/stolon/src/jepsen/stolon/db.clj#L200-L233 ... and the workload responsible for constructing and submitting transactions is here: https://github.com/jepsen-io/jepsen/blob/f47eb25ab32529a7b66f1dfdd3b5ef2fc84ed778/stolon/src/jepsen/stolon/append.clj#L31-L107 These anomalies appear limited to G2-item: I haven't seen G-single (read skew), cyclic information flow, aborted reads, dirty writes, etc. It also looks as if every anomaly involves a *nil* read, which suggests (and I know the bug report guidelines say not to speculate, but experience suggests this might be helpful) that there is something special about row insertion. In TiDB, for instance, we found that G2-item anomalies with `select ... for update` was linked to the fact that TiDB's lock manager couldn't lock keys which hadn't been created yet (https://jepsen.io/analyses/tidb-2.1.7#select-for-update). I don't know anything about Postgres' internals, but I hope this is of some use! Is this... known behavior? Unexpected? Are there configuration flags or client settings I should double-check? I know this is all a bit much, so I'm happy to answer any questions you might have. :-) Sincerely, --Kyle
Attachment
pgsql-bugs by date: