Re: Potential G2-item cycles under serializable isolation - Mailing list pgsql-bugs
From | Kyle Kingsbury |
---|---|
Subject | Re: Potential G2-item cycles under serializable isolation |
Date | |
Msg-id | 80e772f7-a46d-7bfd-28d0-4017bf5cd2e5@jepsen.io |
In response to | Re: Potential G2-item cycles under serializable isolation (Peter Geoghegan <pg@bowt.ie>) |
Responses |
Re: Potential G2-item cycles under serializable isolation
Re: Potential G2-item cycles under serializable isolation |
List | pgsql-bugs |
On 6/4/20 5:29 PM, Peter Geoghegan wrote:

> I'd appreciate it if you could provide this information, so I can be
> confident I didn't get something wrong. I don't really understand how
> Elle detects this G2-Item anomaly, nor how it works in general.
> PostgreSQL doesn't really use 2PL, even to a limited degree (unlike
> Oracle), so a lot of the definitions from the "Generalized Isolation
> Level Definitions"/Adya paper are not particularly intuitive to me.

Hopefully you shouldn't have to think about 2PL, because the generalized phenomena are defined independently of locking--though I think the paper talks about 2PL in order to show equivalency to the locking approaches. The gist of the generalized definitions is about information flow--the anomalies (well, most of them) correspond to cycles in the dependency graph between transactions. Elle works (very loosely) by inferring this dependency graph.

I think you've probably read this already, and I know it's a *lot* to throw out there all at once, but the Elle readme and paper might be helpful here. In particular, section 2 of the paper ("The Adya Formalism") gives a brief overview of what these dependencies are, and section 3 gives an intuition for how we can infer the dependency graph.

https://github.com/jepsen-io/elle
https://github.com/jepsen-io/elle/blob/master/paper/elle.pdf

> That said, I find it easy to understand why the "G2-item: Item
> Anti-dependency Cycles" example from the paper exhibits behavior that
> would be wrong for Postgres -- even in repeatable read mode.

Yeah! My understanding is that this behavior would be incorrect either under snapshot serializability or repeatable read, at least using the generalized definitions. It might be OK to do this under repeatable read given the anomaly interpretation of the ANSI SQL spec; not entirely sure.
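To make the "cycles in the dependency graph" idea above concrete, here is a minimal Python sketch. It is not Elle's implementation: the toy history, the integer versions (whose per-key order is simply assumed to be known), and the helper names `dependency_edges` and `find_cycle` are all hypothetical. Elle has to infer the version order from observations; this sketch takes it as given and uses a plain depth-first search, chosen for brevity rather than efficiency.

```python
from collections import defaultdict

# Hypothetical toy history: each transaction is a list of (op, key, version).
# Assumption: versions are integers and version n+1 directly follows version n.
history = {
    "T1": [("r", "x", 0), ("w", "y", 1)],
    "T2": [("r", "y", 0), ("w", "x", 1)],
}

def dependency_edges(history):
    """Return a set of (from_txn, to_txn, kind) edges, kind in {ww, wr, rw}."""
    writer = {}                  # (key, version) -> transaction that wrote it
    readers = defaultdict(set)   # (key, version) -> transactions that read it
    for txn, ops in history.items():
        for op, key, version in ops:
            if op == "w":
                writer[(key, version)] = txn
            else:
                readers[(key, version)].add(txn)
    edges = set()
    for (key, version), w in writer.items():
        # wr: another transaction read the version that w wrote.
        for r in readers[(key, version)]:
            if r != w:
                edges.add((w, r, "wr"))
        prev = (key, version - 1)
        # ww: w overwrote a version written by another transaction.
        if prev in writer and writer[prev] != w:
            edges.add((writer[prev], w, "ww"))
        # rw: a transaction read the previous version, so it did not see w's write.
        for r in readers[prev]:
            if r != w:
                edges.add((r, w, "rw"))
    return edges

def find_cycle(edges):
    """Return one cycle as a list of edges, or None if the graph is acyclic."""
    out = defaultdict(list)
    for edge in sorted(edges):
        out[edge[0]].append(edge)

    def visit(node, path, on_path):
        for edge in out[node]:
            nxt = edge[1]
            trail = path + [edge]
            if nxt in on_path:
                # Trim the trail so it starts where the cycle starts.
                start = next(i for i, e in enumerate(trail) if e[0] == nxt)
                return trail[start:]
            found = visit(nxt, trail, on_path | {nxt})
            if found:
                return found
        return None

    for start in sorted(out):
        found = visit(start, [], {start})
        if found:
            return found
    return None

cycle = find_cycle(dependency_edges(history))
```

In this toy history each transaction reads the version the other one overwrites, so the only edges are two rw edges pointing in opposite directions, and they form a cycle.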
> If Postgres exhibits this anomaly (in repeatable read more or
> serializable mode), that would be a case of a transaction reading data
> that isn't visible to its original transaction snapshot. The paper
> supposes that this could happen when another transaction (the one that
> updated the sum-of-salaries from the example) committed.

Yes! It's not always this obvious--G2-item encompasses any dependency cycle between transactions such that at least one dependency involves a transaction writing state which was not observed by some (ostensibly prior) transaction's read. We call these "rw dependencies" in the paper, because they involve a read which must have occurred before a write.

Another way to think of G2-item is "A transaction failed to see something that happened in its logical past". A special case of G2-item, G-single, is commonly known as read skew. In Elle, we tag G-single separately, so all the G2-item anomalies reported actually involve 2+ rw dependencies, not just 1+. I haven't seen G-single yet, which is good--that means Postgres isn't violating SI, just SSI.

Or, of course, the test itself could be broken--maybe the SQL statements themselves are subtly wrong, or our inference is incorrect.

> If each Jepsen worker has its own connection for the duration of the
> test (which I guess must happen already), and each connection
> specified an informative and unique "application_name", it would be
> possible to see Jepsen's string from the Postgres logs, next to the
> SQL text.

Give Jepsen a138843d a shot!

1553 jepsen process 27 16 LOG: execute <unnamed>: select (val) from txn0 where sk = $1
1553 jepsen process 27 17 DETAIL: parameters: $1 = '9'

Here, "process 27" is the same as the :process field you'll see in transactions. I'd like to be able to get a mini log of SQL statements embedded in the operation itself, so it'd be *right there* in the anomaly explanation, but... I haven't figured out how to scrape those side effects out of the guts of JDBC yet.
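Returning to the G-single / G2-item distinction described earlier in this message, the tagging rule can be sketched in a few lines of Python. This is an illustration only, not Elle's code: the function name `classify_cycle` and the edge tuples are hypothetical, and a cycle is assumed to arrive as a list of (from_txn, to_txn, kind) edges with kind being one of ww, wr, or rw.

```python
def classify_cycle(cycle):
    """Tag a dependency cycle by how many rw (anti-dependency) edges it has.

    Mirrors the rule described in the text: exactly one rw edge is tagged
    G-single, two or more are tagged G2-item. Cycles with no rw edge are
    outside the scope of this sketch and return None.
    """
    rw_count = sum(1 for _, _, kind in cycle if kind == "rw")
    if rw_count == 0:
        return None
    if rw_count == 1:
        return "G-single"
    return "G2-item"

# Hypothetical example cycles.
write_skew = [("T1", "T2", "rw"), ("T2", "T1", "rw")]
read_skew = [("T1", "T2", "wr"), ("T2", "T1", "rw")]

labels = (classify_cycle(write_skew), classify_cycle(read_skew))
```

Under this rule a report of only G2-item cycles, with no G-single, means every cycle found needed at least two rw edges to close.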
--Kyle