On Tue, Jun 2, 2020 at 9:19 AM Kyle Kingsbury <aphyr@jepsen.io> wrote:
> OK! So I've designed a variant of this test which doesn't use ON CONFLICT.
> Instead, we do a homebrew sort of upsert: we try to update the row in place by
> primary key; if we see zero records updated, we insert a new row, and if *that*
> fails due to the primary key conflict, we try the update again, under the theory
> that since we now know a copy of the row exists, we should be able to update it.
>
> https://github.com/jepsen-io/jepsen/blob/f47eb25ab32529a7b66f1dfdd3b5ef2fc84ed778/stolon/src/jepsen/stolon/append.clj#L31-L108
Thanks, but I think this link is wrong, since the code it points to
still uses ON CONFLICT. Correct me if I'm wrong, but I believe you
intended to link to this:
https://github.com/jepsen-io/jepsen/commit/ac4956871c8227d57d11a665e43c3d68bb7d7ec1#diff-0f5b390b5cdbd8650cf39e3c3f6f365fR31-R65
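For anyone following along without reading the Clojure, the
update/insert/update sequence described above amounts to roughly the
following. This is only a sketch under stated assumptions, not the code
from the Jepsen test: it is written in Python against an in-memory
SQLite database rather than Postgres, and the table name (txn0), the
column names (id, val), and the function name (homebrew_upsert) are all
made up for illustration.

```python
import sqlite3


def homebrew_upsert(conn, key, element):
    """Append `element` to the row for `key` without using ON CONFLICT.

    Hypothetical helper: table and column names are illustrative only.
    """
    element = str(element)

    # Step 1: try to update the row in place by primary key.
    cur = conn.execute(
        "UPDATE txn0 SET val = val || ',' || ? WHERE id = ?",
        (element, key),
    )
    if cur.rowcount > 0:
        return "updated"

    # Step 2: zero rows were updated, so try to insert a new row.
    try:
        conn.execute(
            "INSERT INTO txn0 (id, val) VALUES (?, ?)",
            (key, element),
        )
        return "inserted"
    except sqlite3.IntegrityError:
        # Step 3: the insert hit a primary key conflict, so a row must
        # exist now; retry the update. (In Postgres a failed INSERT
        # aborts the enclosing transaction unless a savepoint is used;
        # this SQLite sketch sidesteps that.)
        conn.execute(
            "UPDATE txn0 SET val = val || ',' || ? WHERE id = ?",
            (element, key),
        )
        return "updated-after-conflict"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE txn0 (id INTEGER PRIMARY KEY, val TEXT)")

first = homebrew_upsert(conn, 1, 10)   # no row yet, falls through to insert
second = homebrew_upsert(conn, 1, 11)  # row exists, updated in place
row = conn.execute("SELECT val FROM txn0 WHERE id = 1").fetchone()
```

With a single connection the conflict branch in step 3 is never taken;
it only matters when another session inserts the same key between
steps 1 and 2.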
> Unfortunately, I'm still seeing tons of G2-item cycles. Whatever this is, it's
> not related to ON CONFLICT.
Good to have that confirmed. Obviously we'll need to do more analysis
of the exact circumstances of the anomaly. That might take a while.
> I get the sense that the Postgres docs have already diverged from the ANSI SQL
> standard a bit, since SQL 92 only defines three anomalies (P1, P2, P3), and
> Postgres defines a fourth: "serialization anomaly".
> I can see two ways to reconcile this--one being that Postgres chose the anomaly
> interpretation of the SQL spec, and the result is... maybe internally
> inconsistent? Or perhaps one of the operations in this workload actually *is* a
> predicate operation--maybe by dint of relying on a uniqueness constraint?
You might find that "A Critique of ANSI SQL Isolation Levels" provides
useful background information:
https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/tr-95-51.pdf
One section in particular may be of interest:
"ANSI SQL intended to define REPEATABLE READ isolation to exclude all
anomalies except Phantom. The anomaly definition of Table 1 does not
achieve this goal, but the locking definition of Table 2 does. ANSI’s
choice of the term Repeatable Read is doubly unfortunate: (1)
repeatable reads do not give repeatable results, and (2) the industry
had already used the term to mean exactly that: repeatable reads mean
serializable in several products. We recommend that another term be
found for this."
--
Peter Geoghegan