Re: Potential G2-item cycles under serializable isolation - Mailing list pgsql-bugs
From | Peter Geoghegan |
---|---|
Subject | Re: Potential G2-item cycles under serializable isolation |
Date | |
Msg-id | CAH2-Wz=WK-yvYhhJnu+3fgraw_zd23275pAKKk0PCdEQvU0U=w@mail.gmail.com Whole thread Raw |
In response to | Re: Potential G2-item cycles under serializable isolation (Thomas Munro <thomas.munro@gmail.com>) |
Responses |
Re: Potential G2-item cycles under serializable isolation
|
List | pgsql-bugs |
On Fri, Jun 12, 2020 at 9:49 PM Thomas Munro <thomas.munro@gmail.com> wrote: > Slightly improved version, bringing some wording into line with > existing documentation. s/SQL Standard/SQL standard/, and explicitly > referring to "locking" implementations of RR and Ser (as we do already > a few paragraphs earlier, when discussing MVCC). My intention is to > push this to all branches in a couple of days if there is no other > feedback. I propose to treat it as a defect, because I agree that > it's weird and surprising that we don't mention SI, especially > considering the history of the standard levels. That seems reasonable to me. The doc patch says "some other systems may even offer Repeatable Read and Snapshot Isolation as distinct isolation levels". This is true of SQL Server, which has an explicit snapshot isolation level, though possibly only because SI was retrofitted to SQL Server 2005 -- I guess that that's what you have in mind. Perhaps you should reword this a little to convey that you mean a separate isolation mode *and* an actually-distinct set of behaviors to go with it. > I mean, I guess it's > basically implied by all the stuff that section says about MVCC vs > traditional locking systems, and it's a super well known fact in our > hacker community, but not using the standard term of art is a strange > omission. Yes, it's definitely an omission. > In future release perhaps we could entertain ideas like accepting the > name SNAPSHOT ISOLATION, and writing some more use-friendly guidance, > and possibly even reference the Generalized Isolation Level > Definitions stuff. I think it'd be a bad idea to stop accepting > REPEATABLE READ and inconvenience our users, though; IMHO it's > perfectly OK to stick with the current interpretation of the spec > while also acknowledging flaws and newer thinking through this new > paragraph. Kyle made the following observation about "true" RR at one point: "If you use the broad/generalized interpretation, you can trust that a program which only accesses data by primary key, running under repeatable read, is actually serializable". However, the simple fact is that it just wouldn't be sensible for Postgres to offer an according-to-Adya REPEATABLE READ mode that isn't just an alias for SERIALIZABLE/SSI. What does the leeway that RR gives us compared to SERIALIZABLE actually buy the implementation, performance-wise? This is similar to the situation with READ UNCOMMITED and READ COMMITTED in Postgres. Even though I am sympathetic to Kyle's concerns here, and am all in favor of this documentation patch, I believe that the current naming scheme is the least-worst option. The point of using the standard names is to make it easy to understand the guarantees on offer. But we need to call the current REPEATABLE READ behavior *something*, and the closest thing that the standard has is RR. Unless we're willing to make no distinction between SSI and snapshot isolation -- which we're clearly not. Or unless we're willing to exclusively use some non-standard name instead -- which is also clearly a non-starter (though I'd be fine with an alias). Speaking of comparing behaviors across systems, the behavior that MySQL calls REPEATABLE READ mode is actually READ COMMITTED, while the behavior that DB2 calls REPEATABLE READ is actually SERIALIZABLE. The range of guarantees that you actually get are *enormous* in reality! ISTM that REPEATABLE READ is by far the most confusing and ambiguous isolation level that the standard describes. Even Elle can't tell the difference between "true" REPEATABLE READ mode and SERIALIZABLE mode; why should Postgres users be able to? -- Peter Geoghegan
pgsql-bugs by date: