Re: User-facing aspects of serializable transactions - Mailing list pgsql-hackers
| From | Greg Stark |
|---|---|
| Subject | Re: User-facing aspects of serializable transactions |
| Date | |
| Msg-id | 4136ffa0906011646n2ab749bdk7a9a316b2692725a@mail.gmail.com |
| In response to | Re: User-facing aspects of serializable transactions ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>) |
| Responses | Re: User-facing aspects of serializable transactions ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>) |
| List | pgsql-hackers |
On Mon, Jun 1, 2009 at 11:07 PM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote:
> Greg Stark <stark@enterprisedb.com> wrote:
>
>> No, I'm not. I'm questioning whether a serializable transaction
>> isolation level that makes no guarantee that it won't fire
>> spuriously is useful.
>
> Well, the technique I'm advocating virtually guarantees that there
> will be false positives, since it looks only for the "dangerous
> structure" of two adjacent read-write dependencies rather than
> building a rigorous read-write dependency graph for every
> serializable transaction. Even if you used very fine-grained locks
> (i.e., what *columns* were modified in what rows) and had totally
> accurate predicate locking, you would still get spurious rollbacks
> with this technique.

Yeah, I'm OK with compromises like having updates on other columns, or
even no-op updates, trigger serialization failures. For one thing they
do currently, but more importantly from my point of view they can be
explained in the documentation and make sense from a user's point of
view. More generally, any time you have a set of transactions that are
touching and selecting from the same set of records, I think it's
obvious to a user that a serialization failure might be possible (see
the first sketch below).

I'm not happy having things like "where x = 5 and y = 5" randomly
choose to lock all the records in one or the other index range (or the
whole table) when only the intersection is really interesting to the
plan. That leaves a careful programmer no way to tell which of his
transactions might conflict. And I'm *really* unhappy with having the
decision on which range to lock depend on a planner decision (see the
second sketch below). That means some time (inevitably in the middle
of the night) the database will suddenly start getting serialization
failures on transactions that never did before (inevitably critical
batch jobs) because the planner switched plans.

> In spite of that, I believe that it will run faster than traditional
> serializable transactions, and in one benchmark it ran faster than
> snapshot isolation -- apparently because it rolled back conflicting
> transactions before they did updates and hit the update conflict
> detection phase.

"I can get the answer infinitely fast if it doesn't have to be right."

I know a serialization failure isn't a fatal error and the application
has to be prepared to retry (see the third sketch below). And I agree
that some compromises are reasonable; "serialization failure" doesn't
have to mean "the database ran a theorem prover and proved that it was
impossible to serialize these transactions". But I think a programmer
has to be able to look at the set of transactions and say "yeah, I can
see these transactions all depend on the same records".

>> Postgres doesn't take block-level locks or table-level locks to do
>> row-level operations. You can write code and know that it's safe
>> from deadlocks.
>
> Who's talking about deadlocks? If you're speaking more broadly of all
> serialization failures, you can certainly get them in PostgreSQL. So
> one of us is not understanding the other here. To clarify what I'm
> talking about -- this technique introduces no blocking and cannot
> cause a deadlock.

Sorry, I meant to type a second paragraph there to draw the analogy:
just as carefully written SQL code can avoid deadlocks, I would expect
to be able to look at SQL code and know it's safe from serialization
failures, or at least know where they might occur.

--
greg
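First sketch: the kind of conflict Greg calls obvious to a user, two
SERIALIZABLE transactions selecting from and writing to the same set of
records. This is a minimal illustration, not code from the thread; it
assumes PostgreSQL 9.1+ (where the technique under discussion later
shipped as Serializable Snapshot Isolation), Python with psycopg2, and
a hypothetical `doctors` table and connection string.

```python
# Classic "on-call doctors" write-skew case. Assumes PostgreSQL 9.1+
# and psycopg2; the table, data, and DSN are hypothetical placeholders.
import psycopg2
import psycopg2.extensions

DSN = "dbname=test"  # placeholder connection string

# Set up two doctors, both on call.
setup = psycopg2.connect(DSN)
setup.autocommit = True
with setup.cursor() as cur:
    cur.execute("DROP TABLE IF EXISTS doctors")
    cur.execute("CREATE TABLE doctors (id int PRIMARY KEY, on_call bool)")
    cur.execute("INSERT INTO doctors VALUES (1, true), (2, true)")
setup.close()

c1 = psycopg2.connect(DSN)
c2 = psycopg2.connect(DSN)
for c in (c1, c2):
    c.set_session(isolation_level="SERIALIZABLE")
cur1, cur2 = c1.cursor(), c2.cursor()

# Each transaction reads the same predicate...
cur1.execute("SELECT count(*) FROM doctors WHERE on_call")
cur2.execute("SELECT count(*) FROM doctors WHERE on_call")

# ...and each writes a row the other transaction read.
cur1.execute("UPDATE doctors SET on_call = false WHERE id = 1")
cur2.execute("UPDATE doctors SET on_call = false WHERE id = 2")

c1.commit()  # first committer succeeds
try:
    c2.commit()  # fails: "could not serialize access" (SQLSTATE 40001)
except psycopg2.extensions.TransactionRollbackError as e:
    print("serialization failure:", e)
finally:
    c1.close()
    c2.close()
```

Both transactions touch and select from the same records, so a failure
here is the explainable kind Greg is willing to accept.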
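Second sketch, bearing on the "where x = 5 and y = 5" complaint: under
SSI (again PostgreSQL 9.1+, after this thread), the predicate locks a
serializable query takes show up in `pg_locks` as `SIReadLock` entries,
and their granularity (tuple, page, or whole relation) follows the
access path the planner chose. The table `t` and the connection string
are placeholders.

```python
# Sketch: inspect which predicate locks (SIReadLock entries) a
# SERIALIZABLE query took. Assumes PostgreSQL 9.1+ and psycopg2; the
# table "t" with columns x and y, and the DSN, are placeholders.
import psycopg2

conn = psycopg2.connect("dbname=test")  # placeholder DSN
conn.set_session(isolation_level="SERIALIZABLE")
cur = conn.cursor()

cur.execute("SELECT * FROM t WHERE x = 5 AND y = 5")
cur.fetchall()

# Which tuples, pages, or relations got SIRead-locked depends on
# whether the planner scanned an index on x, an index on y, or the
# whole table -- exactly the plan dependence Greg objects to.
cur.execute("""
    SELECT locktype, relation::regclass::text AS rel, page, tuple
    FROM pg_locks
    WHERE mode = 'SIReadLock'
""")
for lock in cur.fetchall():
    print(lock)

conn.rollback()
conn.close()
```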
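Third sketch: the retry Greg alludes to. Serialization failures surface
as SQLSTATE 40001, which psycopg2 maps to `TransactionRollbackError`,
so the application rolls back and reruns the transaction. The
`transfer` body, its `accounts` table, and the DSN are placeholders.

```python
# Minimal retry-loop sketch for serialization failures (SQLSTATE 40001),
# assuming psycopg2. The transaction body shown is a placeholder.
import psycopg2
import psycopg2.extensions

def run_serializable(conn, txn_body, max_retries=5):
    """Run txn_body(cur) as one SERIALIZABLE transaction, retrying on 40001."""
    conn.set_session(isolation_level="SERIALIZABLE")  # no txn open yet
    for _ in range(max_retries):
        try:
            with conn.cursor() as cur:
                txn_body(cur)  # all statements join one transaction
            conn.commit()
            return
        except psycopg2.extensions.TransactionRollbackError:
            conn.rollback()  # serialization failure: safe to retry
    raise RuntimeError("gave up after %d serialization failures" % max_retries)

# Hypothetical transaction body: read a balance, then update it.
def transfer(cur):
    cur.execute("SELECT balance FROM accounts WHERE id = 1")
    (balance,) = cur.fetchone()
    cur.execute("UPDATE accounts SET balance = %s WHERE id = 1",
                (balance - 100,))

# conn = psycopg2.connect("dbname=test")  # placeholder DSN
# run_serializable(conn, transfer)
```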