Re: Serializable Isolation without blocking - Mailing list pgsql-hackers

From:           Albe Laurenz
Subject:        Re: Serializable Isolation without blocking
Msg-id:         D960CB61B694CF459DCFB4B0128514C202FF65AF@exadv11.host.magwien.gv.at
In response to: Serializable Isolation without blocking ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
Responses:      Re: Serializable Isolation without blocking (Gregory Stark <stark@enterprisedb.com>)
                Re: Serializable Isolation without blocking ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
List:           pgsql-hackers
Kevin Grittner wrote:
> While discussing potential changes to PostgreSQL documentation of
> transaction isolation levels, Emmanuel Cecchet pointed out an
> intriguing new paper[1] on a new algorithm to provide true
> serializable behavior in a MVCC based database, with no additional
> blocking; although, there being no such things as a free lunch, there
> is an increase in serialization failures under contention.

I have read through the paper and will share my comments. I hope I got it right.

To put it in a nutshell, the approach to true serializability presented in the paper involves "intention locks" which do not block writers, but cause transactions with conflict potential to be aborted.

The main cost incurred is the maintenance of these intention locks, which need to be kept for a while even after a transaction has committed. Moreover, there will potentially be many of these locks (think of SELECT COUNT(*) FROM ...), so some lock escalation mechanism (to page or table locks) will be necessary.

I don't know how hard that would be to implement, but I'd argue that it is only worth considering if it guarantees true serializability.

The paper describes intention locks only for rows in the select list of a statement and deliberately ignores rows that are examined only in the WHERE clause. The authors argue in section 3.4 that this is no problem in their implementation, since "Berkeley DB does all locking and versioning on page granularity". I don't buy that; maybe I misunderstood something.
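To make the idea from the nutshell above concrete, here is a toy sketch of my own (in Python; it is not the paper's algorithm, whose names and details differ) of how non-blocking intention locks might be tracked. Note the deliberate simplification: the paper aborts only on "dangerous structures" of consecutive rw-conflicts, while this sketch aborts on any single rw-conflict between concurrent transactions.

```python
# Toy sketch (mine, not the paper's algorithm): row-level
# "intention locks" that never block.  A read records a lock that
# is kept even after the reader commits; a write by a different,
# concurrent transaction that hits such a lock aborts immediately
# instead of waiting.  The real algorithm is more permissive.

class SerializationFailure(Exception):
    pass

intention_locks = {}   # row key -> set of txn ids that have read the row
overlaps = {}          # txn id -> set of txn ids it runs concurrently with

def begin(tid, active):
    """Start transaction `tid`; `active` is the set of txns running now."""
    overlaps[tid] = set(active)
    for other in active:
        overlaps[other].add(tid)

def read(tid, key):
    """Record a non-blocking intention lock on `key` for `tid`."""
    intention_locks.setdefault(key, set()).add(tid)

def write(tid, key):
    """Abort instead of blocking if a concurrent txn has read `key`."""
    for reader in intention_locks.get(key, set()):
        if reader != tid and reader in overlaps.get(tid, set()):
            raise SerializationFailure(
                "txn %d writes a row read by concurrent txn %d"
                % (tid, reader))
```

For example, after begin(1, set()) and begin(2, {1}), a read(1, "row") followed by write(2, "row") raises SerializationFailure, while write(2, "other") goes through; no call ever waits.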
Consider a function "makehighlander(personid integer) RETURNS void" defined like this:

  SELECT ishighlander INTO b FROM scots WHERE id=personid;
  IF b THEN
    RETURN; /* no need to do anything */
  END IF;
  UPDATE scots SET ishighlander=TRUE WHERE id=personid;
  SELECT count(*) INTO n FROM scots WHERE ishighlander;
  IF n > 1 THEN
    RAISE EXCEPTION 'There can be only one';
  END IF;

If we assume that "ishighlander" is false for all records in the beginning, and there are two calls to the function with two personids of records *in different pages*, then there cannot be any conflicts, since all (write and intention) locks taken by each of these calls should only affect the one page that contains the one record that is updated and then found in the subsequent SELECT.

Yet if the two calls execute concurrently, and the first SELECT of each is executed before either UPDATE, then both functions run with a snapshot in which the final SELECT statement returns 1, so both functions succeed, leaving the table with two highlanders.

So I think one would have to add intention locks for rows considered in the WHERE clause to guarantee true serializability.

It would be interesting to know how many lines of code would have to be added to implement that, and how performance would be affected. I'm afraid that concurrency would suffer, because more conflicting transactions would be aborted.

Another thing I wonder is whether the authors have considered the possibility that serializable and "cursor stability" transactions run at the same time, where the latter would not take intention locks. Could that affect the considerations about the correctness of the algorithm?

Yours,
Laurenz Albe
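P.S.: For concreteness, here is a minimal simulation (in Python, not SQL; my own sketch, not taken from the paper or from PostgreSQL) of the write-skew interleaving described above under plain snapshot isolation. Each call to makehighlander() runs against a snapshot taken before the other call's UPDATE becomes visible, so each final count is 1 and both calls commit:

```python
# Simulating the "two highlanders" anomaly: both transactions
# snapshot the table before either UPDATE, write disjoint rows
# (so there is no write-write conflict), and each sees a final
# count of 1 - yet the merged result violates the invariant.

committed = {1: False, 2: False}   # row id -> ishighlander

def makehighlander(snapshot, personid):
    """Run the function body against a private snapshot."""
    local = dict(snapshot)
    if local[personid]:
        return local                 # already a highlander, nothing to do
    local[personid] = True           # UPDATE scots SET ishighlander=TRUE ...
    n = sum(local.values())          # SELECT count(*) ... WHERE ishighlander
    if n > 1:
        raise Exception("There can be only one")
    return local

# Both calls snapshot *before* either UPDATE is visible:
snap1 = dict(committed)
snap2 = dict(committed)
res1 = makehighlander(snap1, 1)      # sees count 1, succeeds
res2 = makehighlander(snap2, 2)      # sees count 1, succeeds

# Merge the two disjoint write sets at commit time:
committed.update({k: v for k, v in res1.items() if v})
committed.update({k: v for k, v in res2.items() if v})
assert sum(committed.values()) == 2  # two highlanders - invariant broken
```

Run serially, the second call would see the first call's update and raise "There can be only one", which is exactly what a truly serializable execution must guarantee.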