Re: Compatibility GUC for serializable - Mailing list pgsql-hackers

From Florian Pflug
Subject Re: Compatibility GUC for serializable
Date
Msg-id 7092C720-4250-41DD-A8DC-1F1D5D8453E1@phlo.org
In response to Re: Compatibility GUC for serializable  (Josh Berkus <josh@agliodbs.com>)
List pgsql-hackers
On Jan10, 2011, at 20:29 , Josh Berkus wrote:
> The only reason I'm ambivalent about
> this is I'm unsure that there are more than a handful of people using
> SERIALIZABLE in production applications, precisely because it's been so
> unintuitive in the past.

I've used it quite extensively in the past. Usually either to run
two consecutive queries with the same snapshot, or to run an
UPDATE .. FROM (since that can be quite a foot-gun in READ COMMITTED
mode). 

In retrospect, I should have used REPEATABLE READ instead of
SERIALIZABLE, of course. But I didn't, and part of the reason
for that is our very own documentation. The way it is written
gives the impression that SERIALIZABLE is the "real" name of the
isolation level while REPEATABLE READ is a compatibility synonym,
and it also leads one to believe that true serializability isn't
something that will ever be implemented.
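
To make the distinction concrete, here is a toy Python model of
snapshot isolation, which is the behaviour that both spellings
currently give you. It is only a sketch under stated assumptions: the
names SnapshotStore, Txn and go_off_call are hypothetical, the on-call
scenario is the textbook write-skew example, and none of it reflects
how PostgreSQL or Kevin's patch is actually implemented.

```python
"""Toy model of write skew under snapshot isolation (not PostgreSQL code)."""


class SnapshotStore:
    """Hypothetical in-memory store with per-key commit counters."""

    def __init__(self, data):
        self.committed = dict(data)
        self.versions = {key: 0 for key in data}

    def begin(self):
        return Txn(self)


class Txn:
    """A transaction that reads from the snapshot taken when it began."""

    def __init__(self, store):
        self.store = store
        self.snapshot = dict(store.committed)
        self.start_versions = dict(store.versions)
        self.writes = {}

    def read(self, key):
        # Own writes are visible; everything else comes from the snapshot.
        return self.writes.get(key, self.snapshot[key])

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # Only write-write conflicts are detected (first committer wins).
        # Read-write dependencies between transactions go unnoticed.
        for key in self.writes:
            if self.store.versions[key] != self.start_versions[key]:
                raise RuntimeError("concurrent update of " + key)
        for key, value in self.writes.items():
            self.store.committed[key] = value
            self.store.versions[key] += 1


def go_off_call(txn, doctor):
    """Take `doctor` off call only if both doctors are currently on call."""
    on_call = sum(1 for name in ("alice", "bob") if txn.read(name))
    if on_call >= 2:
        txn.write(doctor, False)


def run_concurrent():
    """Both transactions start before either one commits."""
    store = SnapshotStore({"alice": True, "bob": True})
    t1, t2 = store.begin(), store.begin()
    go_off_call(t1, "alice")
    go_off_call(t2, "bob")
    t1.commit()
    t2.commit()  # different key, so no write-write conflict is raised
    return store.committed


def run_serial():
    """The same two transactions, executed one after the other."""
    store = SnapshotStore({"alice": True, "bob": True})
    for doctor in ("alice", "bob"):
        txn = store.begin()
        go_off_call(txn, doctor)
        txn.commit()
    return store.committed
```

In this model run_concurrent() leaves nobody on call, while
run_serial() keeps bob on call, and the same holds for the other
serial order with alice. The concurrent result therefore matches no
serial execution, which is exactly the kind of anomaly that true
serializability would have to reject by aborting one of the two
transactions.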

The two sections of "13.2 Transaction Isolation" dealing with
the isolation levels READ COMMITTED and REPEATABLE READ/SERIALIZABLE
are, for example, titled "13.2.1 Read Committed Isolation Level" and
"13.2.2 Serializable Isolation Level".

And, at the end of 13.2, in the section about SERIALIZABLE vs.
true serializability, we say

"To guarantee true mathematical serializability, it is necessary
for a database system to enforce predicate locking, ....
Such a locking system is complex to implement and extremely expensive
in execution, ....  And this large expense is mostly wasted, since in
practice most applications do not do the sorts of things that could
result in problems. ...  For these reasons, PostgreSQL does not
implement predicate locking."

I'd be very surprised if nearly all of our users used only the READ
COMMITTED isolation level. And given the wording of our documentation,
I'm quite certain I'm not the only one who used to spell that other
isolation level SERIALIZABLE, not REPEATABLE READ.

The question thus (again) comes down to whether we believe that for
virtually all of these users, true serializability is either an improvement
or at least no regression. I cannot come up with a case where that wouldn't
be the case *in theory*. In practice, however, Kevin's patch includes some
performance vs. false-positives trade-offs, and these *have* the potential
of biting people.

So, to summarize, I believe we need to look at the trade-offs - for
example the way conflict information is summarized to prevent memory
overflow - to judge whether there are any realistic workloads where
those might cause problems.

best regards,
Florian Pflug


