Thread: Compatibility GUC for serializable

Compatibility GUC for serializable

From
"Kevin Grittner"
Date:
There's an issue where we don't seem to have consensus yet, so I
figured I'd bounce it off the list.

If the SSI patch were to be accepted as is, REPEATABLE READ would
continue to provide the exact same snapshot isolation behavior which
both it and SERIALIZABLE do through 9.0, and SERIALIZABLE would
always use SSI on top of the snapshot isolation to prevent
serialization anomalies.  In his review, Jeff argued for a
compatibility GUC which could be changed to provide legacy behavior
for SERIALIZABLE transactions -- if set, SERIALIZABLE would fall back
to working the same as REPEATABLE READ.

In an off-list exchange with me, David Fetter expressed opposition to
this, as a foot-gun.  I'm not sure where anyone else stands on this.
Personally, I don't care a whole lot because it's trivial to add, so
that seems to leave the vote at 1 to 1.  Anyone else care to tip the
scales?

-Kevin



Re: Compatibility GUC for serializable

From
David Fetter
Date:
On Sun, Jan 09, 2011 at 12:07:49PM -0600, Kevin Grittner wrote:
> There's an issue where we don't seem to have consensus yet, so I
> figured I'd bounce it off the list.
>  
> If the SSI patch were to be accepted as is, REPEATABLE READ would
> continue to provide the exact same snapshot isolation behavior which
> both it and SERIALIZABLE do through 9.0, and SERIALIZABLE would
> always use SSI on top of the snapshot isolation to prevent
> serialization anomalies.  In his review, Jeff argued for a
> compatibility GUC which could be changed to provide legacy behavior
> for SERIALIZABLE transactions -- if set, SERIALIZABLE would fall
> back to working the same as REPEATABLE READ.
>  
> In an off-list exchange with me, David Fetter expressed opposition
> to this, as a foot-gun.  I'm not sure where anyone else stands on
> this.  Personally, I don't care a whole lot because it's trivial to
> add, so that seems to leave the vote at 1 to 1.  Anyone else care to
> tip the scales?

For what it's worth, that exchange started with my proposing a
separate SNAPSHOT isolation, but since we'll already be providing that
isolation level and calling it REPEATABLE READ, I figured we didn't
need an extra one that did the exact same thing. :)

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: Compatibility GUC for serializable

From
Tom Lane
Date:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> If the SSI patch were to be accepted as is, REPEATABLE READ would
> continue to provide the exact same snapshot isolation behavior which
> both it and SERIALIZABLE do through 9.0, and SERIALIZABLE would
> always use SSI on top of the snapshot isolation to prevent
> serialization anomalies.  In his review, Jeff argued for a
> compatibility GUC which could be changed to provide legacy behavior
> for SERIALIZABLE transactions -- if set, SERIALIZABLE would fall back
> to working the same as REPEATABLE READ.

> In an off-list exchange with me, David Fetter expressed opposition to
> this, as a foot-gun.

I think we've learned over the years that GUCs that significantly change
semantics can be foot-guns.  I'm not sure exactly how dangerous this one
would be, but on the whole I'd prefer to avoid introducing a GUC here.
        regards, tom lane


Re: Compatibility GUC for serializable

From
Robert Haas
Date:
On Sun, Jan 9, 2011 at 7:47 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
>> If the SSI patch were to be accepted as is, REPEATABLE READ would
>> continue to provide the exact same snapshot isolation behavior which
>> both it and SERIALIZABLE do through 9.0, and SERIALIZABLE would
>> always use SSI on top of the snapshot isolation to prevent
>> serialization anomalies.  In his review, Jeff argued for a
>> compatibility GUC which could be changed to provide legacy behavior
>> for SERIALIZABLE transactions -- if set, SERIALIZABLE would fall back
>> to working the same as REPEATABLE READ.
>
>> In an off-list exchange with me, David Fetter expressed opposition to
>> this, as a foot-gun.
>
> I think we've learned over the years that GUCs that significantly change
> semantics can be foot-guns.  I'm not sure exactly how dangerous this one
> would be, but on the whole I'd prefer to avoid introducing a GUC here.

I agree.  I think we should assume that existing code which asks for
serializable behavior wants serializable behavior, not broken
serializable behavior.  There certainly could be cases where the
opposite is true (the code wants, specifically, our traditional
definition of serializability rather than actual serializability) but
I bet there's not a whole lot of them, and changing such code to ask
for REPEATABLE READ probably isn't extremely difficult.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Compatibility GUC for serializable

From
Josh Berkus
Date:
On 1/9/11 5:27 PM, Robert Haas wrote:
> I agree.  I think we should assume that existing code which asks for
> serializable behavior wants serializable behavior, not broken
> serializable behavior.  There certainly could be cases where the
> opposite is true (the code wants, specifically, our traditional
> definition of serializability rather than actual serializability) but
> I bet there's not a whole lot of them, and changing such code to ask
> for REPEATABLE READ probably isn't extremely difficult.

I'm going to disagree here. For a large, sprawling, legacy application
changing SERIALIZABLE to REPEATABLE READ in every place in the code
which might call it can be prohibitively difficult.  Further, many such
applications would be written with workarounds for broken serializable
behavior, workarounds which would behave unpredictably after an upgrade.

As such, I'd tend to say that like other major behavior changes, we
ought to have a LEGACY_SERIALIZABLE GUC for a couple of versions,
defaulting to "FALSE".  Otherwise SSI becomes an anti-feature for some
users and prevents them from upgrading.

On the other hand, I'm not sure how many users ever use SERIALIZABLE
mode.  That would be the main counter-argument.

--
                                  -- Josh Berkus
                                     PostgreSQL Experts Inc.
                                     http://www.pgexperts.com
 


Re: Compatibility GUC for serializable

From
"Kevin Grittner"
Date:
Josh Berkus <josh@agliodbs.com> wrote:
> many such applications would be written with workarounds for
> broken serializable behavior, workarounds which would behave
> unpredictably after an upgrade.

Can you elaborate?

The techniques we use in our shop wouldn't interact badly with SSI,
and I'm having trouble picturing what would.  Sure, some of these
techniques would no longer be needed, and would only add overhead if
SSI was there.  They would generally tend to prevent code from
getting to the point where a serialization failure from SSI would
occur.  In spite of that there would probably be at least some
additional serialization failures.  What other interactions or
problems do you see?

-Kevin


Re: Compatibility GUC for serializable

From
Josh Berkus
Date:
On 1/10/11 10:28 AM, Kevin Grittner wrote:
> The techniques we use in our shop wouldn't interact badly with SSI,
> and I'm having trouble picturing what would.  Sure, some of these
> techniques would no longer be needed, and would only add overhead if
> SSI was there.

Yeah?  Well, you have more experience than I do in this; my clients have
tended to use SELECT FOR UPDATE instead of SERIALIZABLE.  I'll defer to
you if you feel reasonably confident that breakage won't result.

And as I said, I'm unsure of how many people are using SERIALIZABLE in
any mission-critical context right now.

--
                                  -- Josh Berkus
                                     PostgreSQL Experts Inc.
                                     http://www.pgexperts.com
 


Re: Compatibility GUC for serializable

From
"Kevin Grittner"
Date:
Josh Berkus <josh@agliodbs.com> wrote:
> my clients have tended to use SELECT FOR UPDATE instead of
> SERIALIZABLE.

If they're not using SERIALIZABLE, this patch will have no impact on
them at all.  If they are using SELECT FOR UPDATE *with*
SERIALIZABLE, everything will function exactly as it is except that
there may be some serialization failures which they weren't getting
before, either from the inevitable (but hopefully minimal) false
positives inherent in the technique or because they missed covering
something.

Since SSI doesn't introduce any blocking, and causes no behavior
changes beyond triggering serialization failures when it seems that
an anomaly may otherwise result, there's really nothing else to go
wrong.

Well, if there are no bugs we've missed in these few thousand lines
of code, that is.  Given the size and complexity of the patch, it'd
be surprising if we've squashed them all just yet.  We've tried....

-Kevin
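
For readers wondering what "some serialization failures which they
weren't getting before" means for application code, here is a minimal
Python sketch of the usual response: re-run the whole transaction.  It
is not taken from the patch or from anyone's message.  The
SerializationFailure class and run_with_retry() helper are hypothetical
stand-ins; a real driver would report the error with SQLSTATE 40001
through its own exception type.

```python
import random
import time


class SerializationFailure(Exception):
    """Hypothetical stand-in for a driver error carrying SQLSTATE 40001."""
    sqlstate = "40001"


def run_with_retry(txn, max_attempts=5, base_delay=0.0):
    """Run txn(); if it raises SerializationFailure, retry it from the start."""
    for attempt in range(1, max_attempts + 1):
        try:
            return txn()
        except SerializationFailure:
            if attempt == max_attempts:
                raise  # give up and let the caller see the failure
            # Short randomized backoff before re-running the transaction.
            time.sleep(base_delay * attempt * random.random())


# Demo: a fake transaction that fails twice, then succeeds.
attempts = []


def flaky_txn():
    attempts.append(1)
    if len(attempts) < 3:
        raise SerializationFailure("could not serialize access")
    return "committed"


result = run_with_retry(flaky_txn)
```

Code that already retries on serialization failure in this way would
presumably just retry a little more often under SSI; code that never
expected such a failure is the case being debated here.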


Re: Compatibility GUC for serializable

From
Josh Berkus
Date:
On 1/10/11 10:47 AM, Kevin Grittner wrote:
> If they're not using SERIALIZABLE, this patch will have no impact on
> them at all.  If they are using SELECT FOR UPDATE *with*
> SERIALIZABLE, everything will function exactly as it is except that
> there may be some serialization failures which they weren't getting
> before, either from the inevitable (but hopefully minimal) false
> positives inherent in the technique or because they missed covering
> something.

Right, that's what I'm worried about.  That's the sort of thing which is
very hard for a user to hunt down and troubleshoot, and could become a
blocker to upgrading.  Especially if the user has a vendor application
where they *can't* fix the code.  The only reason I'm ambivalent about
this is I'm unsure that there are more than a handful of people using
SERIALIZABLE in production applications, precisely because it's been so
unintuitive in the past.

Lemme start a survey on whether people use SERIALIZABLE.

--
                                  -- Josh Berkus
                                     PostgreSQL Experts Inc.
                                     http://www.pgexperts.com
 


Re: Compatibility GUC for serializable

From
"Kevin Grittner"
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I think we've learned over the years that GUCs that significantly
> change semantics can be foot-guns.  I'm not sure exactly how
> dangerous this one would be

I didn't respond to this at first because the idea seemed DOA, but
with Josh's concerns I guess I should answer this question.

With the patch, SERIALIZABLE transactions run exactly as they did
before, and as REPEATABLE READ continue to run, except that they are
monitored for read-write conflict patterns which can cause
serialization anomalies.  This monitoring doesn't introduce any new
blocking.  The only behavior change is that there are additional
serialization failures when the monitoring detects dangerous
structures in the rw-conflicts among transactions.  The proposed GUC
would suppress the monitoring in SERIALIZABLE mode and avoid the new
serialization failures, thereby providing legacy behavior --
anomalies and all.

-Kevin
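
The difference between the two behaviors described above can be made
concrete with a toy model.  The Python sketch below is an illustration
only, not the patch's algorithm: the Txn class, the on-call example and
has_dangerous_structure() are all assumptions made for this note.  Two
concurrent transactions each read both rows and update a different
one.  With monitoring off (legacy behavior) both commit and the "at
least one person on call" rule is silently broken; with monitoring on,
the pattern of rw-conflicts is reported as a serialization failure.

```python
class SerializationFailure(Exception):
    pass


class Txn:
    def __init__(self, name, db):
        self.name = name
        self.snapshot = dict(db)  # private snapshot taken at start
        self.reads = set()
        self.writes = {}

    def read(self, key):
        self.reads.add(key)
        return self.writes.get(key, self.snapshot[key])

    def write(self, key, value):
        self.writes[key] = value


def has_dangerous_structure(txns):
    """True if some transaction has both an incoming and an outgoing rw-conflict."""
    # Edge (a, b): a read a key that the concurrent transaction b wrote.
    edges = {(a.name, b.name) for a in txns for b in txns
             if a is not b and a.reads & set(b.writes)}
    names = [t.name for t in txns]
    return any((p, t) in edges and (t, q) in edges
               for t in names for p in names for q in names)


def run(monitor):
    db = {"alice_on_call": True, "bob_on_call": True}
    t1, t2 = Txn("T1", db), Txn("T2", db)  # concurrent: same starting snapshot
    for txn, me in ((t1, "alice_on_call"), (t2, "bob_on_call")):
        on_call = sum(txn.read(k) for k in ("alice_on_call", "bob_on_call"))
        if on_call >= 2:  # "someone else is still on call, so I can leave"
            txn.write(me, False)
    if monitor and has_dangerous_structure([t1, t2]):
        raise SerializationFailure("dangerous structure among rw-conflicts")
    # The write sets are disjoint, so snapshot isolation alone lets both commit.
    db.update(t1.writes)
    db.update(t2.writes)
    return db
```

Calling run(monitor=False) returns a state with nobody on call, while
run(monitor=True) raises SerializationFailure.  One simplification to
keep in mind: this toy fails the whole run, whereas a real
implementation would cancel only one of the transactions and let the
other commit.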


Re: Compatibility GUC for serializable

From
"Kevin Grittner"
Date:
I wrote:
> The proposed GUC would suppress the monitoring in SERIALIZABLE
> mode and avoid the new serialization failures, thereby providing
> legacy behavior -- anomalies and all.

After posting that I realized that there's no technical reason that
such a GUC couldn't be set within each session as desired, as long
as we disallowed changes after the first snapshot of a transaction
was acquired.  The IsolationIsSerializable() macro could be modified
to use that along with XactIsoLevel.

Really, the biggest risk of such a GUC is the confusion factor when
supporting people.  If we're told that the transactions involved in
some scenario were all run at the SERIALIZABLE isolation level, we
would need to wonder how many *really* were, and how many were (as
David put it) at the NOTREALLYSERIALIZABLEBUTLABELEDASSERIALIZABLE
isolation level?

-Kevin
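
The rule sketched in that message (a per-session setting that may not
change once the transaction has its first snapshot, consulted together
with the isolation level) can be modeled in a few lines.  This is a
Python illustration only, not backend code.  The Session class and its
methods are hypothetical; legacy_serializable borrows the name floated
earlier in the thread, and iso_level stands in for XactIsoLevel.

```python
SERIALIZABLE = "serializable"
REPEATABLE_READ = "repeatable read"


class Session:
    def __init__(self):
        self.legacy_serializable = False  # hypothetical compatibility setting
        self.iso_level = SERIALIZABLE     # stands in for XactIsoLevel
        self.snapshot_taken = False

    def set_legacy_serializable(self, value):
        # Disallow changes after the first snapshot of the transaction.
        if self.snapshot_taken:
            raise RuntimeError(
                "cannot change after the first snapshot of a transaction")
        self.legacy_serializable = value

    def take_snapshot(self):
        self.snapshot_taken = True

    def end_transaction(self):
        self.snapshot_taken = False

    def isolation_is_serializable(self):
        """Model of a modified IsolationIsSerializable() check."""
        return self.iso_level == SERIALIZABLE and not self.legacy_serializable


session = Session()
session.set_legacy_serializable(True)
monitored = session.isolation_is_serializable()  # False: legacy behavior
```

The support concern raised in the message shows up directly in the
model: two sessions that both report SERIALIZABLE as their isolation
level can give different answers from isolation_is_serializable().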


Re: Compatibility GUC for serializable

From
Jeff Davis
Date:
On Mon, 2011-01-10 at 11:29 -0800, Josh Berkus wrote:
> On 1/10/11 10:47 AM, Kevin Grittner wrote:
> > If they're not using SERIALIZABLE, this patch will have no impact on
> > them at all.  If they are using SELECT FOR UPDATE *with*
> > SERIALIZABLE, everything will function exactly as it is except that
> > there may be some serialization failures which they weren't getting
> > before, either from the inevitable (but hopefully minimal) false
> > positives inherent in the technique or because they missed covering
> > something.
> 
> Right, that's what I'm worried about.

If we must have a GUC, perhaps we could publish a sunset one release in
the future.

Regards,
    Jeff Davis



Re: Compatibility GUC for serializable

From
Josh Berkus
Date:
> If we must have a GUC, perhaps we could publish a sunset one release in
> the future.

I was thinking default to false/off in 9.1, and disappear in 9.3.

> Really, the biggest risk of such a GUC is the confusion factor when
> supporting people.  If we're told that the transactions involved in
> some scenario were all run at the SERIALIZABLE isolation level, we
> would need to wonder how many *really* were, and how many were (as
> David put it) at the NOTREALLYSERIALIZABLEBUTLABELEDASSERIALIZABLE
> isolation level?

How is this different from our other backwards-compatibility GUCs?

--
                                  -- Josh Berkus
                                     PostgreSQL Experts Inc.
                                     http://www.pgexperts.com
 


Re: Compatibility GUC for serializable

From
Robert Haas
Date:
On Mon, Jan 10, 2011 at 1:17 PM, Josh Berkus <josh@agliodbs.com> wrote:
> I'm going to disagree here. For a large, sprawling, legacy application
> changing SERIALIZABLE to REPEATABLE READ in every place in the code
> which might call it can be prohibitively difficult.

What makes you think that would be necessary?  That'd require someone
(a) using serializable, and (b) wanting it to be broken?  I think the
most common reaction would be "thank goodness, this thing actually
works now".

> Further, many such
> applications would be written with workarounds for broken serializable
> behavior, workarounds which would behave unpredictably after an upgrade.

Uh...  you want to support that with an example?  Because my first
reaction is "that's FUD".

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Compatibility GUC for serializable

From
Tom Lane
Date:
Josh Berkus <josh@agliodbs.com> writes:
> How is this different from our other backwards-compatibility GUCs?

Mainly, that it's not clear we need it.  Nobody's pointed to a concrete
failure mechanism that makes it necessary for an existing app to run
under fake-SERIALIZABLE mode.
        regards, tom lane


Re: Compatibility GUC for serializable

From
"Kevin Grittner"
Date:
Josh Berkus wrote:
>> Really, the biggest risk of such a GUC is the confusion factor
>> when supporting people.
>
> How is this different from our other backwards-compatibility GUCs?

I thought Tom might be concerned about such a GUC destabilizing
things in other ways.  I just wanted to make clear how unlikely that
was in this case.  I agree that the risk of confusion in support is
always there with a backwards-compatibility GUC.

I'm still not taking a position either way on this, since I can see
the merit of both arguments and it has little impact on me,
personally.  I'm just trying to be up-front about things so people
can make an informed decision.

-Kevin


Re: Compatibility GUC for serializable

From
Josh Berkus
Date:
> Mainly, that it's not clear we need it.  Nobody's pointed to a concrete
> failure mechanism that makes it necessary for an existing app to run
> under fake-SERIALIZABLE mode.

I think it's quite possible that you're right, and nobody depends on
current SERIALIZABLE behavior because it's undependable.  However, we
don't *know* that -- most of our users aren't on the mailing lists,
especially those who use packaged vendor software.

That being said, the case for a backwards-compatibility GUC is weak, and
I'd be ok with not having one barring someone complaining during beta,
or survey data showing that there's more SERIALIZABLE users than we think.

Oh, survey:
http://www.postgresql.org/community/

--
                                  -- Josh Berkus
                                     PostgreSQL Experts Inc.
                                     http://www.pgexperts.com
 


Re: Compatibility GUC for serializable

From
Pavel Stehule
Date:
2011/1/11 Robert Haas <robertmhaas@gmail.com>:
> On Mon, Jan 10, 2011 at 1:17 PM, Josh Berkus <josh@agliodbs.com> wrote:
>> I'm going to disagree here. For a large, sprawling, legacy application
>> changing SERIALIZABLE to REPEATABLE READ in every place in the code
>> which might call it can be prohibitively difficult.
>
> What makes you think that would be necessary?  That'd require someone
> (a) using serializable, and (b) wanting it to be broken?  I think the
> most common reaction would be "thank goodness, this thing actually
> works now".

It works, but it doesn't work perfectly. Some "important" toolkits,
such as performance benchmarks, don't run against PostgreSQL without
failures. That's one reason why PostgreSQL scores lower than MySQL in
some enterprise ratings. It works for current users, but it doesn't
work well for users who have to decide whether to migrate to
PostgreSQL. I don't see a problem with a GUC, but that isn't the real
issue. The more significant problem is how PostgreSQL's current
serializable implementation behaves with general applications (ones
that should work on several SQL servers). It's a blocker for one class
of customers.

Regards

Pavel Stehule

>
>> Further, many such
>> applications would be written with workarounds for broken serializable
>> behavior, workarounds which would behave unpredictably after an upgrade.
>
> Uh...  you want to support that with an example?  Because my first
> reaction is "that's FUD".


Re: Compatibility GUC for serializable

From
Florian Pflug
Date:
On Jan10, 2011, at 23:56 , Kevin Grittner wrote:
>> The proposed GUC would suppress the monitoring in SERIALIZABLE
>> mode and avoid the new serialization failures, thereby providing
>> legacy behavior -- anomalies and all.
>
> After posting that I realized that there's no technical reason that
> such a GUC couldn't be set within each session as desired, as long
> as we disallowed changes after the first snapshot of a transaction
> was acquired.  The IsolationIsSerializable() macro could be modified
> to use that along with XactIsoLevel.

From a security point of view, it seems dangerous to allow
such a GUC to be set by non-superusers. It might allow users to
e.g. circumvent some access control scheme by exploiting a race
condition that only exists without true serializability.

The risk of confusion is also much higher if such a thing can be
set per-session.

So, if we need such a GUC at all, which I'm not sure we do, I
believe it should be settable only from postgresql.conf and the
command line.

best regards,
Florian Pflug



Re: Compatibility GUC for serializable

From
Florian Pflug
Date:
On Jan10, 2011, at 20:29 , Josh Berkus wrote:
> The only reason I'm ambivalent about
> this is I'm unsure that there are more than a handful of people using
> SERIALIZABLE in production applications, precisely because it's been so
> unintuitive in the past.

I've used it quite extensively in the past. Usually either to run
two consecutive queries with the same snapshot, or to run an
UPDATE .. FROM (since that can be quite a foot-gun in READ COMMITTED
mode). 

In retrospect, I should have used REPEATABLE READ instead of
SERIALIZABLE, of course. But I didn't, and part of the reason
for that is our very own documentation. The way it is written
gives the impression that SERIALIZABLE is the "real" name of the
isolation level while REPEATABLE READ is a compatibility synonym,
and it also leads one to believe that true serializability isn't
something that will ever be implemented.

The two sections of "13.2 Transaction Isolation" dealing with
the isolation levels READ COMMITTED and REPEATABLE READ/SERIALIZABLE
are for example "13.2.1 Read Committed Isolation Level" and
"13.2.2 Serializable Isolation Level". 

And, at the end of 13.2, in the section about SERIALIZABLE vs.
true serializability, we say

"To guarantee true mathematical serializability, it is necessary
for a database system to enforce predicate locking, ....
Such a locking system is complex to implement and extremely expensive
in execution, ....  And this large expense is mostly wasted, since in
practice most applications do not do the sorts of things that could
result in problems. ...  For these reasons, PostgreSQL does not
implement predicate locking."

I'd be very surprised if nearly all of our users used only READ
COMMITTED isolation level. And given the wording of our documentation,
I'm quite certain I'm not the only one who used to spell that other
isolation level SERIALIZABLE, not REPEATABLE READ.

The question thus (again) comes down to whether we believe that for
virtually all of these users, true serializability is either an improvement
or at least no regression. I cannot come up with a case where that wouldn't
be the case *in theory*. In practice, however, Kevin's patch includes some
performance vs. false-positives trade-offs, and these *have* the potential
of biting people.

So, to summarize, I believe we need to look at the trade-offs - for
example the way conflict information is summarized to prevent memory
overflow - to judge whether there are any realistic workloads where
those might cause problems.

best regards,
Florian Pflug



Re: Compatibility GUC for serializable

From
Ron Mayer
Date:
Josh Berkus wrote:
>> Mainly, that it's not clear we need it.  Nobody's pointed to a concrete
>> failure mechanism that makes it necessary for an existing app to run
>> under fake-SERIALIZABLE mode.
> 
> I think it's quite possible that you're right, and nobody depends on
> current SERIALIZABLE behavior because it's undependable.  However, we
> don't *know* that -- most of our users aren't on the mailing lists,
> especially those who use packaged vendor software.
> 
> That being said, the case for a backwards-compatibility GUC is weak, and
> I'd be ok with not having one barring someone complaining during beta,
> or survey data showing that there's more SERIALIZABLE users than we think.
> 
> Oh, survey:
> http://www.postgresql.org/community/
> 

That Survey's missing one important distinction for that discussion.

Do you take the current survey answer
  "Yes, we depend on it for production code"

to imply
  "Yes, we depend on actual real SERIALIZABLE transactions in
   production and will panic if you tell us we're not getting that"

or
  "Yes, we depend on the legacy not-quite SERIALIZABLE transactions
   in production and don't want real serializable transactions"


Re: Compatibility GUC for serializable

From
"Kevin Grittner"
Date:
Ron Mayer <rm_pg@cheapcomplexdevices.com> wrote:
> That Survey's missing one important distinction for that
> discussion.
> 
> Do you take the current survey answer
> 
>    "Yes, we depend on it for production code"
> 
> to imply
> 
>    "Yes, we depend on actual real SERIALIZABLE transactions in
>     production and will panic if you tell us we're not getting
>     that"
> 
> or
> 
>    "Yes, we depend on the legacy not-quite SERIALIZABLE
>     transactions in production and don't want real serializable
>     transactions"

Yeah, I was reluctant to reply to that survey because we rely on it
to the extent that it works now, but it would not break anything if
we dropped in a real SERIALIZABLE implementation.  I fear that
choosing the "depend on it" answer would imply "don't want changes".

-Kevin


Re: Compatibility GUC for serializable

From
Josh Berkus
Date:
Kevin,

I think you overestimate what we can meaningfully put in a tiny
radio-button survey.

I'm only trying to get a straw poll idea of whether we have lots of
people using SERIALIZABLE mode *at all*, or (as I suspect) almost none.
If we get < 5% of respondents saying "we use it in production" then I
think we can assume that backwards compatibility isn't worth discussing.

--
                                  -- Josh Berkus
                                     PostgreSQL Experts Inc.
                                     http://www.pgexperts.com