Thread: Documenting serializable vs snapshot isolation levels

Documenting serializable vs snapshot isolation levels

From
"Kevin Grittner"
Date:
I'm still casting about to make sure I have my head around the issues
adequately to suggest a documentation update.  Here's my current
understanding.
The below is intended to help define the nature and scope of the
issue, not be the sort of text that belongs in user documentation.
Assume transactions T0, T1, and TN, where TN can be T1 or a different
transaction, and T0 is concurrent with T1 and TN.
Assume non-overlapping logical sets of data x and y (either or both of
which can be empty or not).
Differences in behavior between what is required by the standard for
serializable transactions and what occurs under snapshot isolation can
manifest when T0 reads x and modifies y based on what was read from x,
T1 modifies x in a way that would affect T0's modification to y, TN
reads y, and T1's modifications are visible to TN, either because it
is the same transaction or because T1 committed before TN got its
snapshot.
I think that transient SELECT anomalies can occur if TN also reads x
without modifying anything based on y.  My example of an incomplete
receipt journal is a case of this.
I think that persistent data integrity can be compromised when TN
modifies data based on y.  The example currently in the documentation
is a special case of this, where the modified data is part of x. 
Another example would be where someone attempted to maintain
referential integrity using snapshot isolation without additional
locks, T1 is TN, and one transaction checked for children before
deleting the parent while the other checked for a parent before
inserting a child.  Many other types of integrity enforcement would
fail if serializable behavior is assumed, such as ensuring the
existence of some minimum number of rows which meet certain criteria
(e.g., staffing) or ensuring that the sum of some numeric column for a
range of rows stays above a minimum amount (e.g., banking).
My initial intuition that simply applying the same locks which
guarantee serializable behavior in a non-snapshot database to a
snapshot database would guarantee integrity is just plain wrong. 
There are a number of academic papers suggesting techniques for
handling these issues.  Unfortunately, what I've seen so far suggests
that there are three main approaches available under PostgreSQL, all
with significant cost of one type or another.
(1)  A rigorous analysis of all queries allowed against the database
to determine where transaction conflicts can occur, with expertise
needed to resolve each.  Downsides are that ad hoc queries can cause
anomalies and that when new queries are introduced a time-intensive
review of all queries may need to be done, and locking changes may be
needed in programs which weren't otherwise modified.
(2)  A less rigorous examination might suffice, if rather brutal
table-level locks are applied mechanically.  Even this requires
knowledge of what's happening outside the area where the change is
being made.  If the receipt example is solved by adding a table lock
on the receipt table to the code which updates the control record, the
control record update must be modified if some other code modifies
some new table (say, receipt_voids) based on looking at the control
record to determine what to do.
(3)  A finer-grained approach would be to make no-effect updates to
rows to lock them if they are to be read for purposes of updating
something else in the transaction.  This could have a high cost in
disk access and table bloat.  It has the advantage of providing a
simple technique which, if applied consistently, doesn't require
knowledge of software beyond what is under development.
Of course, if the anomalies are infrequent enough and of a nature
which can be tolerated, the whole issue can be ignored, or monitored
for manual correction.
Any suggestions for refining the description of where the anomalies
can occur or how best to work around any particular classes of them
are welcome.  I'm particularly interested in practical methodologies
for ensuring that no holes are left for failures of business rules to
apply.  There are so many papers on the topic that I'm not sure where
to go next.
I hope someone can show me something good I've missed so far.
I haven't lost track of the language suggested by Robert Haas, which I
think frames the issue nicely.  I just want to follow it with a
reasonable description of where it's an issue and how to handle it.
-Kevin


Re: Documenting serializable vs snapshot isolation levels

From
Gregory Stark
Date:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:

> (3)  A finer-grained approach would be to make no-effect updates to
> rows to lock them if they are to be read for purposes of updating
> something else in the transaction.  This could have a high cost in
> disk access and table bloat.  It has the advantage of providing a
> simple technique which, if applied consistently, doesn't require
> knowledge of software beyond what is under development.

"no-effect updates" would be just the same as SELECT FOR UPDATE

However this has the same problem that we previously discussed where someone
can still add new records which would have changed the results of the query.

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com Ask me about EnterpriseDB's Slony Replication
support!


Re: Documenting serializable vs snapshot isolation levels

From
"Robert Haas"
Date:
On Mon, Dec 29, 2008 at 9:28 PM, Gregory Stark <stark@enterprisedb.com> wrote:
> "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
>> (3)  A finer-grained approach would be to make no-effect updates to
>> rows to lock them if they are to be read for purposes of updating
>> something else in the transaction.  This could have a high cost in
>> disk access and table bloat.  It has the advantage of providing a
>> simple technique which, if applied consistently, doesn't require
>> knowledge of software beyond what is under development.
>
> "no-effect updates" would be just the same as SELECT FOR UPDATE

...except that SELECT FOR UPDATE won't create table bloat, or as much
I/O... I think?

> However this has the same problem that we previously discussed where someone
> can still add new records which would have changed the results of the query.

...Robert


Re: Documenting serializable vs snapshot isolation levels

From
"Kevin Grittner"
Date:
>>> "Robert Haas" <robertmhaas@gmail.com> wrote: 
> On Mon, Dec 29, 2008 at 9:28 PM, Gregory Stark
<stark@enterprisedb.com> wrote:
>> "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
>>> (3)  A finer-grained approach would be to make no-effect updates
to
>>> rows to lock them if they are to be read for purposes of updating
>>> something else in the transaction.  This could have a high cost in
>>> disk access and table bloat.  It has the advantage of providing a
>>> simple technique which, if applied consistently, doesn't require
>>> knowledge of software beyond what is under development.
>>
>> "no-effect updates" would be just the same as SELECT FOR UPDATE
> 
> ...except that SELECT FOR UPDATE won't create table bloat, or as
much
> I/O... I think?
The effects are different, I think, in that there isn't a
serialization failure in some conflict cases where you would get one
with actual updates.  I found a paper on how to use updates to provide
serializable transactions in a snapshot database, and I'd have to
review closely to see how that difference affected the technique.  I
had been thinking that the WAL generation and bloat issues made the
technique pretty iffy, but if SELECT FOR UPDATE suffices in place of
most of the proposed updates, it just might be feasible.
-Kevin


Re: Documenting serializable vs snapshot isolation levels

From
"Robert Haas"
Date:
> The effects are different, I think, in that there isn't a
> serialization failure in some conflict cases where you would get one
> with actual updates.  I found a paper on how to use updates to provide
> serializable transactions in a snapshot database, and I'd have to
> review closely to see how that difference affected the technique.  I
> had been thinking that the WAL generation and bloat issues made the
> technique pretty iffy, but if SELECT FOR UPDATE suffices in place of
> most of the proposed updates, it just might be feasible.

In fact, I think SELECT FOR SHARE is enough.  That will give you
better concurrency, since it will block only updates and not
concurrent read transactions.

...Robert


Re: Documenting serializable vs snapshot isolation levels

From
"Kevin Grittner"
Date:
>>> "Robert Haas" <robertmhaas@gmail.com> wrote: 
>>  The effects are different, I think, in that there isn't a
>> serialization failure in some conflict cases where you would get
one
>> with actual updates.  I found a paper on how to use updates to
provide
>> serializable transactions in a snapshot database, and I'd have to
>> review closely to see how that difference affected the technique. 
I
>> had been thinking that the WAL generation and bloat issues made the
>> technique pretty iffy, but if SELECT FOR UPDATE suffices in place
of
>> most of the proposed updates, it just might be feasible.
> 
> In fact, I think SELECT FOR SHARE is enough.  That will give you
> better concurrency, since it will block only updates and not
> concurrent read transactions.
When I mentioned my initial intuition up-thread, it was exactly that. 
In tests, I quickly came up with two issues that make a general
solution harder to describe.  One is that (in the upstream jargon) if
TN and T1 are the same transaction, this totally fails to provide
serializable behavior.  (I can provide details for every case I
describe if people want them, but really....)  The other issue is that
neither FOR SHARE or FOR UPDATE is allowed in some contexts which
matter -- like subqueries and aggregate functions.
I'm still trying to sort it all out, but I welcome all suggestions --
it's entirely possible that I'm missing something that will simplify
the whole issue for me.
Thanks,
-Kevin


Re: Documenting serializable vs snapshot isolation levels

From
Simon Riggs
Date:
On Mon, 2008-12-29 at 18:13 -0600, Kevin Grittner wrote:

> I hope someone can show me something good I've missed so far.

You're viewing this in problem-exposed language, unintentionally I'm
sure. My viewpoint on this is that database concurrency is a big issue,
but that the way we do things round here is a major leap forward on the
way things happened previously (and still do in older-style DBMS).

Our approach to serializable queries is an optimistic one in two ways:
It covers most cases, but not all theoretical cases. It also avoids
locks by default. 

Those are good things, with many benefits. If we put the default the
other way around, developers would spend much more time re-tuning
queries that had locked each other out. So I would say we choose to
avoid locking-on-every-query with good reason. Just look at the
facilities DB2 provides to avoid it. Ugh-ly.

-- Simon Riggs           www.2ndQuadrant.comPostgreSQL Training, Services and Support



Re: Documenting serializable vs snapshot isolation levels

From
"Kevin Grittner"
Date:
>>> Simon Riggs <simon@2ndQuadrant.com> wrote: 
> On Mon, 2008-12-29 at 18:13 -0600, Kevin Grittner wrote:
> 
>> I hope someone can show me something good I've missed so far.
> 
> You're viewing this in problem-exposed language, unintentionally I'm
> sure.
Hmmm....  My meaning was, "I hope someone can point me to a good paper
summarizing the nature and scope of possible anomalies, to save me
time casting about or thinking it through."  If that came of as
"there's nothing good about the current approach," I apologize.  That
was certainly not what I meant to convey.
> My viewpoint on this is that database concurrency is a big issue,
> but that the way we do things round here is a major leap forward on
the
> way things happened previously (and still do in older-style DBMS).
I absolutely agree.
> Our approach to serializable queries is an optimistic one in two
ways:
> It covers most cases, but not all theoretical cases.
Not sure about "most".  Referential integrity is a pretty common use
case, and it is not covered without explicit locking.  Many other
common use cases are not, either.  I agree many are, and that the rest
can be worked around easily enough that I wouldn't want to see
blocking introduced to the degree that non-MVCC databases use for
serializable access.
Thanks for pointing out a bad choice of words; no need for this to
become muddled over simple misunderstandings.
-Kevin


Re: Documenting serializable vs snapshot isolation levels

From
"Robert Haas"
Date:
> Not sure about "most".  Referential integrity is a pretty common use
> case, and it is not covered without explicit locking.  Many other
> common use cases are not, either.  I agree many are, and that the rest
> can be worked around easily enough that I wouldn't want to see
> blocking introduced to the degree that non-MVCC databases use for
> serializable access.

What do you mean by referential integrity?  I don't believe you can
construct a foreign key problem at any transaction isolation level.

...Robert


Re: Documenting serializable vs snapshot isolation levels

From
"Kevin Grittner"
Date:
>>> "Robert Haas" <robertmhaas@gmail.com> wrote: 
>>  Not sure about "most".  Referential integrity is a pretty common
use
>> case, and it is not covered without explicit locking.  Many other
>> common use cases are not, either.  I agree many are, and that the
rest
>> can be worked around easily enough that I wouldn't want to see
>> blocking introduced to the degree that non-MVCC databases use for
>> serializable access.
> 
> What do you mean by referential integrity?  I don't believe you can
> construct a foreign key problem at any transaction isolation level.
I mean that if someone attempts to maintain referential integrity with
SQL code, without using explicit locks, it is not reliable. 
Presumably the implementation of foreign keys in PostgreSQL takes this
into account and blocks the kind of behavior shown below.  This
behavior would not occur with true serializable transactions.
-- setup
create table parent (parid int not null primary key);
create table child (chiid int not null primary key, parid int);
insert into parent values (1);

-- connection 1 (start of T0)
start transaction isolation level serializable;
select * from parent where parid = 1;
-- parent row exists; OK to insert child.
insert into child values (100, 1);

-- connection 2 (T1)
start transaction isolation level serializable;
select * from child where parid = 1;
-- child row doesn't exist; OK to delete parent
delete from parent where parid = 1;
commit;

-- connection 1 (end of T0)
commit transaction;

-- database now lacks referential integrity
select * from parent;parid
-------
(0 rows)

select * from child;chiid | parid
-------+-------  100 |     1
(1 row)

-Kevin


Re: Documenting serializable vs snapshot isolation levels

From
Tom Lane
Date:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
>> What do you mean by referential integrity?  I don't believe you can
>> construct a foreign key problem at any transaction isolation level.
> I mean that if someone attempts to maintain referential integrity with
> SQL code, without using explicit locks, it is not reliable. 
> Presumably the implementation of foreign keys in PostgreSQL takes this
> into account and blocks the kind of behavior shown below.  This
> behavior would not occur with true serializable transactions.

IIRC the RI code has to fudge the normal serializable-snapshot behavior
in order to guarantee no constraint violation --- it has to be aware of
concurrent changes that would otherwise be invisible to a serializable
transaction.
        regards, tom lane


Re: Documenting serializable vs snapshot isolation levels

From
"Robert Haas"
Date:
> I mean that if someone attempts to maintain referential integrity with
> SQL code, without using explicit locks, it is not reliable.
> Presumably the implementation of foreign keys in PostgreSQL takes this
> into account and blocks the kind of behavior shown below.  This
> behavior would not occur with true serializable transactions.

I see your point now, but I don't think we're really getting anywhere
here.   Doing it this way rather than using a foreign key constraint
is dumb, and a foreign key constraint works fine - so I think Simon's
statement that our approach works in most cases is 100% accurate.
With respect to your example here, we're right back to what I said way
upthread: if you're worried about concurrent updates or deletes,
SELECT ... FOR SHARE is sufficient.  If you're worried about
concurrent inserts, as you are here (delete from parent wants to make
sure no row can be concurrently inserted into child), you need to take
a SHARE lock on the table into which you want to prevent inserts.

As you pointed out, SELECT ... FOR SHARE isn't always available; when
it isn't, you can either rewrite the query - if that's feasible - or
take a SHARE lock on the table instead.

It really seems to me that we're going around in circles here.  I
don't feel that statements like this are advancing the dialogue one
bit:

> Referential integrity is a pretty common use case, and it is not covered without explicit locking.  Many other
> common use cases are not, either.

I believe this to be plain false.  Referential integrity as I
understand it (i.e. foreign key constraints, rather than some
half-baked home grown approach) works fine without explicit locking
and without even changing the transaction isolation level.  The
assertion that there are many other common use cases that are not
covered is hand-waving unsupported by evidence.  The only problems
you've raised so far are well-known problems in database theory; I
learned about them from Jim Gray's 1993 "Transaction Processing", but
that's about a 700 page book.  I suspect there are shorter texts that
you could read to pick up the main ideas but I'm not familiar with
them so I can't provide any pointers.

On further review, I actually think that our documentation is pretty
clear about this topic, too.  Everything we've talked about thus far
all seems to be spelled out in chapter 13:

http://www.postgresql.org/docs/8.3/interactive/mvcc-intro.html
http://www.postgresql.org/docs/8.3/interactive/transaction-iso.html
http://www.postgresql.org/docs/8.3/interactive/explicit-locking.html
http://www.postgresql.org/docs/8.3/interactive/applevel-consistency.html

Note in particular section 13.2.2.1. Serializable Isolation versus
True Serializability

...Robert


Re: Documenting serializable vs snapshot isolation levels

From
Simon Riggs
Date:
On Fri, 2009-01-02 at 13:47 -0500, Tom Lane wrote:
> "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> >> What do you mean by referential integrity?  I don't believe you can
> >> construct a foreign key problem at any transaction isolation level.
>  
> > I mean that if someone attempts to maintain referential integrity with
> > SQL code, without using explicit locks, it is not reliable. 
> > Presumably the implementation of foreign keys in PostgreSQL takes this
> > into account and blocks the kind of behavior shown below.  This
> > behavior would not occur with true serializable transactions.
> 
> IIRC the RI code has to fudge the normal serializable-snapshot behavior
> in order to guarantee no constraint violation --- it has to be aware of
> concurrent changes that would otherwise be invisible to a serializable
> transaction.

...just to add that this is exactly as required by SQL Standard, i.e. RI
works in Read Committed mode even within a Serializable transaction.

-- Simon Riggs           www.2ndQuadrant.comPostgreSQL Training, Services and Support



Re: Documenting serializable vs snapshot isolation levels

From
"Kevin Grittner"
Date:
I've rearranged the sequence of some lines in the previous post to
facilitate discussion.  I hope no offense is taken.
>>> "Robert Haas" <robertmhaas@gmail.com> wrote: 
> On further review, I actually think that our documentation is pretty
> clear about this topic, too.  Everything we've talked about thus far
> all seems to be spelled out in chapter 13:
> 
> http://www.postgresql.org/docs/8.3/interactive/mvcc-intro.html
> http://www.postgresql.org/docs/8.3/interactive/transaction-iso.html
> http://www.postgresql.org/docs/8.3/interactive/explicit-locking.html
>
http://www.postgresql.org/docs/8.3/interactive/applevel-consistency.html
> 
> Note in particular section 13.2.2.1. Serializable Isolation versus
> True Serializability
I read all of the above over very carefully, several times, before
starting this thread.  These are precisely the sections I feel could
use correction and improvement.
> Doing it this way rather than using a foreign key constraint
> is dumb, and a foreign key constraint works fine
The point is that it is something that would work reliably under
serializable isolation, but not under snapshot isolation.  I picked it
merely because it is a simple integrity test that someone might choose
to enforce in a trigger in some other database, and might not
recognize it as an unreliable technique in PostgreSQL.  Dumb or not,
they may lose integrity after moving to PostgreSQL if they miss this,
and I propose documenting the issue to assist people.
> The only problems
> you've raised so far are well-known problems in database theory; I
> learned about them from Jim Gray's 1993 "Transaction Processing",
but
> that's about a 700 page book.  I suspect there are shorter texts
that
> you could read to pick up the main ideas but I'm not familiar with
> them so I can't provide any pointers.
> With respect to your example here, we're right back to what I said
way
> upthread: if you're worried about concurrent updates or deletes,
> SELECT ... FOR SHARE is sufficient.  If you're worried about
> concurrent inserts, as you are here (delete from parent wants to
make
> sure no row can be concurrently inserted into child), you need to
take
> a SHARE lock on the table into which you want to prevent inserts.
This advice seems consistent with the current PostgreSQL documentation
(cited above) and might lead one to believe that in the example you
reference, adding a FOR SHARE to the SELECT which confirms the
existence of the parent row, and a LOCK TABLE on the child table at
the start of the transaction which does the DELETE of the parent would
provide integrity.  It does not; try it if you want confirmation.  It
does introduce blocking, but after the block clears, the result in the
database is identical to the example as originally posted.  This is
why I think the documentation could use enhancement.
> It really seems to me that we're going around in circles here.
Agreed.  I guess I contributed to that by questioning whether "most"
or "many" was a more appropriate adjective, which is pretty
irrelevant, really.  I'll try to stay focused on examples of things
that work in one environment and don't in the other, with tips to get
the desired behavior within PostgreSQL.  I have come up with many more
examples of these than I have posted on-list, but posting every single
example doesn't seem valuable to me.  I'm trying to generalize to
provide useful guidelines, but feel sure that I'm re-inventing the
wheel here.
Thanks for suggesting Jim Gray's "Transaction Processing".  I'll look
for it.  If it's framing things from a theoretical point of view,
there will be some work necessary to distill it down to the concise
and practical advice which I've found necessary to effectively guide
application programmers, but at least I can do it with more confidence
that I've covered all the relevant ground.
-Kevin