Thread: Re: Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle

Re: Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle

From
"Kevin Grittner"
Date:
[slight rearrangement]
Florian Pflug  wrote:
> I'm very exited about the work you're doing
Always nice to hear.  :-)
> I view my proposal as pretty orthogonal to that work.
> My proposal allows for simple FK-like constraints to be
> implemented at user-level that are correct for all isolation
> levels.
OK, I can see the attraction in that.
> True serializable transaction are much more powerful than what I
> proposed, but at a much higher price too, due to the necessity of
> SIREAD locks.
I think that SIREAD locks will generally be cheaper than SELECT FOR
UPDATE, since the former don't require any disk I/O and the latter
do.  I only have one benchmark so far (more on the way), but it
attempts to isolate the cost of acquiring the SIREAD locks by using
a read-only load against a fully cached database.  Benchmarks so far
show the new version of the SERIALIZABLE level as supporting 1.8%
fewer TPS than REPEATABLE READ (the existing snapshot isolation
level) in that environment.  That will probably disappear into the
noise for any load involving disk I/O.
Now *rollbacks*, particularly those due to false positives, might
become a more serious issue in some pessimal loads, but I'm still
working on developing meaningful benchmarks for that.
I guess what I'm suggesting is that unless you have a very small
database with a very large number of connections in a high
contention workload, or you can't require SERIALIZABLE transaction
isolation level, SSI might actually perform better than what you're
proposing.  Of course, that's all conjecture until there are
benchmarks; but I'd be very interested in getting any and all
alternative solutions like this worked into a benchmark -- where I
can pull out the FOR UPDATE and FOR SHARE clauses, any redundant
updates or denormalizations added just for concurrency issues, and
all explicit locking -- and compare that under SERIALIZABLE to the
original performance.
-Kevin



Re: Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle

From
Florian Pflug
Date:
On May 14, 2010, at 12:56 , Kevin Grittner wrote:
>> True serializable transaction are much more powerful than what I
>> proposed, but at a much higher price too, due to the necessity of
>> SIREAD locks.
>
> I think that SIREAD locks will generally be cheaper than SELECT FOR
> UPDATE, since the former don't require any disk I/O and the latter
> do.  I only have one benchmark so far (more on the way), but it
> attempts to isolate the cost of acquiring the SIREAD locks by using
> a read-only load against a fully cached database.  Benchmarks so far
> show the new version of the SERIALIZABLE level as supporting 1.8%
> fewer TPS than REPEATABLE READ (the existing snapshot isolation
> level) in that environment.  That will probably disappear into the
> noise for any load involving disk I/O.

I can see how a single SIREAD lock can potentially be cheaper than a FOR SHARE or FOR UPDATE lock. But the number of
SIREADlocks would exceed the number of FOR SHARE / FOR UPDATE locks by a few order of magnitude I'd think - at least of
youran even transaction under true serializable isolation. 

I don't quite understand how SIREAD locks work if they don't involve any disk IO, since shared memory isn't resizable.
ButI guess I'll find out once you post the patch ;-) 

> I guess what I'm suggesting is that unless you have a very small
> database with a very large number of connections in a high
> contention workload, or you can't require SERIALIZABLE transaction
> isolation level, SSI might actually perform better than what you're
> proposing.

That is entirely possible. However, unless your patch completely removes support for snapshot isolation (what is
currentcalled SERIALIZABLE), my proposal still eliminates the situation that user-level constraints are correct in READ
COMMITTEDand (true) SERIALIZABLE isolation but not in snapshot isolation. 

Btw, the only user of FOR SHARE locks inside postgres proper are the RI triggers, and those do that special crosscheck
whencalled within a SERIALIZABLE transactions. I do take this as evidence that the current behavior might not be all
thatuseful with serializable transactions... 

best regards,
Florian Pflug




Re: Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle

From
"Kevin Grittner"
Date:
Florian Pflug <fgp@phlo.org> wrote:
> On May 14, 2010, at 12:56 , Kevin Grittner wrote:
>> I think that SIREAD locks will generally be cheaper than SELECT
>> FOR UPDATE, since the former don't require any disk I/O and the
>> latter do.
> I can see how a single SIREAD lock can potentially be cheaper than
> a FOR SHARE or FOR UPDATE lock. But the number of SIREAD locks
> would exceed the number of FOR SHARE / FOR UPDATE locks by a few
> order of magnitude I'd think - at least of you ran even
> transaction under true serializable isolation.
> 
> I don't quite understand how SIREAD locks work if they don't
> involve any disk IO, since shared memory isn't resizable.
We use a well-worn technique used by many (most?) database products
-- granularity promotion.  This is one of the things which could
cause enough false positives under some loads to cause your
technique to perform better than SSI for those loads.
> unless your patch completely removes support for snapshot
> isolation (what is current called SERIALIZABLE)
Both SERIALIZABLE and REPEATABLE READ currently map to snapshot
isolation.  We're leaving REPEATABLE READ alone.
> my proposal still eliminates the situation that user-level
> constraints are correct in READ COMMITTED and (true) SERIALIZABLE
> isolation but not in snapshot isolation.
Agreed.  If someone wants to enforce user-level constraints using
SSI, they will somehow need to ensure that less strict isolation
levels are never used to modify data.  Your approach lifts that
burden.
By the way, if you can make this behave in a similar way to Oracle,
especially if the syntax is compatible, I'm sure it will help
promote PostgreSQL adoption.  At PostgreSQL Conference U.S. East
2010, I talked briefly with a couple guys from an Oracle shop who
were looking at converting to PostgreSQL, and were very concerned
about not having what you describe.  The techniques required to
ensure integrity in PostgreSQL were not, to put it mildly, appealing
to them.  I suspect that they would be satisfied with *either* SSI
or the change you describe.
-Kevin


Re: Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle

From
Florian Pflug
Date:
On May 14, 2010, at 15:54 , Kevin Grittner wrote:
> Florian Pflug <fgp@phlo.org> wrote:
>> On May 14, 2010, at 12:56 , Kevin Grittner wrote:
>> unless your patch completely removes support for snapshot
>> isolation (what is current called SERIALIZABLE)
>
> Both SERIALIZABLE and REPEATABLE READ currently map to snapshot
> isolation.  We're leaving REPEATABLE READ alone.

Ah, yeah, that makes a lot of sense. I kinda had forgotten about REPEATABLE READ...

>> my proposal still eliminates the situation that user-level
>> constraints are correct in READ COMMITTED and (true) SERIALIZABLE
>> isolation but not in snapshot isolation.
>
> Agreed.  If someone wants to enforce user-level constraints using
> SSI, they will somehow need to ensure that less strict isolation
> levels are never used to modify data.  Your approach lifts that
> burden.
>
> By the way, if you can make this behave in a similar way to Oracle,
> especially if the syntax is compatible, I'm sure it will help
> promote PostgreSQL adoption.  At PostgreSQL Conference U.S. East
> 2010, I talked briefly with a couple guys from an Oracle shop who
> were looking at converting to PostgreSQL, and were very concerned
> about not having what you describe.  The techniques required to
> ensure integrity in PostgreSQL were not, to put it mildly, appealing
> to them.  I suspect that they would be satisfied with *either* SSI
> or the change you describe.

My proposal would make SELECT ... FOR UPDATE behave like Oracle does with regard to serialization conflicts. SELECT ...
FORSHARE doesn't seem to exist on Oracle at all - at least I couldn't find a reference to it in the docs. 

The syntax isn't 100% compatible because Oracle seems to expect a list of columns after the FOR UPDATE clause, while
postgresexpects a list of tables. 

I must admit that I wasn't able to find an explicit reference to Oracle's behavior in their docs, so I had to resort to
experiments.They do have examples showing how to do FK-like constraints with triggers, and those don't contain any
warningwhatsoever about problems in SERIALIZABLE mode, though. But still, if there is word on this from Oracle
somewhere,I'd love to hear about it. 

best regards,
Florian Pflug



Re: Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle

From
"Kevin Grittner"
Date:
Florian Pflug <fgp@phlo.org> wrote:
> I must admit that I wasn't able to find an explicit reference to
> Oracle's behavior in their docs, so I had to resort to
> experiments. They do have examples showing how to do FK-like
> constraints with triggers, and those don't contain any warning
> whatsoever about problems in SERIALIZABLE mode, though.  But
> still, if there is word on this from Oracle somewhere, I'd love to
> hear about it.
I suspect that in trying to emulate Oracle on this, you may run into
an issue which posed challenges for the SSI implementation which
didn't come up in the Cahill prototype implementations: Oracle, and
all other MVCC databases I've read about outside of PostgreSQL, use
an "update in place with a rollback log" technique.  Access to any
version of a given row or index entry goes through a single
location, with possible backtracking through the log after that,
which simplifies management of certain concurrency issues.  Do they
perhaps use an in-RAM lock table, pointing to the "base" location of
the row for these SELECT FOR UPDATE locks?  (Just guessing; I've
never used Oracle, myself.)
-Kevin


Re: Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle

From
Rob Wultsch
Date:
On Fri, May 14, 2010 at 7:32 AM, Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:
> Oracle, and all other MVCC databases I've read about outside of PostgreSQL, use
> an "update in place with a rollback log" technique.

Have you looked at PBXT (which is explicitly NOT SERIALIZABLE)?

-- 
Rob Wultsch
wultsch@gmail.com