Thread: Hibernate / other ORM and pg_advisory_lock
Hi all

I'm about to settle on a database implementation for the Java part of the development project I'm working on, and I'm wondering if anybody here has any experience using PostgreSQL's advisory locking features with ORM database layers like Hibernate, TopLink, OpenJPA, Cayenne, iBatis, etc. I'm hoping to avoid relying on locking via long-running transactions that stay open across user interaction (ugh), and pg_advisory_lock etc. seem well suited to the task.

At this point it's looking like I'll be using Hibernate (via Hibernate EntityManager / JPA), so I'm especially interested in comments related to it. In particular, in a J2SE desktop environment, has anybody here made use (successful or otherwise) of Pg's advisory locking features? They work fine in my test cases with Hibernate EntityManager and with OpenJPA (not with TopLink Essentials, due to connection management issues), but working in a few reasonably simple tests is very different from working reliably in production.

I'm particularly worried about ensuring that connections aren't cached, pooled, or terminated and re-established behind my back - say, after an error that results in a transaction rolling back. A connection drop/re-establish would release any advisory locks being held, and if that's a risk then data must be selected FOR UPDATE and compared to the old copies before making any changes. Which also involves persuading the ORM layer to do that, and do it the right way.

[Java newbie warning] I'm also curious about strategies people have used to ensure reliable advisory lock release when the persistence layer finishes with the object. I'd prefer to avoid the need to explicitly release the locks, instead having them released when the persistence layer "forgets" about the object. In C++ (without a GC, at least) I might just use a dtor, but with the unpredictability of destruction timing in a GC'd environment that's not so attractive in Java.

Right now I'm inclined to define a finish() method that causes all subsequent calls to accessors to throw, and call it from finalize() in case some code path fails to call it manually when done with the object. Totally stupid, or workable sounding? [/Java newbie warning]

So ... any thoughts/experiences using Hibernate or other ORM tools with Pg's advisory locking?

--
Craig Ringer
Hi Craig,

I have not used pg_advisory_lock, so I can't comment on its use with the various tools.

With JPA you can add a version column to your table, either an integer or a timestamp datatype - the preference being an integer field, as a timestamp column may not be granular enough to avoid data integrity issues - and flag that column as a version field with the @Version annotation in your corresponding entity, which tells JPA to use the field for efficient optimistic locking.

I'm happy to stand corrected, but my understanding of JPA is that by default it uses optimistic locking anyway. Without a @Version field, though, data integrity is not guaranteed, as it uses a last-commit-wins strategy. If a @Version annotation exists, JPA compares the version values, and if the value on the tuple in the database no longer matches the one in the object you hold in memory, JPA will throw an OptimisticLockException. Without the @Version annotation, you run the risk of blowing away someone else's entries when you do your update with older data.

If you are happy with just a dirty read and do not need an optimistic lock, you can use the @OptimisticLock(excluded=true) annotation, which is particularly useful on relationships that you will not be updating at the same time as the current entity, such as reference values.

To set pessimistic locking, you need to go to the EntityManager and place the lock there, with the different JPA vendors all having different ways of doing so, which doesn't make pessimistic locking in JPA very portable. A pessimistic locking strategy such as pg_advisory_lock also ties you to the database vendor - again not a very portable approach, which may still be okay for you. Additionally, any pessimistic locking will limit the scalability of your solution. It may be the best solution for your particular domain model, but in most cases pessimistic locking is generally deemed an evil.
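For what it's worth, the optimistic check that a @Version column drives boils down to a compare-and-bump on the version value. Here is a minimal sketch of that logic in plain Java; the VersionedStore class is purely illustrative (it is not a JPA API), with an in-memory map standing in for the table:

```java
// Simulates the optimistic-lock check JPA performs for a @Version field:
// an UPDATE only succeeds if the caller's version still matches the stored
// version; on success the version is incremented, otherwise the caller gets
// the equivalent of an OptimisticLockException.
import java.util.HashMap;
import java.util.Map;

public class VersionedStore {
    // row id -> { value, version }
    private final Map<Long, long[]> rows = new HashMap<>();

    public void insert(long id, long value) {
        rows.put(id, new long[] { value, 1 });   // initial version 1
    }

    public long[] read(long id) {
        long[] r = rows.get(id);
        return new long[] { r[0], r[1] };        // detached copy, like a JPA entity
    }

    /** Returns true if the update won; false models an OptimisticLockException. */
    public boolean update(long id, long newValue, long expectedVersion) {
        long[] r = rows.get(id);
        if (r[1] != expectedVersion) {
            return false;                        // someone else committed first
        }
        r[0] = newValue;
        r[1] = expectedVersion + 1;              // JPA bumps the @Version column
        return true;
    }
}
```

Two clients reading the same row and both trying to write demonstrates last-writer-loses: the first update wins and bumps the version, the second sees a stale version and is rejected.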
I would be very hesitant to go down the path of using pg_advisory_lock, as from what I understand the locks are tied to the session lifetime, not to a transaction. So the lock will persist even after a transaction is committed or rolled back, which means the onus is on you to call pg_advisory_unlock successfully to remove the lock at the appropriate time, and you need to manage the uniqueness of the lock handle across multiple clients. I can see some really useful things to do with these functions, but I wouldn't use them for plain CRUD work - that is what the ORM layer is there for.

I'm not quite sure I follow what your concern is with the transaction. If I understand correctly, you are concerned that after a transaction rolls back, a subsequent update may not apply correctly because it is updating with a stale version that was cached? If you use the @Version annotation, that resolves the issue while still providing you with full ACID behaviour, though in a highly concurrent environment you may need to handle a number of OptimisticLockExceptions in the client - which is usually better than handling longer transaction times per client. Using either optimistic or pessimistic locking will not stop you from getting race conditions and deadlocks if you have not designed the application well. But then again, even the best developers run into them from time to time.

Additionally, with an ORM such as JPA, the entities you use to populate your client UI are created outside of a transaction - or at least created by default with a read lock, which is dropped once the entities are created. It is only when you need to add a new record, or update an existing one, that you enlist a transaction for that particular event. So transactions in an optimistic locking model should have very short life spans. If you have a long-running transaction using pessimistic locking while waiting for user interaction, then you probably need to reconsider your design.
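On managing the uniqueness of the lock handle: pg_advisory_lock accepts either a single bigint or a pair of ints, so one common convention - an application-level scheme, not anything Pg mandates, and the "class id per table" idea here is purely an assumption for illustration - is to pack a per-table class id and a row id into the single 64-bit key. A sketch:

```java
// Packs an application-assigned per-table "class id" and a row id into the
// single bigint key form accepted by pg_advisory_lock(bigint). This mirrors
// the two-argument pg_advisory_lock(int, int) overload, which partitions the
// key space the same way.
public class AdvisoryLockKey {
    public static long key(int classId, int rowId) {
        return ((long) classId << 32) | (rowId & 0xFFFFFFFFL);
    }

    public static int classId(long key) {
        return (int) (key >>> 32);   // high 32 bits
    }

    public static int rowId(long key) {
        return (int) key;            // low 32 bits
    }
}
```

The round-trip property is what matters: two different (table, row) pairs can never collide on the same key, so lock handles stay unique across clients as long as everyone uses the same class-id assignments.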
With an ORM such as JPA, the EM should clean up any locks on completion of the transaction, though you should ideally declare the completion of the transaction explicitly with either a commit or a rollback. The typical pattern is to wrap the DB call in a try/catch block: commit at the end of the try, rollback in the catch, and close your JPA/JDBC objects in the finally block - themselves wrapped in a try block, depending on how defensive your coding style is. However, some web frameworks such as Seam take care of the whole transaction cycle by default without any need to call transaction demarcation points, and you need to override that behaviour if you want more atomic control.

In Java, unlike a C++ destructor, the finalize() method is non-deterministic: it only gets run when the GC cleans up the object, which may be never. It is not a good idea to depend on it to clean up your resources. Additionally, the finalize() method is only ever called once on an object. So if the object is GC'd, the finalize method is called, and in the finalize method it resurrects itself - perhaps by passing a reference to itself to another object - then the finalize method will not be called again for that particular object when it next comes time to dispose of it. Generally you avoid the finalize method, as there are not many cases where you would depend on it, other than as a last-ditch, best-effort attempt to clean up any resources you were not otherwise able to. It is another coding style that is generally deemed an evil.

Hopefully I have answered your question, and if not, at least pointed you in the right direction.

Cheers,
Andy

Craig Ringer wrote:
> [...]
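The commit-in-try / rollback-in-catch / close-in-finally pattern described above can be sketched as follows. FakeTx is a stand-in for javax.persistence.EntityTransaction (begin/commit/rollback) plus EntityManager.close(), purely so the example runs without a database; with real JPA the calls would be em.getTransaction().begin()/commit()/rollback() and em.close():

```java
// A runnable sketch of the transaction demarcation pattern: commit at the
// end of the try, rollback in the catch, release resources in the finally.
public class TxPattern {
    public static class FakeTx {
        public boolean committed, rolledBack, closed;
        public void begin() {}
        public void commit() { committed = true; }
        public void rollback() { rolledBack = true; }
        public void close() { closed = true; }
    }

    /** Runs work inside a transaction; returns true if it committed. */
    public static boolean inTransaction(FakeTx tx, Runnable work) {
        tx.begin();
        try {
            work.run();
            tx.commit();          // commit at the end of the try block
            return true;
        } catch (RuntimeException e) {
            tx.rollback();        // rollback in the catch block
            return false;
        } finally {
            tx.close();           // always release resources in finally
        }
    }
}
```

Whatever happens inside the work, the finally clause guarantees the resources are released, which is exactly the property you want when a dangling connection would silently hold (or drop) advisory locks.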
Thanks for your interesting and clear response. My comments are interleaved below.

Andrew wrote:
> With JPA you can add a version column to your table, either an integer
> or timestamp datatype, with the preference being an integer field as
> the timestamp column may not be granular enough to avoid data
> integrity issues, and flag that column as a version field with the
> @Version annotation in your corresponding entity, which will then tell
> JPA to use this field for efficient optimistic locking.

At least in a desktop app it's sometimes desirable to inform a user that (for example) somebody else is currently modifying a customer record, so they can't edit it right now, and then possibly offer to wait. In this environment that's significantly preferable, in user interface terms, to telling the user "Whoops, can't save your changes, somebody else beat you to it", reloading the data in the UI, and getting them to make their changes again. That's not something I can do with optimistic locking, and doing it with JPA pessimistic locking means a transaction must be held open during user think-time. Or lunch-time, for that matter, though I'd time out any exclusive access after a while. Such long-running transactions are not an attractive option.

The locking to protect against conflicting concurrent updates in a transaction context is required even if advisory locking is used; the advisory locking is just useful to minimise the risk of conflicts in cases where the conflict can't be resolved without bugging the user. The app should still work and shouldn't mangle data even if advisory locking attempts were replaced with a call that always succeeded immediately, though it'd have to force users to retry/re-enter some things.

> I'm happy to stand corrected, but my understanding of JPA is that by
> default it uses optimistic locking anyway. But without a @Version
> field, data integrity is not guaranteed as it uses the last commit
> wins strategy [..]
> To set pessimistic locking, you need to go to the
> Entity Manager and place the lock there, with the different JPA
> vendors all having different ways of doing so, which doesn't make
> using pessimistic locking in JPA very portable.

Yep, that was my understanding. I was really hoping there'd also be the option to use traditional SELECT ... FOR UPDATE locking, because I have existing apps using the database that handle trickier updates that way, but it doesn't look like that's covered by the JPA spec. The spec defines a lock(...) method on the EntityManager:

http://java.sun.com/javaee/5/docs/api/javax/persistence/EntityManager.html#lock(java.lang.Object,%20javax.persistence.LockModeType)

and defines the LockModeType enumeration, though it assumes that locking will be done using a version column rather than using SELECT ... FOR UPDATE.

Other apps use this database, and it's not at all attractive to be forced to use the (to my mind clunky, though efficient under high loads) JPA version column approach for update collision detection. All the other DB-using code would also have to be altered to increment the version column, and all of it already uses normal database concurrency control, such as safe updates (UPDATE blah SET thing = thing + 1) or SELECT ... FOR UPDATE locking, to make sure everything goes smoothly. I might be able to do something hairy like a trigger that increments the version column on UPDATE only if it's not explicitly updated by the app, but I'd prefer to get the Java app to fit in with the usual locking and concurrency control scheme. This is the sort of thing I was worried about with the use of ORM tools.

> Using a pessimistic locking strategy such as the use of
> pg_advisory_lock ties you to the database vendor, again not a very
> portable approach, which may still be okay for you.

It is OK for the workload of this app, but certainly an issue for some apps.
This app is already highly tied to Pg, and to a fair extent is just a user interface for the brains of the system that live in Pg.

> Additionally, any pessimistic locking will limit the scalability of
> your solution.

Again, no big deal in the cases it's intended for. Areas where I'd be using advisory locking are generally low-concurrency, and the advisory locking is intended to improve the "user experience" rather than act as a data protection strategy. The app will still use appropriate locking during the transaction used to actually commit the changes; the advisory lock is to ensure there's no chance of another user trying to then modify the same data and having the change rejected.

> It may be the best solution for your particular domain model, but in
> most cases, pessimistic locking is generally deemed an evil. I would
> be very hesitant in going down the path of using pg_advisory_locks as
> from what I understand, they are tied to the session life, not to a
> transaction. So the lock will persist even after a transaction is
> committed or rolled back. Which means that the onus is on you to call
> pg_advisory_unlock successfully to remove the lock at the appropriate
> time and you need to manage the uniqueness of the lock handle across
> multiple clients.

Yep, those are exactly the characteristics that make advisory locking attractive. A lock can be taken to inform other apps that "the user intends to modify this <thing>", where <thing> might be a particular record, a collection of records with a certain common characteristic, etc. Other apps can test the lock and use that information for things like warning users "this <thing> is being modified by someone else", and can also, where useful, wait on the lock. No long-running transactions holding locks need to be kept open. The locks are also neatly cleaned up on disconnect, so there are fewer issues with dangling locks if an app crashes, loses its network connection, etc.
To my mind advisory locking is entirely separate to, and does not replace, proper locking or versioning during transactions; the goal is to minimise conflicts where those conflicts would result in user annoyance or confusion. The alternative is to use an appserver for inter-client messaging to essentially the same effect, or to use a database table for non-waitable advisory locking (and lots of MVCC bloat).

> I'm not quite sure I follow what your concern is with the
> transaction. If I understand correctly you are concerned that with a
> transaction rolling back and then a subsequent update not updating
> correctly as it is updating with a stale version that was cached? If
> you use the @Version annotation, this will resolve that issue, while
> still providing you with full ACID behaviour, but in a highly
> concurrent environment, you may need to handle a number of
> OptimisticLockExceptions in the client, which is usually better than
> handling longer transaction times per client. Using either Optimistic
> or Pessimistic locking though will not stop you from getting race
> conditions and deadlocks if you have not designed the application
> well. But then again, even the best developers run into them from time
> to time. Additionally, with an ORM such as JPA, the entities you use
> to populate your client UI are created outside of a transaction - at
> least created by default with a read lock which is dropped once the
> entities are created. It is only when you need to add a new record,
> or update an existing one, that you enlist a transaction for that
> particular event. So transactions in an Optimistic locking model
> should have very short life spans. If you have a long running
> transaction using pessimistic locking while waiting for user
> interaction, then you probably need to reconsider your design.

The above matches my understanding of the situation.
I also agree with regard to avoiding long-running transactions, and I'm seeking to make sure that I can safely use Pg's advisory locking specifically to avoid having to either use a long-running transaction or tell a user something like "Your update could not be completed because somebody else has changed this record. The copy of the record on screen has been updated. Please check it, and if necessary re-apply any changes" because of a conflicting update detected by @Version-style optimistic locking.

> In Java, unlike a C++ destructor, the finalize() method is
> non-deterministic, in that it only gets run when the GC cleans up the
> object, which may be never. It is not a good idea to depend on it to
> clean up your resources. Additionally, the finalize method is only
> ever called once on an object, so if the object is GC'd and the
> finalize method is called and in the finalize method it resurrects
> itself, perhaps by passing a reference of itself to another object,
> the finalize method will not be called again for that particular
> object when it comes time again to dispose of it. Generally, you
> avoid the finalize method, as there are not many examples that you
> would depend on it for, other than for a last ditch best effort
> attempt to clean up any resources you may not have otherwise been able
> to. It is another coding style that is generally deemed an evil.

That much I figured out. [Insert pathetic wailing about the lack of RAII and lexically scoped object lifetimes here.] It pretty much confirms what I was thinking: use an explicit close/cleanup method that releases any external resources (like Pg advisory locks). Then, in finalize(), check to make sure the object has been closed, and yell loudly in the log if not before doing the cleanup, so that any missed closes can be found and fixed.

--
Craig Ringer
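The "explicit close, finalize() only as a leak detector" idea might look like the sketch below. LockHolder is a hypothetical wrapper, the advisory-unlock call is indicated only by a comment since it needs a live connection, and the static leak counter stands in for log output:

```java
// Explicit close() does the real cleanup; finalize() never relies on GC
// timing for correctness, it only complains (and does last-ditch cleanup)
// if some code path forgot to call close().
public class LockHolder implements AutoCloseable {
    private boolean closed = false;
    public static int leaksDetected = 0;   // stands in for a log warning

    public boolean isClosed() { return closed; }

    @Override
    public void close() {
        if (!closed) {
            closed = true;
            // here: SELECT pg_advisory_unlock(key) on the owning connection
        }
    }

    @Override
    protected void finalize() {
        if (!closed) {
            leaksDetected++;   // yell loudly in the log: a close was missed
            close();           // best-effort only; GC may never run this
        }
    }
}
```

With Java 7+ the try-with-resources statement makes the explicit close automatic along every exit path, which removes most of the need for the finalize() safety net in the first place.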
I must admit that I wasn't even aware of the JPA lock, and it does appear to be very restricted. Shows how much attention I've paid to the API. When I have had to use locking through Hibernate, I have usually bypassed Hibernate and passed the isolation level lock hint directly to the DB within the query, as that gives me finer control; but I have so far usually had the "luxury" of working on applications that are, and forever will be, tied to a particular DB - probably even fossilised to a particular version. But thanks for the link.

You have to consider, when using an ORM, why you would choose to do so - which is mainly to have it do much of the boilerplate coding for you. I don't buy into the argument that it gives you DB portability; so far, when using an ORM I have found myself more tightly bound to a DB dialect than if I'd gone through JDBC directly. Using an ORM comes at a cost, and I'm not just talking about performance overheads: it attempts to map object trees to relational data constructs, and there is a pronounced impedance mismatch between the two. If you start out with a fresh DB and can design it with an ORM client in mind - single-field surrogate keys on all tables, version fields on tables likely to have multiple users, only data types that are supported by the ORM and client language, and you don't mind feeling compromised when doing the DB design - then it fits beautifully.

However, and this is where I begin to speak some heresy... If you access the DB through SPs or functions, don't even bother considering an ORM. Treat the DB as a service API. If you have an existing DB that is not conducive to modification, that contains mostly natural keys, many of which are composite, then you can still use an ORM, but you will be wrestling with it all the way. If the DB tables have datatypes that are not standard in JDBC or the ORM, then you can make it work, but again you are potentially faced with additional effort above and beyond.
Mileage will vary from project to project, but ORMs are no panacea; they simply serve a purpose for some situations, and for others you need to weigh up whether the cost is worth the returns. So, back to your problem and off my soap box: perhaps an ORM is not necessarily the best fit for your solution?

Regardless of what locking mechanism you choose, optimistic locking or pseudo-locking, at some point one user will be faced with stale data and you still have to reconcile that data for the user. Any attempt to check whether the data has been changed by someone else before a commit, or to check whether another user also has the same set of data with the potential to modify it, without a transactional lock, would require some form of sentinel pattern and polling of that sentinel - whether that sentinel is the pg_advisory_lock, a flag in a dedicated table, or whatever else you choose. And you are still faced with how to handle data that has changed for the current user. Is there really a business case for setting up polling of your DB by all of your DB clients, just for some slight usability gains? Rhetorical question. For me to entertain such an approach, I would personally want a pretty strong case for it.

Considering your enthusiasm for RAII, I bet you will be pleased with the planned release of C++0x and its inclusion of smart pointers. Completely off topic, and at the risk of being chastised in this forum, you may be interested in this interview with Bjarne Stroustrup:

http://www.ddj.com/cpp/207000124;jsessionid=4D4OJVJ3EPMOCQSNDLPSKH0CJUNN2JVN?_requestid=437406

Sadly, according to Stroustrup, C++0x will most likely be C++10. Not that I have used a ~ for many years.

Cheers,
Andy
OK, I think I now see how to get the ORM layer's version-based optimistic locking to play well with existing apps using the DB, and how to effectively use Pg's advisory locking. Here's what I'm thinking, in case it's ever of use to anybody:

- Add version columns for ORM tools to use, setting defaults for apps unaware of the versioning so version columns get sensible initial values.

- Add a trigger that, on updates to tables with version columns, does:

    IF new.version = old.version AND new <> old THEN
        new.version = new.version + 1;
    END IF;

  ... to support other apps that don't know about the versioning and rely on transaction-level locking. The new <> old check is there in case an ORM tool assumes that if it UPDATEs a row without incrementing the version or changing anything (say, to exploit the side effect that the UPDATE obtains a lock on the row), the row version won't change. Without it, an ORM tool might try to update the row later and think it's conflicting with another update, even though the "conflicting" update was its own.

- Where the UI can be significantly improved by doing so, use advisory locking to supplement (not replace) the existing handling of conflicting updates, minimising the need for users to re-enter changes. Assume advisory locking is a best-effort service that may fail (due to my bad coding resulting in a lock not being taken, another app not knowing about the lock, the ORM layer switching connections behind my back, etc.) and always be prepared to handle update conflicts.

Does that sound sane to folks here? In particular, does the behaviour of the version column trigger sound sensible?

More comments interleaved below.

Andrew wrote:
> Using an ORM comes at a cost, and
> I'm not just talking about performance overheads, as it attempts to map
> object trees to relational data constructs and there is a pronounced
> impedance mismatch between the two.
> If you start out with a fresh DB
> and can design it with an ORM client in mind, single field surrogate
> keys on all tables, version fields on tables likely to have multiple
> users, choose data types that are only supported by the ORM and client
> language, don't mind feeling compromised when doing the DB design, then
> it fits beautifully.

I'm lucky enough that the DB design is already a reasonable fit for ORM use. The only real sticking point is the need for version columns, because ORM systems want to do cross-transaction locking/versioning. I'd probably land up with them eventually anyway.

Everything has a surrogate key, because almost all natural keys have issues in this DB. Everything's highly normalised. Most types are mapped to native Java types automatically by the PostgreSQL JDBC driver, and the few that aren't are just domains that're easily handled by the ORM once it's told what the base type is.

> However, and this is where I begin to speak some
> heresy... If you access the DB through SP's or functions, don't even
> bother considering an ORM. Treat the DB as a service API.

Sounds sensible to me. The ORM layers seem to have a limited facility for SPs that's good enough for in-DB queries that return normaltable%rowtype, but beyond that it doesn't look like much fun. I'd want to use something like Spring-JDBC instead.

> So back to your problem and off my soap box, perhaps an ORM is not
> necessarily the best fit for your solution?

For most of the job an ORM is actually a very good fit, at least conceptually. The amount of repetitive code required will be significantly reduced by the ability to map a DB record to a Java object, then bind members/properties of the object to GUI elements in a form. After all, significant parts of this app aren't much more sophisticated than what you can do with MS Access (if you don't mind the licensing fees, weird quirks, and total lack of portability).
The tricky bits will probably bypass the ORM and use direct SQL, but that'll touch largely separate parts of the DB, so there won't be too many ORM cache issues.

> Regardless of what locking mechanism you choose, optimistic locking or
> pseudo-locking, at some point one user will be faced with stale data and
> you still have to reconcile that data for the user. Any attempt to
> check if the data is changed by someone else before any commit or to
> check if another user also has the same set of data with the potential
> to modify without a transactional lock would require some form of
> sentinel pattern and polling of that sentinel, whether that sentinel is
> the pg_advisory_lock, a flag in a dedicated table, or whatever else you
> choose. And you are still faced with how to handle data that has changed
> for the current user. Is there really a business case to set up a
> polling of your DB by all of your DB clients, just for some slight
> usability gains?

First: thankfully, polling is not required. One of the nice things about Pg's advisory locks is the ability for the client to block on a lock. If the app wants to wait on an advisory lock, all it has to do is attempt to acquire it; the thread will block until a result is returned (i.e. the lock is obtained). If the user gets bored of waiting, the UI thread can just signal the blocked DB thread to abort the query and get on with something else.

Of course, even with advisory locking it's always *possible* that someone else might sneak in an update. They might not check the lock. It might be another app that doesn't know about the advisory locks. Etc. So it's always necessary to use optimistic or pessimistic transaction-level locking/versioning as well, to ensure there are no update collisions, and it's necessary to be able to handle failed updates.
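The block-and-cancel shape can be simulated without a database using java.util.concurrent primitives: lockInterruptibly() parks the worker thread the way executeQuery("SELECT pg_advisory_lock(...)") would, and interrupt() plays the role that a Statement.cancel() call from the UI thread would play (whether the driver supports cancellation is a separate question). A runnable sketch, with CancellableWait as a hypothetical helper:

```java
import java.util.concurrent.locks.ReentrantLock;

// Simulates "block on the advisory lock in a worker thread, let the user
// give up": the worker parks waiting for the lock; if it hasn't acquired
// the lock within patienceMillis, the waiting is cancelled.
public class CancellableWait {
    /** Returns true only if the lock was acquired within the patience window. */
    public static boolean tryAcquire(ReentrantLock lock, long patienceMillis) {
        final boolean[] got = { false };
        Thread worker = new Thread(() -> {
            try {
                lock.lockInterruptibly();   // parks like pg_advisory_lock()
                got[0] = true;
                lock.unlock();
            } catch (InterruptedException e) {
                // wait was cancelled; nothing acquired (cf. Statement.cancel())
            }
        });
        worker.start();
        try {
            worker.join(patienceMillis);    // the "user got bored" timeout
            if (worker.isAlive()) {
                worker.interrupt();         // cancel the blocked wait
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return got[0];
    }
}
```

The point of the sketch is the control flow, not the lock type: the UI thread never polls, it just bounds how long the worker is allowed to stay parked.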
However, I DO consider it *well* worth the small cost to improve the user "experience" by minimising the risk of failed updates in areas where they would directly affect the user.

> Considering your enthusiasm of RAII, I bet you will be pleased with the
> planned release of C++0x with the inclusion of smart pointers.

For a moment I thought you were going to say "In the next version of Java, there will be ..."

*sob*

I'm already using standard C++ shared_ptr through std::tr1. It's trivial to support std::tr1::shared_ptr for modern compilers and fall back to boost::shared_ptr for older ones. Often, though, I prefer to just follow strict ownership rules, use std::auto_ptr, etc.

--
Craig Ringer
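As a footnote to the trigger idea at the top of this mail, the version-bump rule can be restated as a runnable check (plain Java standing in for the PL/pgSQL, just to make the cases explicit; the method and class names are illustrative only):

```java
// Restates the trigger rule: bump the version only when the application
// left it unchanged AND the row actually changed. Version-unaware apps then
// still get correct optimistic-lock behaviour, while an ORM's own no-op
// UPDATEs (taken just for the row lock) don't look like foreign conflicts.
public class VersionTriggerRule {
    /** Returns the version the trigger would store for this UPDATE. */
    public static long newVersion(long oldVersion, long proposedVersion,
                                  boolean rowChanged) {
        if (proposedVersion == oldVersion && rowChanged) {
            return oldVersion + 1;    // app didn't bump it; do it for them
        }
        return proposedVersion;       // ORM bumped it itself, or no-op UPDATE
    }
}
```

The three cases of interest: a version-unaware app making a real change gets a bump, an ORM that already incremented the column is left alone, and a no-op UPDATE keeps its version so the ORM doesn't later mistake its own lock-taking UPDATE for a conflict.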
> -First: thankfully polling is not required. One of the nice things about pg's advisory locks is > the ability for the client to block on a lock. If the app wants to wait on an advisory lock all it > has to do is attempt to acquire the lock; the thread will block until a result is returned (ie the > lock is obtained). > If the user gets bored of waiting the UI thread can just signal the blocked DB thread to abort > the query and get on with something else. Is this mechanism supported by the PostgreSQL JDBC driver? Thanks Peter On Sun, Apr 6, 2008 at 10:10 PM, Craig Ringer <craig@postnewspapers.com.au> wrote: > OK, I think I now see how to get the ORM layer's version-based optimistic > locking to play well with existing apps using the DB, and how to effectively > use Pg's advisory locking. Here's what I'm thinking in case it's ever of use > to anybody: > > - Add version columns for ORM tools to use, setting defaults for apps > unaware of the versioning so version columns get sensible initial values. > > - Add a trigger that, on updates to tables with version columns, does: > > IF new.version = old.version AND new <> old THEN > new.version = new.version + 1; > END IF; > > ... to support other apps that don't know about the versioning and rely on > transaction-level locking. The new <> old check is added in case an ORM tool > assumes that if it UPDATEs a row without incrementing the version or > changing anything (say, to use the side effect that the UPDATE obtains a > lock on the row) then the row version won't change. Without that an ORM tool > might try to update the row later and think it's conflicting with another > update, even though the conflicting update was its own. > > - Where the UI can be significantly improved by doing so use advisory > locking to suppliment (not replace) the existing handling of conflicting > updates, minimising the need for users to re-enter changes. 
Assume advisory > locking is a best effort service that may fail (either due to my bad coding > resulting in a lock not being taken, another app not knowing about the lock, > the ORM layer switching connections behind my back, etc) and always be > prepared to handle update conflicts. > > Does that sound sane to folks here? In particular, does the behaviour of > the version column trigger sound sensible? > > More comments interleaved below. > > > Andrew wrote: > > > > Using an ORM comes at a cost, and I'm not just talking about performance > overheads, as it attempts to map object trees to relational data constructs > and there is a pronounced impedance mismatch between the two. If you start > out with a fresh DB and can design it with an ORM client in mind, single > field surrogate keys on all tables, version fields on tables likely to have > multiple users, choose data types that are only supported by the ORM and > client language, don't mind feeling compromised when doing the DB design, > then it fits beautifully. > > > > I'm lucky enough that the DB design is already a reasonable fit for ORM > use. The only real sticking point is the need for version columns because > ORM systems want to do cross-transaction locking/versioning . I'd probably > land up with them eventually anyway. > > Everything has a surrogate key, because almost all natural keys have issues > in this DB. Everything's highly normalised. Most types are mapped to native > Java types automatically by the PostgreSQL JDBC, and the few that aren't are > just domains that're easily handled by the ORM once it's told what the base > type is. > > > > > However, and this is where I begin to speak some heresy... If you access > the DB through SP's or functions, don't even bother considering an ORM. > Treat the DB as a service API. > > > > Sounds sensible to me. 
> The ORM layers seem to have a limited facility for SPs that's good enough for in-DB queries that return normaltable%rowtype, but beyond that it doesn't look like much fun. I'd want to use something like Spring-JDBC instead.
>
>> So back to your problem and off my soap box, perhaps an ORM is not necessarily the best fit for your solution?
>
> For most of the job an ORM is actually a very good fit, at least conceptually. The amount of repetitive code required will be significantly reduced by the ability to map a DB record to a Java object, then bind members/properties of the object to GUI elements in a form. After all, significant parts of this app aren't much more sophisticated than what you can do with MS Access (if you don't mind the licensing fees, weird quirks, and total lack of portability).
>
> The tricky bits will probably bypass the ORM and do direct SQL, but that'll be touching largely separate parts of the DB so there won't be too many ORM cache issues.
>
>> Regardless of what locking mechanism you choose, optimistic locking or pseudo-locking, at some point one user will be faced with stale data and you still have to reconcile that data for the user. Any attempt to check if the data is changed by someone else before any commit, or to check if another user also has the same set of data with the potential to modify without a transactional lock, would require some form of sentinel pattern and polling of that sentinel, whether that sentinel is the pg_advisory_lock, a flag in a dedicated table, or whatever else you choose. And you are still faced with how to handle data that has changed for the current user. Is there really a business case to set up a polling of your DB by all of your DB clients, just for some slight usability gains?
>
> First: thankfully polling is not required. One of the nice things about pg's advisory locks is the ability for the client to block on a lock.
> If the app wants to wait on an advisory lock all it has to do is attempt to acquire the lock; the thread will block until a result is returned (ie the lock is obtained).
>
> If the user gets bored of waiting the UI thread can just signal the blocked DB thread to abort the query and get on with something else.
>
> Of course, even with advisory locking it's always *possible* that someone else might sneak in an update. They might not check the lock. It might be another app that doesn't know about the advisory locks. Etc. So it's always necessary to use optimistic or pessimistic transaction-level locking/versioning as well, to ensure there are no update collisions, and it's necessary to be able to handle failed updates.
>
> However, I DO consider it *well* worth the small cost to improve the user "experience" by minimising the risk of failed updates in areas where it must directly affect the user.
>
>> Considering your enthusiasm for RAII, I bet you will be pleased with the planned release of C++0x with the inclusion of smart pointers.
>
> For a moment I thought you were going to say "In the next version of Java, there will be ..."
>
> *sob*
>
> I'm already using standard C++ shared_ptr through std::tr1. It's trivial to support std::tr1::shared_ptr for modern compilers and fall back to boost::shared_ptr for older compilers. However I often prefer to just follow strict ownership rules, use std::auto_ptr, etc.
>
> --
> Craig Ringer
>
> --
> Sent via pgsql-jdbc mailing list (pgsql-jdbc@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-jdbc
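The blocking acquisition Craig describes, and a non-blocking variant suited to the "tell the user someone else is editing" case, can be sketched with PostgreSQL's standard advisory lock functions (the lock key 12345 here is an arbitrary example):

```
-- Blocks until the lock is free; cancel the statement to give up waiting.
SELECT pg_advisory_lock(12345);

-- Non-blocking variant: returns true if acquired, false if another
-- session already holds the lock.
SELECT pg_try_advisory_lock(12345);

-- Release when done (session-level advisory locks are also released
-- automatically when the connection closes).
SELECT pg_advisory_unlock(12345);
```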
See responses below:

> - Add a trigger that, on updates to tables with version columns, does:
>
>     IF new.version = old.version AND new <> old THEN
>         new.version = new.version + 1;
>     END IF;
>
> ... to support other apps that don't know about the versioning and rely on transaction-level locking. The new <> old check is added in case an ORM tool assumes that if it UPDATEs a row without incrementing the version or changing anything (say, to use the side effect that the UPDATE obtains a lock on the row) then the row version won't change. Without that an ORM tool might try to update the row later and think it's conflicting with another update, even though the conflicting update was its own.

I've seen a number of blogs suggesting similar trigger-based approaches, so it will possibly work; I've just never gone down that path. You may want to also check Hibernate's documentation, which discourages changes to the version number outside of Hibernate:

http://www.hibernate.org/hib_docs/annotations/reference/en/html_single/#entity-mapping-entity-version

"The application must not alter the version number set up by Hibernate in any way. To artificially increase the version number, check in Hibernate Entity Manager's reference documentation LockMode.WRITE"

http://www.hibernate.org/hib_docs/entitymanager/reference/en/html_single/#d0e1162

"LockMode.WRITE prevents dirty-reads and non repeatable read on a given entity and force an increase of the version number if any."

> First: thankfully polling is not required. One of the nice things about pg's advisory locks is the ability for the client to block on a lock. If the app wants to wait on an advisory lock all it has to do is attempt to acquire the lock; the thread will block until a result is returned (ie the lock is obtained).
>
> If the user gets bored of waiting the UI thread can just signal the blocked DB thread to abort the query and get on with something else.
> Of course, even with advisory locking it's always *possible* that someone else might sneak in an update. They might not check the lock. It might be another app that doesn't know about the advisory locks. Etc. So it's always necessary to use optimistic or pessimistic transaction-level locking/versioning as well, to ensure there are no update collisions, and it's necessary to be able to handle failed updates.

So back to pessimistic locking, which I believe we both had agreed was not a desirable behaviour, particularly in a user interaction model? I apologise, but I feel that I am not following your logic, so you may have a valid approach, but I'm missing something and so to me it does not sound appropriate. I guess if you have multiple clients in multiple technologies directly accessing the DB, then your situation is immediately a lot more complex, and pessimistic locking may be your only solution to ensure ACID transactions. The ideal situation would be if the DB was only accessed through a single shared access layer by all clients using a multi-tiered decoupled design, though I acknowledge that is not always possible, particularly with many thick client architectures that are tightly coupled two-tiered designs at best.

I can understand how you would release the advisory lock by running another SQL statement in another thread to do so, but I do not know if that is of much use to you. As far as I know you can't tell a running JDBC query to abort or cancel a running query. Such calls, either at the JDBC level or at the JPA level, are blocking calls for the client, and about the only influence you have on them is by setting the query timeout before executing (which has its own set of issues in the postgresql driver: http://www.nabble.com/Implementing-setQueryTimeout()-td15537669.html). Of course the UI client can always manage calls to its access layer via another thread and kill that thread if the end user becomes impatient and does not want to wait.
But from my understanding, that query with the back end database process will still be running and the DB connection will still be in use, including all of the associated locks and DB threads. So a subsequent attempt on a long running query will also block. However, if the queries are all short running but are part of a long running transaction, then you can roll back the transaction at any point, but any other calls in a separate transaction dependent on those locks held by the running transaction will result in the aforementioned blocked call.

You have to remember that at the ORM level (which is just a wrapper to the JDBC), or at the JDBC level, you do not have fine-grained control of what is happening in the back end, and in my mind you should be looking at what JDBC provides, not what the underlying DB may be able to provide, and in using pg advisory locks you are mixing the two. And obviously I'm missing the main point of your argument, as I still do not know how using pg advisory locks mixed with transaction locks will improve your end users' experience.

Also I think I'm getting a little confused, as previously you had stated:

> At least in a desktop app it's sometimes desirable to inform a user that (for example) somebody else is currently modifying a customer record, so they can't edit it right now, then possibly offer to wait. In this environment that's significantly preferable in user interface terms to telling the user "Whoops, can't save your changes, somebody else beat you to it", reloading the data in the UI and getting them to make their changes again.

and then in the last mail the reason was:

> However, I DO consider it *well* worth the small cost to improve the user "experience" by minimising the risk of failed updates in areas where it must directly affect the user.
For me, your initial rationale can only be addressed with some form of polling on some lock or pseudo-lock, whether automated or manual, particularly if you give the user the option to wait. The second rationale is simply around ACID transactions, which a well designed architecture and optimistic locking with versioning can provide, or worst case a pessimistic locking strategy at the cost of scalability, but either way without a need for the pg advisory locks.

I'm sure you have valid scenarios in mind that justify your desired approach, otherwise you would not feel so strongly, but I am having some trouble working out such a scenario. So I think that may be my failing, and it is probably not important anyway from the perspective of responding to specific technical aspects of your question.

Cheers,
Andy
Peter Kovacs wrote:
>> First: thankfully polling is not required. One of the nice things about pg's advisory locks is the ability for the client to block on a lock. If the app wants to wait on an advisory lock all it has to do is attempt to acquire the lock; the thread will block until a result is returned (ie the lock is obtained).
>>
>> If the user gets bored of waiting the UI thread can just signal the blocked DB thread to abort the query and get on with something else.
>
> Is this mechanism supported by the PostgreSQL JDBC driver?

Good question. The answer is yes, as shown by the attached test class, but it's definitely a good idea to test rather than assume these things.

The test should be run while some other job, like a psql interactive session, holds pg_advisory_lock(1). You'd need to update the connection string to use it. Cancelling the query works fine, at least on 8.3 with the JDBC4 8.3 driver. The JDBC driver handles the cancel() call by establishing a new connection and issuing a cancel query request with it.

Of course, I can't guarantee that the JDBC driver isn't polling internally, but it's not hitting the database to do it so for my use I don't much care. According to Wireshark there's no traffic on port 5432 during the 10 second wait interval. I'd be surprised if it wasn't actually blocked on the network read, though I haven't gone digging to verify this.
--
Craig Ringer

import java.sql.SQLException;
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;

public class JdbcCancelTest {

    Connection c;
    CallableStatement st;

    static final String PG_ERR_QUERY_CANCELLED = "57014";

    class Waiter extends Thread {
        @Override
        public void run() {
            try {
                st = c.prepareCall("SELECT pg_advisory_lock(1);");
                System.out.println("[Waiter] Sending query...");
                st.execute();
                System.out.println("[Waiter] Query finished OK");
            } catch (SQLException ex) {
                if (ex.getSQLState().equals(PG_ERR_QUERY_CANCELLED)) {
                    System.out.println("[Waiter] Query cancelled successfully");
                } else {
                    System.out.println("[Waiter] DB error string: " + ex.getMessage()
                            + " (SQLState: " + ex.getSQLState() + ")");
                }
            }
        }
    }

    class Killer extends Thread {
        @Override
        public void run() {
            try {
                System.out.println("[Killer] Cancelling query...");
                st.cancel();
                System.out.println("[Killer] Query cancel sent");
            } catch (SQLException ex) {
                System.out.println("[Killer] Unexpected exception: " + ex);
            }
        }
    }

    void runtest() throws InterruptedException {
        try {
            c = DriverManager.getConnection(
                    "jdbc:postgresql://localhost:5432/DBNAME?user=USERNAME&password=PASSWORD");
            Waiter w = new Waiter();
            Killer k = new Killer();
            w.start();
            Thread.sleep(10000);
            k.start();
        } catch (SQLException ex) {
            System.out.println(ex);
        }
    }

    public static void main(String[] args) throws ClassNotFoundException, InterruptedException {
        Class.forName("org.postgresql.Driver");
        JdbcCancelTest t = new JdbcCancelTest();
        t.runtest();
    }
}
I was overlooking the cancellation of the query on a separate thread with the JDBC cancel() method, though that method always seems to have issues associated with it across various drivers. But still, for what you are talking about, I think a long running transaction is still going to be your biggest problem rather than a long running query.

> I can understand how you would release the advisory lock by running another SQL statement in another thread to do so, but I do not know if that is of much use to you. As far as I know you can't tell a running JDBC query to abort or cancel a running query. Such calls, either at the JDBC level or at the JPA level, are blocking calls for the client and about the only influence you have on them is by setting the query timeout before executing (which has its own set of issues in the postgresql driver: http://www.nabble.com/Implementing-setQueryTimeout()-td15537669.html). Of course the UI client can always manage calls to its access layer via another thread and kill that thread if the end user becomes impatient and does not want to wait. But from my understanding, that query with the back end database process will still be running and the DB connection will still be in use, including all of the associated locks and DB threads. So a subsequent attempt on a long running query will also block. However, if the queries are all short running but are part of a long running transaction, then you can roll back the transaction at any point, but any other calls in a separate transaction dependent on those locks held by the running transaction will result in the aforementioned blocked call.
> You have to remember that at the ORM level (which is just a wrapper to the JDBC), or at the JDBC level, you do not have fine-grained control of what is happening in the back end, and in my mind you should be looking at what JDBC provides, not what the underlying DB may be able to provide, and in using pg advisory locks you are mixing the two.

Cheers,
Andy
I'm sorry if I've been communicating all this poorly. Even if I've been muddling you and convincing you that I'm quite crazy, it's been a helpful conversation for me ;-) . I hope my explanations below make a bit more sense than the prior ones.

Andrew wrote:
>> Of course, even with advisory locking it's always *possible* that someone else might sneak in an update. They might not check the lock. It might be another app that doesn't know about the advisory locks. Etc. So it's always necessary to use optimistic or pessimistic transaction-level locking/versioning as well, to ensure there are no update collisions, and it's necessary to be able to handle failed updates.
>
> So back to pessimistic locking, which I believe we both had agreed was not a desirable behaviour, particularly in a user interaction model?

The reference to pessimistic locking was a bit of a red herring. What I meant was simply "If advisory locking is being used, it is still necessary to rely on normal locking methods to ensure that conflicting updates are detected or prevented".

In other words, advisory locking in this situation is used purely to help the user out, giving them advance warning about potential concurrent updates and letting them avoid wasting time and effort. It does not replace normal methods for ensuring data consistency and integrity; it interacts with optimistic locking solely in that it's intended to reduce or eliminate user-visible conflicts. Advisory locking can be used to detect and avoid situations that would otherwise result in an optimistic locking conflict requiring user action like data re-entry. It can't (or IMO shouldn't) replace optimistic locking.

An advisory lock allows information about object "busy-ness" to be communicated among clients outside a transactional scope. This means that conflicts can be detected and prevented before they happen, instead of waiting until all the work is done and catching an optimistic locking conflict.
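As a rough in-memory analogy (plain Java, no database involved; the names here are made up for illustration), the optimistic-locking version check that advisory locks supplement boils down to a compare-and-increment on the version column:

```java
// In-memory sketch of version-column optimistic locking. In the real system
// the check happens atomically in the database (e.g. in the UPDATE's WHERE
// clause, or via the ORM's version handling); this only illustrates the rule.
class VersionCheck {

    static final class Row {
        int oplock;       // version column, as in the trigger discussion
        String notes;
        Row(int oplock, String notes) { this.oplock = oplock; this.notes = notes; }
    }

    // Apply an update only if the row still has the version we read earlier.
    // Returns false on conflict: someone else updated the row since our read,
    // and the caller must reload and possibly ask the user to redo their work.
    static boolean tryUpdate(Row row, int versionWeRead, String newNotes) {
        if (row.oplock != versionWeRead) {
            return false;
        }
        row.notes = newNotes;
        row.oplock = row.oplock + 1;   // what the ORM or the trigger does
        return true;
    }

    public static void main(String[] args) {
        Row shared = new Row(0, "initial");
        System.out.println(tryUpdate(shared, 0, "first edit"));  // applied
        System.out.println(tryUpdate(shared, 0, "stale edit"));  // conflict detected
    }
}
```

Advisory locking, in this picture, is about warning the second editor before they start typing, rather than letting the update fail after all the work is done.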
Often we do not care about a conflict, as the operation can be re-done behind the user's back without them knowing or caring. Incrementing some value, for example. In other cases, there might be time-consuming or difficult-to-repeat changes being made that the program cannot simply retry without asking the user how to handle it. Changing a notes field on a customer record, for example. In those cases, it's much nicer to catch the conflict BEFORE the user does all the work. That's where advisory locking is useful. (Yes, I know a notes/history field can just be broken down into multiple entries in a separate table. It's an *example*, because the main areas of concern in the real system would require way too much background.)

In a system designed for advanced users I'd quite possibly just let both users proceed and tell the user who lost the race to redo their changes and try again. I might even provide a "merge/diff" style summary of the conflicts between their changes and the winning change. This is, however, not a system for advanced users, and for them it's a *lot* nicer to be told somebody else is editing something complicated and be able to wait.

I wouldn't expose a normal user working with a content management system to a merge conflict, source revision control system style. I don't know what your users are like, but mine have a hard time saving email attachments, let alone interpreting change conflict summaries. I'd prefer to tell them "This item is being edited by another user. You can try again later, or wait for them to finish." The situations I'm facing in this application are similar to that problem, with similar UI implications.

Additionally, in this situation almost all updates to the areas where advisory locking will be useful will be done through the tool I'm now working on, which will honour the advisory locks.
There will rarely, if ever, be updates to this data from DB clients that don't know/care about the advisory lock, so users will rarely if ever get an error due to a conflicting update if I use advisory locking to help avoid it happening.

The advisory locking can even be done from a separate JDBC connection, avoiding any interaction at all with the ORM tools. It's essentially the same as using a middle layer solely for inter-client messaging, but without the need for the middle layer. That's one less thing to break, to my mind.

In most areas advisory locking is neither necessary nor useful. It's only in some particular situations, similar to the examples I've given, that it's likely to be of use to improve the way multi-user interaction works. In all other areas using optimistic locking alone should be fine.

As for mixed access by the existing apps and the new tools, I'm now pretty sure the trigger-based versioning approach will be fine. I'm going to do a fair bit of testing of course, but it all looks quite reasonable. It'll just ensure that existing apps that aren't aware of the row versioning still increment the version, and thus play well with the ORM tools. The trigger will never change the effect of a statement run by the ORM tools (because they always increment the version themselves), and is invisible to them. I'll drop a short note here when that testing is done, as I imagine it might be of interest to others.

> The ideal situation would be if the DB was only accessed through a single shared access layer by all clients using a multi-tiered decoupled design, though I acknowledge that is not always possible, particularly with many thick client architectures that are tightly coupled two-tiered designs at best.

I see a lot of this, and I'm still not sure I understand the thinking behind it. The DB is a shared access system. It's designed to handle concurrency, and does a very good job of it.
Adding another layer to hide the DB seems to be adding unnecessary complexity, unless it can provide something extra like preventing or warning about concurrent changes to objects outside the scope of DB transactions. However, that's handled just as well by row versioning without adding another layer to the system. Unless the additional layer implements the "business logic", using the DB as a dumb storage engine, I just can't see much of a rationale for its existence.

The proper division of work between app and DB is of course a long-standing philosophical argument that won't be resolved in a hurry. Ditto the utility/necessity of "n-tier" application design. All I can say is that in this specific case the DB already handles the business logic, so the midlayer would be superfluous. Furthermore, the additional layer will either be tied to a particular technology (eg Java RPC) or rely on clumsy schemes like XML-RPC for communication. I can see the point for really big and complex systems that mix many different data stores. For smaller stuff where the single DB is already the center of the system and doing a good job there, I don't see the point. It strikes me as being "Enterprise-y" for the sake of it.

> I can understand how you would release the advisory lock by running another SQL statement in another thread to do so, but I do not know if that is of much use to you.

I assume you meant cancelling an attempted acquisition of the advisory lock, rather than releasing the lock? Releasing it so it can be acquired by another connection is just another ordinary SELECT pg_advisory_unlock(blah). Or closing the JDBC connection, of course.

If there's an advisory lock acquisition attempt blocked, and the user wants to give up, I can just cancel the statement trying to take out the advisory lock using the JDBC Statement.cancel() method - see the attached code in my other reply.
That's something that's easily triggered by user interaction in the UI thread, and involves very little fuss. It might not work well with some other JDBC drivers - I've seen reports of cancel() not doing anything with MySQL's JDBC driver, for example. For this app that's a non-issue, though; it's tied to Pg anyway.

> But from my understanding, that query with the back end database process will still be running and the DB connection will still be in use, including all of the associated locks and DB threads. So a subsequent attempt on a long running query will also block.

Not if you use the JDBC to cancel the query. There's no guarantee that a big, complex query will be terminated immediately by the backend, but for something simple like an attempt to acquire an advisory lock, cancelling is immediate and quite clean, at least according to my testing.

> You have to remember that at the ORM level (which is just a wrapper to the JDBC), or at the JDBC level, you do not have fine-grained control of what is happening in the back end, and in my mind you should be looking at what JDBC provides, not what the underlying DB may be able to provide, and in using pg advisory locks, you are mixing the two.

It's true that using advisory locking would involve mixing two locking models. That could get very ugly if they had to interact in any sort of complex way - say, if there were stored procedures or triggers that used both advisory locking and the ordinary transaction-level locking done by Pg. To me, that says "avoid mixing advisory locking operations in with other operations that might take normal transaction-level locks" rather than "don't use advisory locking".

In this case the advisory locking can even be done from a separate JDBC connection, avoiding any interaction at all with the ORM tools. Locks can be taken out, tested, and released only before any work is done or after all work is committed.
It's essentially the same as using a middle layer solely for inter-client messaging, but without the need for the middle layer.

--
Craig Ringer
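The separate-connection pattern Craig describes might look like the following sketch on the dedicated lock connection (the key 12345 is an arbitrary example; a real application would presumably derive it from something like the table and row id):

```
-- On the dedicated advisory-lock connection, before opening the edit form:
SELECT pg_try_advisory_lock(12345);
-- false => tell the user the record is being edited; optionally offer to
-- wait by issuing a blocking pg_advisory_lock(12345) that can be cancelled.

-- ... the user edits; the ORM's own connection does the transactional
-- work and commits, protected as usual by the version column ...

-- After the commit (or if the user abandons the edit):
SELECT pg_advisory_unlock(12345);
```

Keeping the lock on its own connection avoids any interaction with the ORM's connection handling, at the cost of one extra connection per client.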
> I'll drop a short note here when that testing is done, as I imagine it
> might be of interest to others.

I've been unable to uncover any issues with concurrent updates being made by both Hibernate and plain ol' psql to the following schema:

CREATE OR REPLACE FUNCTION zz_field_version() RETURNS trigger AS $$
BEGIN
    IF tg_op = 'UPDATE'
       AND new.oplock = old.oplock
       AND ROW(new.*) IS DISTINCT FROM ROW(old.*) THEN
        -- Row is being updated by an application that does not know
        -- about row versioning. It's changed data in the row, but
        -- hasn't incremented the version. We'll do that for it.
        new.oplock := new.oplock + 1;
    END IF;
    RETURN new;
END;
$$ LANGUAGE 'plpgsql';

COMMENT ON FUNCTION zz_field_version() IS
'Increments the record version if a row is changed by an update and its version was not incremented by the UPDATE issuer. Intended to allow ORM optimistic locking to coexist with normal DB transactional locking. Only suitable for use as a trigger on UPDATE.';

CREATE TABLE ver (
    id SERIAL PRIMARY KEY,
    counter INTEGER NOT NULL,
    oplock INTEGER NOT NULL DEFAULT(0)
);

COMMENT ON TABLE ver IS 'Test table for versioning';

CREATE TRIGGER ver_ensure_version_updated
BEFORE UPDATE ON ver
FOR EACH ROW EXECUTE PROCEDURE zz_field_version();

Two concurrent updaters that ignore the oplock column completely, like two copies of:

#!/bin/bash
for n in `seq 1 1000`; do
    psql craig -c 'update ver set counter = counter + 1'
done

can run quite happily along with an instance or two of the Hibernate test case.
The Hibernate test case, without all the tracing code and more detailed error handling, boils down to:

// Assume a pre-created record with id 0 and counter 0, an appropriately
// mapped class Ver with an @Version annotation on the `oplock' field, and
// a set-up EntityManager instance `em':
int ver_id = 0;
int successfulUpdates = 0;
while (successfulUpdates < 1000) {
    EntityTransaction tx = em.getTransaction();
    try {
        tx.begin();
        Ver v = em.find(Ver.class, ver_id);
        v.setCounter(v.getCounter() + 1);
        tx.commit();
        successfulUpdates++;
    } catch (javax.persistence.RollbackException exc) {
        // Version conflict detected; loop around and retry the update
    }
}

The value of `counter' for the test record in `ver' always ends up as the total number of updates done by psql plus `successfulUpdates' times the number of Hibernate test instances. So, if there are 2 shell updaters and 2 Hibernate updaters running, the counter always increases by exactly 4000. In other words, no updates are getting lost.

If a counter is added to report retried updates in the Hibernate tests, it becomes clear that there *are* conflicts being detected and retries being performed, even if only one Hibernate test is running along with one or more psql loop tests. The trigger is doing its job.

A version that explicitly increments the version with em.lock(v, LockMode.WRITE) and restarts the transaction if lock acquisition fails also works fine when several copies are run along with several looping manual updates.

That's hardly conclusive proof that nothing can go wrong, but it seems to be pretty solid, and I can't *see* any way it can break. Well, the trigger needs to handle version overflow, but other than that fairly minor issue it looks OK. Think I'm missing anything?

Being able to use a trigger to let existing apps coexist with the ORM row versioning will make things a *lot* easier. Doubly so if I can get rid of the relatively expensive row comparison, but I'm not sure that's safe with Hibernate-specific lock modes like LockMode.UPGRADE yet.

--
Craig Ringer