Re: Hibernate / other ORM and pg_advisory_lock - Mailing list pgsql-jdbc

From Craig Ringer
Subject Re: Hibernate / other ORM and pg_advisory_lock
Date
Msg-id 47F9D312.20200@postnewspapers.com.au
Whole thread Raw
In response to Re: Hibernate / other ORM and pg_advisory_lock  (Andrew <archa@pacific.net.au>)
Responses Re: Hibernate / other ORM and pg_advisory_lock  (Craig Ringer <craig@postnewspapers.com.au>)
List pgsql-jdbc
I'm sorry if I've been communicating all this poorly. Even if I've been
muddling you and convincing you that I'm quite crazy, it's been a
helpful conversation for me ;-) . I hope my explanations below make a
bit more sense than the prior ones.

Andrew wrote:

>> Of course, even with advisory locking it's always *possible* that
>> someone else might sneak in an update. They might not check the lock.
>> It might be another app that doesn't know about the advisory locks.
>> Etc. So it's always necessary to use opportunistic or pessimistic
>> transaction level locking/versioning as well, to ensure there are no
>> update collisions, and it's necessary to be able to handle failed
>> updates.
>
> So back to pessimistic locking which I believe we both had agreed was
> not a desirable behaviour, particularly in a user interaction model?

The reference to pessimistic locking was a bit of a red herring. What I
meant was simply "If advisory locking is being used, it is still
necessary to rely on normal locking methods to ensure that conflicting
updates are detected or prevented".

In other words, advisory locking in this situation is used purely to
help the user out, giving them advance warning about potential
concurrent updates and letting them avoid wasting time and effort. It
does not replace normal methods for ensuring data consistency and
integrity, though it does interact with optimistic locking solely in
that it's intended to reduce or eliminate user-visible conflicts.

Advisory locking can be used to detect and avoid situations that would
otherwise result in an optimistic locking conflict requiring user action
like data reentry. It can't (or IMO shouldn't) replace optimistic locking.

An advisory lock allows information about object "busy-ness" to be
communicated among clients outside a transactional scope. This means
that conflicts can be detected and prevented before they happen, instead
of waiting until all the work is done and catching an optimistic locking
conflict.

Often we do not care about a conflict, as the operation can be re-done
behind the user's back without them knowing or caring. Incrementing some
value, for example.

In other cases, there might be time consuming or difficult-to-repeat
changes being made that the program cannot simply retry without asking
the user how to handle it. Changing a notes field on a customer record,
for example. In those cases, it's much nicer to catch the conflict
BEFORE the user does all the work. That's where advisory locking is useful.

(Yes, I know a notes/history field can just be broken down into multiple
entries in a separate table. It's an *example* because the main areas of
concern in the real system would require way too much background).

In a system designed for advanced users I'd quite possibly just let both
users proceed and tell the user who lost the race to redo their changes
and try again. I might even provide a "merge/diff" style summary of the
conflicts between their changes and the winning change. This is,
however, not a system for advanced users, and for them it's a *lot*
nicer to be told somebody else is editing something complicated and be
able to wait.

I wouldn't expose a normal user working with a content management system
to a merge conflict, source revision control system style. I don't know
what your users are like, but mine have a hard time saving email
attachments let alone interpreting change conflict summaries. I'd prefer
to tell them "This item is being edited by another user. You can try
again later, or wait for them to finish.". The situations I'm facing in
this applicatoin are similar to that problem, with similar UI implications.

Additionally, in this situation almost all updates to the areas where
advisory locking will be useful will be done through the tool I'm now
working on, which will honour the advisory locks. There will rarely, if
ever, be updates to this data from DB clients that don't know/care about
the advisory lock, so users will rarely if ever get an error due to a
conflicting update if I use advisory locking to help avoid it happening.

The advisory locking can even be done from a separate JDBC connection,
avoiding any interaction at all with the ORM tools. It's essentially the
same as using a middle layer solely for inter-client messaging, but
without the need for the middle layer. That's one less thing to break,
to my mind.

In most areas advisory locking is neither necessary nor useful. It's
only some particular situations, similar to the examples I've given,
that it's likely to be of use to improve the way multi-user interaction
works. In all other areas using optimistic locking alone should be fine.

As for mixed access by the existing apps and the new tools, I'm now
pretty sure the trigger-based versioning approach will be fine. I'm
going to do a fair bit of testing of course, but it all looks quite
reasonable. It'll just ensure that existing apps that aren't aware of
the row versioning still increment the version, and thus play well with
the ORM tools. The trigger will never change the effect of a statement
run by the ORM tools (because they always increment the version
themselves), and is invisible to them.

I'll drop a short note here when that testing is done, as I imagine it
might be of interest to others.

> The ideal situation would be if
> the DB was only accessed through a single shared access layer by all
> clients using a multi-tiered decoupled design, though I acknowledge that
> is not always possible, particularly with many thick client
> architectures that are tightly coupled two tiered designs at best.

I see a lot of this, and I'm still not sure I understand the thinking
behind it.

The DB is a shared access system. It's designed to handle concurrency,
and does a very good job of it. Adding another layer to hide the DB
seems to be adding unnecessary complexity unless it can provide
something extra like preventing / warning about concurrent changes to
objects outside the scope of DB transactions. However, that's handled
just as well by row versioning without adding another layer to the system.

Unless the additional layer implements the "business logic", using the
DB as a dumb storage engine, I just can't see much of a rationale for
its existence. The proper division of work between app and DB is of
course along standing philosophical argument that won't be resolved in a
hurry. Ditto the utility/necessity of "n-tier" application design. All I
can say is that in this specific case the DB already handles the
business logic, so the midlayer would be superfluous.

Furthermore, the additional layer will either be tied to a particular
technology (eg Java RPC) or rely on clumsy schemes like XML-RPC for
communication.

I can see the point for really big and complex systems that mix many
different data stores. For smaller stuff where the single DB is already
the center of the system and doing a good job there I don't see the
point. It strikes me as being "Enterprise-y" for the sake of it.

> I can understand how you would release the advisory lock by running
> another SQL statement in another thread to do so, but I do not know if
> that is of much use to you.

I assume you meant cancelling an attempted acquisition of the advisory
lock, rather than releasing the lock? Releasing it so it can be acquired
by another connection is just another ordinary SELECT
pg_advisory_unlock(blah). Or closing the JDBC connection, of course.

If there's an advisory lock acquisition attempt blocked, and the user
wants to give up, I can just cancel the statement trying to take out the
advisory lock using the JDBC Statement.cancel() method - see the
attached code in my other reply. That's something that's easily
triggered by user interaction in the UI thread, and involves very little
fuss.

It might not work well with some other JDBC drivers. I've seen reports
of cancel() not doing anything with MySQL's JDBC for example. For this
app that's a non-issue though, it's tied to Pg anyway.

> But from my understanding, that query with the
> back end database process will still be running and the DB connection
> will still be in use, including all of the associated locks and DB
> threads.  So a subsequent attempt on a long running query will also
> block.

Not if you use the JDBC to cancel the query. There's no guarantee that a
big, complex query will be terminated immediately by the backend, but
for something simple like an attempt to acquire an advisory lock
cancelling is immediate and quite clean, at least according to my testing.

> You have to remember that at the ORM level (which is just
> a wrapper to the JDBC), or at the JDBC level, you do not have fine grain
> control of what is happening in the back end, and in my mind you should
> be looking at what JDBC provides, not what the underlying DB may be able
> to provide, and in using pg advisory locks, you are mixing the two.

It's true that using advisory locking would involve mixing two locking
models. That could get very ugly if they had to interact in any sort of
complex way - say, if there were stored procedures or triggers that used
advisory locking and the ordinary transaction-level locking done by Pg.

To me, that says "avoid mixing advisory locking operations in with other
operations that might take normal transaction-level locks" rather than
"don't use advisory locking".

In this case the advisory locking can even be done from a separate JDBC
connection, avoiding any interaction at all with the ORM tools. Locks
can be taken out, tested, and released only before any work is done or
after all work is committed. It's essentially the same as using a middle
layer solely for inter-client messaging, but without the need for the
middle layer.

--
Craig Ringer

pgsql-jdbc by date:

Previous
From: Andrew
Date:
Subject: Re: Hibernate / other ORM and pg_advisory_lock
Next
From: Craig Ringer
Date:
Subject: Re: Hibernate / other ORM and pg_advisory_lock