Thread: outer joins and for update

outer joins and for update

From

Gavin Sherry

Date:

14 November 2005, 08:24:33

Hi all,

A colleague pointed out to me today that the following is actually
possible on Oracle, MySQL, et al:

template1=# create table a (i int);
CREATE TABLE
template1=# create table b (i int);
CREATE TABLE
template1=# insert into a values(1);
INSERT 0 1
template1=# select * from a left outer join b on (a.i=b.i);i | i
---+---1 |
(1 row)

template1=# select * from a left outer join b on (a.i=b.i) for update of
b;
ERROR:  SELECT FOR UPDATE/SHARE cannot be applied to the nullable side of
an outer join

The comment in initplan.c around line 325 is:
       /*        * Presently the executor cannot support FOR UPDATE/SHARE marking of        * rels appearing on the
nullableside of an outer join. (It's        * somewhat unclear what that would mean, anyway: what should we        *
markwhen a result row is generated from no element of the        * nullable relation?)  So, complain if target rel is
FORUPDATE/SHARE.        * It's sufficient to make this check once per rel, so do it only        * if rel wasn't already
knownnullable.        */
 

As I said, it seems that this is actually possible on other databases.
(MySQL might not be the best example: they seem to take a write lock on
the tables, not a row lock -- tested with Innodb [MyISAM silently ignores
the lock instructions]).

I looked to the spec for instruction on this matter and could find
nothing.

I think we could, in fact, lock rows on the nullable side of the join if
we say that locking the NULL rows is not necessary. The rows do not
physical exist and I could see an argument which says that those rows do
not match any other rows which a concurrent transactions if attempting to
modify -- since they don't exist.

Does anyone have any thoughts on this matter?

Thanks,

Gavin

Re: outer joins and for update

From

Tom Lane

Date:

14 November 2005, 10:24:59

Gavin Sherry <swm@linuxworld.com.au> writes:
> I think we could, in fact, lock rows on the nullable side of the join if
> we say that locking the NULL rows is not necessary. The rows do not
> physical exist and I could see an argument which says that those rows do
> not match any other rows which a concurrent transactions if attempting to
> modify -- since they don't exist.

The point of the comment really is that this is a predicate locking
problem.  I should think that a minimum expectation of SELECT FOR UPDATE
is that you have exclusive hold on the selected rows and they won't
change underneath you before the end of your transaction.  In the case
of an outer join where the left-side row joined to nothing on the
right-side, we can't guarantee that: repeating the SELECT might find a
matching right-side row, thereby changing the allegedly-locked join row.
To guarantee a stable view of the data, we'd need a predicate lock that
prevents a matching right-side row from being inserted.

The fact that MySQL doesn't care about consistency or sane semantics is
no news, of course, but I'm slightly more interested by your claim that
Oracle allows this.  What do they do about the locking issue?
        regards, tom lane

Re: outer joins and for update

From

Gavin Sherry

Date:

14 November 2005, 11:22:25

On Mon, 14 Nov 2005, Tom Lane wrote:

> Gavin Sherry <swm@linuxworld.com.au> writes:
> > I think we could, in fact, lock rows on the nullable side of the join if
> > we say that locking the NULL rows is not necessary. The rows do not
> > physical exist and I could see an argument which says that those rows do
> > not match any other rows which a concurrent transactions if attempting to
> > modify -- since they don't exist.
>
> The point of the comment really is that this is a predicate locking
> problem.  I should think that a minimum expectation of SELECT FOR UPDATE

I thought you might say that. I'm yet to do much reading on predicate
locking -- do you think it is an area we will even pursue?

> is that you have exclusive hold on the selected rows and they won't
> change underneath you before the end of your transaction.  In the case
> of an outer join where the left-side row joined to nothing on the
> right-side, we can't guarantee that: repeating the SELECT might find a
> matching right-side row, thereby changing the allegedly-locked join row.
> To guarantee a stable view of the data, we'd need a predicate lock that
> prevents a matching right-side row from being inserted.

Well.... we can guarantee that we wont see rows added by concurrent
transactions if we're in serializable isolation level :-).

>
> The fact that MySQL doesn't care about consistency or sane semantics is
> no news, of course, but I'm slightly more interested by your claim that
> Oracle allows this.  What do they do about the locking issue?

I wont be able to actually test to see what they do until Thursday at the
earliest. Their manual offers no detail.

Gavin

Re: outer joins and for update

From

Tom Lane

Date:

14 November 2005, 11:32:55

Gavin Sherry <swm@linuxworld.com.au> writes:
> On Mon, 14 Nov 2005, Tom Lane wrote:
>> The point of the comment really is that this is a predicate locking
>> problem.

> I thought you might say that. I'm yet to do much reading on predicate
> locking -- do you think it is an area we will even pursue?

Don't hold your breath ;-) ... AFAICS it's a hard problem and would have
horrid repercussions for performance.
        regards, tom lane

Re: outer joins and for update

From

"Jim C. Nasby"

Date:

14 November 2005, 19:41:58

On Tue, Nov 15, 2005 at 02:22:15AM +1100, Gavin Sherry wrote:
> On Mon, 14 Nov 2005, Tom Lane wrote:
> > is that you have exclusive hold on the selected rows and they won't
> > change underneath you before the end of your transaction.  In the case
> > of an outer join where the left-side row joined to nothing on the
> > right-side, we can't guarantee that: repeating the SELECT might find a
> > matching right-side row, thereby changing the allegedly-locked join row.
> > To guarantee a stable view of the data, we'd need a predicate lock that
> > prevents a matching right-side row from being inserted.
> 
> Well.... we can guarantee that we wont see rows added by concurrent
> transactions if we're in serializable isolation level :-).

Do we really need to prevent inserts from happening under a SELECT FOR
UPDATE? ISTM that's trying to apply serializable concurrency to SELECT
FOR UPDATE even if it's running in a read committed transaction. In the
single table case we don't prevent someone from inserting a value...

# session 1
decibel=# insert into t values('1');
INSERT 633175 1

# session 2
decibel=# begin;
BEGIN
decibel=# select * from t where t='1' for update;t 
---1
(1 row)

# session 1
decibel=# insert into t values('1');
INSERT 633176 1
decibel=# select * from t;t 
---11
(2 rows)

decibel=# update t set t='2';
# Blocks on session 2

Am I missing something here?
-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: outer joins and for update

From

Tom Lane

Date:

14 November 2005, 20:38:04

"Jim C. Nasby" <jnasby@pervasive.com> writes:
> Do we really need to prevent inserts from happening under a SELECT FOR
> UPDATE? ISTM that's trying to apply serializable concurrency to SELECT
> FOR UPDATE even if it's running in a read committed transaction. In the
> single table case we don't prevent someone from inserting a value...

You're missing the point entirely, Jim.  In the first place, SELECT FOR
UPDATE has little or nothing to do with serializable mode: it's
guaranteed to lock and return the latest committed version of the row.
In the second case, inserting additional tuples does not invalidate your
lock on the tuples you selected to begin with.  SELECT FOR UPDATE
doesn't try to guarantee that if you were to select again with the same
WHERE condition, there might not be more rows matching the same
condition.  It does try to guarantee that the rows you selected before
are still there and unchanged.

In the case being discussed here, you're trying to lock rows of an
outer-join.  IMHO, if that means anything at all, it means that if
you read those rows again they will still look the same.  Having the
righthand side go from NULL to not-NULL does not qualify as "looking the
same" in my book.

Perhaps this could be clarified if someone has an actual use case of
wanting to SELECT FOR UPDATE from an outer join, and can explain what
semantics they think they need for that.
        regards, tom lane

Re: outer joins and for update

From

"Jim C. Nasby"

Date:

15 November 2005, 12:55:41

On Mon, Nov 14, 2005 at 07:38:00PM -0500, Tom Lane wrote:
> "Jim C. Nasby" <jnasby@pervasive.com> writes:
> > Do we really need to prevent inserts from happening under a SELECT FOR
> > UPDATE? ISTM that's trying to apply serializable concurrency to SELECT
> > FOR UPDATE even if it's running in a read committed transaction. In the
> > single table case we don't prevent someone from inserting a value...
> 
> You're missing the point entirely, Jim.  In the first place, SELECT FOR
> UPDATE has little or nothing to do with serializable mode: it's
> guaranteed to lock and return the latest committed version of the row.
> In the second case, inserting additional tuples does not invalidate your
> lock on the tuples you selected to begin with.  SELECT FOR UPDATE
> doesn't try to guarantee that if you were to select again with the same
> WHERE condition, there might not be more rows matching the same
> condition.  It does try to guarantee that the rows you selected before
> are still there and unchanged.
> 
> In the case being discussed here, you're trying to lock rows of an
> outer-join.  IMHO, if that means anything at all, it means that if
> you read those rows again they will still look the same.  Having the
> righthand side go from NULL to not-NULL does not qualify as "looking the
> same" in my book.

Another way to look at it is that it's not locking rows in a SELECT
clause, it's locking rows in tables. In fact, that's at least supported
by DB2; it allows you to specify field or table names when you do SELECT
FOR UPDATE on a join. That way you can tell it exactly what you expect
to be updating, and what exactly it should be locking.

> Perhaps this could be clarified if someone has an actual use case of
> wanting to SELECT FOR UPDATE from an outer join, and can explain what
> semantics they think they need for that.

Agreed. Hopefully the original author can provide one.
-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461