Re: Select For Update and Left Outer Join - Mailing list pgsql-hackers

From Patrick Earl
Subject Re: Select For Update and Left Outer Join
Date
Msg-id BANLkTinZ1V4CUMzHHJqACZO1RwZUZ=5Q3A@mail.gmail.com
Whole thread Raw
In response to Re: Select For Update and Left Outer Join  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Select For Update and Left Outer Join  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Sun, May 1, 2011 at 9:00 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Addition of new rows certainly isn't supposed to be prevented by a
> SELECT FOR UPDATE, but that's not the problem here.  What I *do* expect
> a SELECT FOR UPDATE to promise is that the rows it did return can't
> change or be deleted by someone else for the life of my transaction.
> This is not possible to promise for null-extended rows unless you
> somehow lock out addition of a matching row on the inside of the join.
> Without that, a row that looked like <pet fields, nulls> when you
> selected it might suddenly start looking like <pet fields, cat fields>
> due to someone else's modification.  And after that, since you still
> haven't got a lock on the cat row, the cat fields could keep on
> changing.
>
> I'm prepared to believe that there are some applications where that
> can't happen due to other interlocking, or doesn't matter to the
> application, but SELECT FOR UPDATE really can't assume that.  I think
> what you're proposing is to fundamentally break the semantics of SELECT
> FOR UPDATE for the sake of convenience.

While I don't have access to the SQL standard to check if the
semantics are documented, the semantics don't seem clear cut.  The
question is whether the thing that you've promised won't change is the
result row of the query or the source rows that built the result.  I
would like to know how other databases handle this, but I do know that
it works on MSSQL.  Perhaps it's using something like the source row
locking semantics I mentioned.

> You didn't explain exactly why your application doesn't care about this,
> but I wonder whether it's because you know that a lock on the parent
> "pet" row is sufficient due to application coding rules.  If so, you
> could just tell SELECT FOR UPDATE to only lock the "pet" rows, and be
> happy:
>
>        select * from pet left join cat ... for update of pet;

If I select a collection that contains both dogs and cats, run a
polymorphic operation that affects dogs and cats differently, then
save my results back, I would need locks on all rows, not just the pet
information.  When "parts" of a "row" are stored in different tables,
the results from both tables need to be locked.

I see a couple possible workarounds, neither of which are particularly
appealing:   * Run many queries, being careful to ensure the appropriate rows
from each table are locked.   * Lock all related tables.

Perhaps the syntax could be extended to indicate to the database that
it's safe to lock on just the selected rows.
      select * from pet left join cat ... for update of pet, cat;

On a conceptual level, I still tend to think of select for update as
"locking the data returned by the query."  If no data is returned by
the query, I don't get a lock on that non-data.  Is this an area that
is covered by the standard?

Thanks for your thoughts on this Tom.  Your time is much appreciated.
       Patrick Earl


pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Select For Update and Left Outer Join
Next
From: Tom Lane
Date:
Subject: Re: Select For Update and Left Outer Join