Re: Endgame for all those SELECT FOR UPDATE changes: fix plan node order - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Endgame for all those SELECT FOR UPDATE changes: fix plan node order
Date
Msg-id 6197.1256677545@sss.pgh.pa.us
Whole thread Raw
In response to Re: Endgame for all those SELECT FOR UPDATE changes: fix plan node order  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
I wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Tue, Oct 27, 2009 at 1:06 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Right, the case would be something like

>>>         select * from
>>>           (select * from foo order by x limit n) ss
>>>         for update of ss;

>> That's a pretty odd construction.

> Dunno why you think that.  That's exactly what one would write if one
> wanted certain operations to execute in a different order than they're
> defined to execute in within a single query level.  We have not
> previously been very clear about the order of operations for FOR UPDATE
> locking relative to other steps, but now we will be.

Actually ... it strikes me that there is another way we could approach
this.  Namely, leave the semantics as-is (FOR UPDATE runs last) and
document that you can do
       select * from         (select * from foo for update) ss       order by x limit n;

if you need FOR UPDATE to run before sorting.  Or perhaps better,
redefine the ordering as ORDER BY then FOR UPDATE then LIMIT.  Swapping
FOR UPDATE and LIMIT has no performance cost and eliminates the worse of
the two complaints in the documentation, without breaking any working
queries AFAICS.  If you have the case where you want to cope with
concurrent updates to the sort key, then you can write the more
complicated query, and it's gonna cost ya.  But that's not a typical
usage, as proven by the fact that it took years to realize there was
a problem there.  So we shouldn't optimize for that usage at the expense
of cases where the sort key isn't expected to change.

It could be argued that this approach doesn't satisfy the principle of
least astonishment as well as doing FOR UPDATE first, but on reflection
I'm not sure I buy that.  The traditional definition has been that we
only lock the rows that are actually returned, and putting FOR UPDATE
underneath the sort will break that expectation.  If it's only underneath
LIMIT we can still meet that expectation.

So I'm liking this more the more I think about it ... and it's also
significantly less work than the other way would be.
        regards, tom lane


pgsql-hackers by date:

Previous
From: Greg Smith
Date:
Subject: Re: Parsing config files in a directory
Next
From: Tom Lane
Date:
Subject: Re: Extraneous newlines in logfile from vacuumdb