On Oct 25, 2009, at 10:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Now that we've got a hopefully-non-broken implementation of SELECT FOR
> UPDATE locking as a plan node, we can finally contemplate fixing two
> misbehaviors that are called out on the SELECT reference page:
>
> It is possible for a SELECT command using both LIMIT and FOR
> UPDATE/SHARE clauses to return fewer rows than specified by
> LIMIT. This
> is because LIMIT is applied first. The command selects the
> specified
> number of rows, but might then block trying to obtain a lock on
> one or
> more of them. Once the SELECT unblocks, the row might have been
> deleted
> or updated so that it does not meet the query WHERE condition
> anymore,
> in which case it will not be returned.
>
> Similarly, it is possible for a SELECT command using ORDER BY and
> FOR
> UPDATE/SHARE to return rows out of order. This is because ORDER
> BY is
> applied first. The command orders the result, but might then block
> trying to obtain a lock on one or more of the rows. Once the SELECT
> unblocks, one of the ordered columns might have been modified and
> be
> returned out of order. A workaround is to perform SELECT ... FOR
> UPDATE/SHARE and then SELECT ... ORDER BY.
>
> All that we have to do to fix the first one is to put the LockRows
> node
> below the Limit node instead of above it. The solution for the second
> one is to also put LockRows underneath the Sort node, and to regard
> its
> output as unsorted so that a Sort node will certainly be generated.
> (This in turn implies that we should prefer the cheapest-total plan
> for the rest of the query.)
This seems like it could potentially introduce a performance
regression, but the current behavior is so bizarre that it seems like
we should still change it.
> Does anyone have any objections to this? I can't see that it will
> break
> any applications that work today, but maybe I'm missing something.
I'm pretty excited about this. It is a nifty piece of foot-gun
removal. Thanks!
...Robert