Thread: Re: [BUGS] Status of issue 4593
Peter Eisentraut wrote:
> On Tuesday 06 January 2009 02:03:14 Tom Lane wrote:
>> I don't think there's a bug here, at least not in the sense that it
>> isn't Operating As Designed.  But it does seem like we could do with
>> some more/better documentation about exactly how FOR UPDATE works.
>> The sequence of operations is evidently a bit more user-visible than
>> I'd realized.
>
> Well, if the effect of ORDER BY + FOR UPDATE is "it might in fact not
> be ordered", then it's pretty broken IMO.  By analogy, it would be
> pretty silly if, for example, the effect of GROUP BY + FOR UPDATE were
> "depending on concurrent events, it may or may not be fully grouped".

I can see two ways forward:

1) We document bluntly that ORDER BY + FOR UPDATE can return unordered
results, or

2) We prohibit ORDER BY + FOR UPDATE, like we do with a number of other
clauses.  (There would be no loss of functionality, because you can run
the query a second time in the transaction with ORDER BY.)

Comments?
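A minimal psql sketch of the second-query workaround described in option
2, using a hypothetical table t with a single integer column x (the
table and its data are illustrative only):

    -- hypothetical setup
    CREATE TABLE t (x int);
    INSERT INTO t VALUES (10), (20), (30), (40), (50);

    BEGIN;
    -- lock the rows first; the order this returns them in doesn't matter
    SELECT x FROM t FOR UPDATE;
    -- re-read within the same transaction: the rows are now locked
    -- against concurrent updates, so this result is correctly ordered
    SELECT x FROM t ORDER BY x;
    COMMIT;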
Peter Eisentraut <peter_e@gmx.net> writes:
> I can see two ways forward:

> 1) We document bluntly that ORDER BY + FOR UPDATE can return unordered
> results, or

> 2) We prohibit ORDER BY + FOR UPDATE, like we do with a number of other
> clauses.  (There would be no loss of functionality, because you can run
> the query a second time in the transaction with ORDER BY.)

That code has been working like this for eight or ten years now and this
is the first complaint, so taking away functionality on the grounds that
someone might happen to update the ordering column doesn't seem like the
answer to me.

			regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> writes:
> Peter Eisentraut <peter_e@gmx.net> writes:
>> I can see two ways forward:
>
>> 1) We document bluntly that ORDER BY + FOR UPDATE can return unordered
>> results, or
>
>> 2) We prohibit ORDER BY + FOR UPDATE, like we do with a number of other
>> clauses.  (There would be no loss of functionality, because you can run
>> the query a second time in the transaction with ORDER BY.)
>
> That code has been working like this for eight or ten years now and this
> is the first complaint, so taking away functionality on the grounds that
> someone might happen to update the ordering column doesn't seem like the
> answer to me.

Can we detect it at run-time?  If a recheck happens can we somehow know
which columns could be problematic to find updated and check that they're
unchanged?

I'm pretty sure the answer is no, but I figured I would throw it out
there in case it gives anyone an idea.

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Ask me about EnterpriseDB's RemoteDBA services!
While this behavior may be very old, I would still contend that it is
incorrect (or at least inconsistent with one's expectations).  If it
will not be changed, some additional documentation might be helpful.
Perhaps a WARNING could be raised (unconditionally, as it might be a bit
intensive to detect when the problem has occurred)?

It might be nice, though it would obviously be an extension, to have a
session-level setting that causes two queries to be run independently
whenever both clauses are present (first the FOR UPDATE, then the ORDER
BY), so one doesn't have to handle this in every place that uses these
two clauses together.  Alternatively, since two queries are less
efficient, perhaps when this setting is active a temporary function
could be created to implement the workaround Tom mentioned earlier in
the thread automatically (though creating and compiling the function
might be slower than the second SELECT in some cases).  Either way,
this would allow for correct behavior without a great deal of schema
and code modification.  One possible shape of such a function is
sketched below.

I guess my real point is that while no one else has complained about
it, I'm confident others do and will continue to use these two clauses
in the same query.  In general, when one uses ORDER BY, they expect
their results ordered based on the data returned.  Being able to
explain the reason for this inconsistency is nice, but it certainly
doesn't invalidate the problem.

-Lee

-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Monday, January 12, 2009 7:33 AM
To: Peter Eisentraut
Cc: PG Hackers; Jeff Davis; Lee McKeeman
Subject: Re: [BUGS] Status of issue 4593

Peter Eisentraut <peter_e@gmx.net> writes:
> I can see two ways forward:
> 1) We document bluntly that ORDER BY + FOR UPDATE can return unordered
> results, or
> 2) We prohibit ORDER BY + FOR UPDATE, like we do with a number of other
> clauses.  (There would be no loss of functionality, because you can run
> the query a second time in the transaction with ORDER BY.)

That code has been working like this for eight or ten years now and this
is the first complaint, so taking away functionality on the grounds that
someone might happen to update the ordering column doesn't seem like the
answer to me.

			regards, tom lane
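One possible shape of the temporary-function idea Lee floats, written as
a PL/pgSQL sketch against the same hypothetical table t; the function
name and body are illustrative, not an implementation from the thread:

    CREATE FUNCTION locked_ordered_rows() RETURNS SETOF t AS $$
    BEGIN
        -- take the row locks first, discarding the unordered result
        PERFORM x FROM t FOR UPDATE;
        -- then read the now-locked rows back in order
        RETURN QUERY SELECT * FROM t ORDER BY x;
    END;
    $$ LANGUAGE plpgsql;

    -- used inside a transaction so the locks are still held:
    BEGIN;
    SELECT * FROM locked_ordered_rows();
    COMMIT;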
On Mon, 2009-01-12 at 15:26 +0200, Peter Eisentraut wrote:
> 1) We document bluntly that ORDER BY + FOR UPDATE can return unordered
> results, or
>
> 2) We prohibit ORDER BY + FOR UPDATE, like we do with a number of other
> clauses.  (There would be no loss of functionality, because you can run
> the query a second time in the transaction with ORDER BY.)

I like Lee's idea of a WARNING plus a documentation note -- seems like a
reasonable compromise.  Maybe we can add the prohibition later if we
still don't have a fix for it.

Regards,
	Jeff Davis
On Mon, Jan 12, 2009 at 8:32 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Peter Eisentraut <peter_e@gmx.net> writes:
>> I can see two ways forward:
>
>> 1) We document bluntly that ORDER BY + FOR UPDATE can return unordered
>> results, or
>
>> 2) We prohibit ORDER BY + FOR UPDATE, like we do with a number of other
>> clauses.  (There would be no loss of functionality, because you can run
>> the query a second time in the transaction with ORDER BY.)
>
> That code has been working like this for eight or ten years now and this
> is the first complaint, so taking away functionality on the grounds that
> someone might happen to update the ordering column doesn't seem like the
> answer to me.

If the only case where ORDER BY + FOR UPDATE are not strictly compatible
is when the columns being updated are the same as the columns of the
sort, a blanket prohibition against using the two together seems like it
prohibits an awful lot of useful things someone might want to do.

Saying that you can run the query a second time as a workaround so
there's no loss of functionality is true only if you accept the
proposition that performance is not a requirement.

...Robert
On Mon, 2009-01-12 at 08:32 -0500, Tom Lane wrote:
> That code has been working like this for eight or ten years now and this
> is the first complaint, so taking away functionality on the grounds that
> someone might happen to update the ordering column doesn't seem like the
> answer to me.

If they are using FOR UPDATE, they clearly expect concurrent updates.
If they're using ORDER BY, they clearly expect the results to be in
order.  So who is the target user of this functionality we're trying to
protect?

Regards,
	Jeff Davis
On Mon, 2009-01-12 at 12:47 -0500, Robert Haas wrote:
> If the only case where ORDER BY + FOR UPDATE are not strictly
> compatible is when the columns being updated are the same as the
> columns of the sort, a blanket prohibition against using the two
> together seems like it prohibits an awful lot of useful things someone
> might want to do.

As long as the people using it are aware that they can't update the
ordering columns, it may make sense to leave that functionality in
there.

Can you expand on "an awful lot of useful things"?  It seems like an
edge case to me, and the restriction it imposes is quite awkward to
meet: "OK, nobody ever update these fields in this table."

Regards,
	Jeff Davis
"Robert Haas" <robertmhaas@gmail.com> writes: > On Mon, Jan 12, 2009 at 8:32 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> That code has been working like this for eight or ten years now and this >> is the first complaint, so taking away functionality on the grounds that >> someone might happen to update the ordering column doesn't seem like the >> answer to me. > If the only case where ORDER BY + FOR UPDATE are not strictly > compatible is when the columns being updated are the same as the > columns of the sort, a blanket prohibition against using the two > together seems like it prohibits an awful lot of useful things someone > might want to do. Exactly. > Saying that you can run the query a second time as > a workaround so there's no loss of functionality is true only if you > accept the proposition that performance is not a requirement. Right, and also I'm unconvinced that it is really equivalent. If you've got something like an ORDER BY LIMIT and the ordering columns are changing, you may very well get a different set of rows from the second query ... not all of which got locked the first time, so there's going to be some tail-chasing involved there. A re-sort after locking doesn't really make things all nice and intuitive either. Suppose that the values of X are 10,20,30,40,50 and we do SELECT ... ORDER BY x LIMIT 3 FOR UPDATE. Concurrently someone updates the 20 to 100. The existing code locks the 10,20,30 rows, then notices the 20 got updated to 100, and returns you 10,100,30. If it re-sorted you would get 10,30,100, but on what grounds is that the correct answer and not 10,20,40? If you want to argue that 10,20,40 is the correct answer, how are you going to arrive at it without locking more rows than are returned? And just to bend your brain a bit more, what if the same update command that changed 20 to 100 also changed 50 to 1? Surely if we take the one row change into account in determining the sort order, we ought to notice that one too. However, we aren't even going to *see* that row unless we traverse the entire table. I think the behavior Lee is expecting is only implementable with a full-table write lock, which is exactly what FOR UPDATE is designed to avoid. There are certain properties you don't get with a partial lock, and in the end I think we can't do much except document them. We have LOCK TABLE for those who need the other behavior. regards, tom lane
>>> Tom Lane <tgl@sss.pgh.pa.us> wrote: > A re-sort after locking doesn't really make things all nice and > intuitive either. Suppose that the values of X are 10,20,30,40,50 > and we do SELECT ... ORDER BY x LIMIT 3 FOR UPDATE. Concurrently > someone updates the 20 to 100. The existing code locks the 10,20,30 > rows, then notices the 20 got updated to 100, and returns you > 10,100,30. If it re-sorted you would get 10,30,100, but on what > grounds is that the correct answer and not 10,20,40? If you want > to argue that 10,20,40 is the correct answer, how are you going to > arrive at it without locking more rows than are returned? Would it make any sense to roll back and generate a SERIALIZATION_FAILURE? In other databases I've seen this failure code used for all situations where they can't perform the requested operation due to concurrent transactions, regardless of transaction isolation level. That seems more intuitive to me than any of the alternatives discussed so far, which all seem to involve quietly returning something other than what was specified. -Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > Tom Lane <tgl@sss.pgh.pa.us> wrote: >> A re-sort after locking doesn't really make things all nice and >> intuitive either. > Would it make any sense to roll back and generate a > SERIALIZATION_FAILURE? If that's what you want then you run the transaction in serializable mode. The point of doing it in READ COMMITTED mode is that you don't want such a failure. regards, tom lane
>>> Tom Lane <tgl@sss.pgh.pa.us> wrote: > "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: >> Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> A re-sort after locking doesn't really make things all nice and >>> intuitive either. > >> Would it make any sense to roll back and generate a >> SERIALIZATION_FAILURE? > > If that's what you want then you run the transaction in serializable > mode. The point of doing it in READ COMMITTED mode is that you don't > want such a failure. Well, that's a PostgreSQL-specific point of view, although I understand the point of maintaining that guarantee. (In Microsoft SQL Server and Sybase ASE we actually had to run our read-only web application at the READ UNCOMMITTED transaction isolation level because so many SELECT queries were rolled back when they deadlocked with the traffic from replication when they were all running at READ COMMITTED.) If you run this at SERIALIZABLE transaction isolation level, would PostgreSQL currently roll something back before returning rows in an order different than that specified by the ORDER BY clause? -Kevin
On Mon, 2009-01-12 at 13:35 -0500, Tom Lane wrote:
> I think the behavior Lee is expecting is only implementable with a
> full-table write lock, which is exactly what FOR UPDATE is designed
> to avoid.  There are certain properties you don't get with a partial
> lock, and in the end I think we can't do much except document them.
> We have LOCK TABLE for those who need the other behavior.

Lee said specifically that he's not using LIMIT, and there's already a
pretty visible warning in the docs for using LIMIT with FOR UPDATE.
Also, using LIMIT + FOR UPDATE has a dangerous-looking quality to it (at
least to me) that would cause me to do a little more investigation
before relying on its behavior.

I'm not pushing for FOR UPDATE + ORDER BY to be blocked outright, but I
think it's strange enough that it should be considered some kind of
defect worse than the cases involving LIMIT that you mention.

Regards,
	Jeff Davis
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > Tom Lane <tgl@sss.pgh.pa.us> wrote: >> If that's what you want then you run the transaction in serializable >> mode. > If you run this at SERIALIZABLE transaction isolation level, would > PostgreSQL currently roll something back before returning rows in an > order different than that specified by the ORDER BY clause? Yes, it would roll back as soon as it found that any of the selected rows had been concurrently modified. The behavior where you get back the updated version of the row is specific to READ COMMITTED mode. regards, tom lane
Tom Lane wrote:
> "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
>> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> A re-sort after locking doesn't really make things all nice and
>>> intuitive either.
>
>> Would it make any sense to roll back and generate a
>> SERIALIZATION_FAILURE?
>
> If that's what you want then you run the transaction in serializable
> mode.  The point of doing it in READ COMMITTED mode is that you don't
> want such a failure.

Well, you can get deadlocks in read committed mode, so it is not like
this mode is totally free of concurrency-related failure possibilities.
Both serialization errors and deadlocks assume a write operation,
though.

But could we detect this case at all?  That is, when we are re-reading
the updated tuple, do we remember that we did some sorting earlier?
Kevin Grittner wrote:
> Well, that's a PostgreSQL-specific point of view, although I
> understand the point of maintaining that guarantee.  (In Microsoft SQL
> Server and Sybase ASE we actually had to run our read-only web
> application at the READ UNCOMMITTED transaction isolation level
> because so many SELECT queries were rolled back when they deadlocked
> with the traffic from replication when they were all running at READ
> COMMITTED.)

Per SQL standard, READ UNCOMMITTED mode requires READ ONLY transaction
access mode, so you couldn't do FOR UPDATE there anyway.  (Of course,
FOR UPDATE is not in the standard.)

> If you run this at SERIALIZABLE transaction isolation level, would
> PostgreSQL currently roll something back before returning rows in an
> order different than that specified by the ORDER BY clause?

Yes, but using FOR UPDATE is kind of pointless in serializable mode.
>>> Peter Eisentraut <peter_e@gmx.net> wrote: > Kevin Grittner wrote: >> (In Microsoft SQL Server and Sybase ASE we actually had to run our >> read-only web application at the READ UNCOMMITTED transaction >> isolation level because so many SELECT queries were rolled back >> when they deadlocked with the traffic from replication when they >> were all running at READ COMMITTED.) > > Per SQL standard, READ UNCOMMITTED mode requires READ ONLY transaction > access mode, so you couldn't do FOR UPDATE there anyway. My only point was that other DBMSs often generate serialization failures on SELECT-only transactions in READ COMMITTED mode. Sometimes quite a few of them. I also recognized that if PostgreSQL can provide guarantees not required by the standard that such things won't happen, I can see the value of that. >> If you run this at SERIALIZABLE transaction isolation level, would >> PostgreSQL currently roll something back before returning rows in an >> order different than that specified by the ORDER BY clause? > > Yes, but using FOR UPDATE is kind of pointless in serializable mode. Well, for transactions which only SELECT, sure. Is there no use case for them outside of that? (That's not rhetorical -- I'm not really sure.) -Kevin
>>> Tom Lane <tgl@sss.pgh.pa.us> wrote: > "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: >> Would it make any sense to roll back and generate a >> SERIALIZATION_FAILURE? > > If that's what you want then you run the transaction in serializable > mode. The point of doing it in READ COMMITTED mode is that you > don't want such a failure. Wait a minute -- there is not such guarantee in PostgreSQL when you start using WITH UPDATE on SELECT statements in READ COMMITTED mode. By starting two transactions in READ COMMITTED, and having each do two SELECTs WITH UPDATE (in opposite order) I was able to generate this: ERROR: deadlock detected DETAIL: Process 4945 waits for ShareLock on transaction 20234373; blocked by process 5185. Process 5185 waits for ShareLock on transaction 20233798; blocked by process 4945. So, wouldn't it be better, if it's actually feasible to detect the problem situation, to make this another situation where SELECT FOR UPDATE can cause serialization failures? That would allow applications to count on getting the rows in the requested order if the query completes successfully. If existing documentation doesn't already cover the possibility of serialization failures when using FOR UPDATE, it should. If we need to document something around the issue of this thread, that seems like the place to do it. -Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > Tom Lane <tgl@sss.pgh.pa.us> wrote: >> If that's what you want then you run the transaction in serializable >> mode. The point of doing it in READ COMMITTED mode is that you >> don't want such a failure. > Wait a minute -- there is not such guarantee in PostgreSQL when you > start using WITH UPDATE on SELECT statements in READ COMMITTED mode. > By starting two transactions in READ COMMITTED, and having each do two > SELECTs WITH UPDATE (in opposite order) I was able to generate this: > ERROR: deadlock detected Huh? Deadlocks were not the issue here. What you asked for was a failure if someone else had updated the rows you're selecting for update. regards, tom lane
>>> Tom Lane <tgl@sss.pgh.pa.us> wrote: > Huh? Deadlocks were not the issue here. What you asked for was a > failure if someone else had updated the rows you're selecting for > update. Logically, these are both forms of serialization failure when doing SELECT FOR UPDATE in READ COMMITTED mode. One currently deadlocks, generating an error that requires a retry. The other quietly fails to return the requested results. I'm suggesting that it would be better to generate an error with an indication of the serialization failure. You said that people use READ COMMITTED to avoid such errors. I'm pointing out that they can currently get serialization failure errors in READ COMMITTED if they choose to SELECT FOR UPDATE. -Kevin
Kevin,

> So, wouldn't it be better, if it's actually feasible to detect the
> problem situation, to make this another situation where SELECT FOR
> UPDATE can cause serialization failures?  That would allow
> applications to count on getting the rows in the requested order if
> the query completes successfully.  If existing documentation doesn't
> already cover the possibility of serialization failures when using
> FOR UPDATE, it should.  If we need to document something around the
> issue of this thread, that seems like the place to do it.

That's not how SELECT FOR UPDATE works.  SFU is pessimistic manual
locking, which is supposed to *wait* for the rows to be exclusively
available.  The deadlock error you encountered is the correct
behaviour, not "serialization failure", which is what happens at commit
time when the engine realizes that concurrent transactions are not
serializably recreatable.

--Josh
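A sketch of the waiting behaviour Josh describes, with the same
hypothetical table t:

    -- Session A:
    BEGIN;
    SELECT x FROM t WHERE x = 10 FOR UPDATE;   -- acquires the row lock

    -- Session B:
    SELECT x FROM t WHERE x = 10 FOR UPDATE;
    -- simply waits here, with no error, until session A commits or
    -- rolls back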
>>> Josh Berkus <josh@agliodbs.com> wrote: > That's not how SELECT FOR UPDATE works. SFU is pessimistic manual > locking, which is supposed to *wait* for the rows to be exclusively > available. The deadlock timeout you encountered is the correct > behaviour, not "serialization failure", which is what happens at commit > time when the engine realizes that concurrent transactions are not > serializably recreateable. Deadlocks like this are the only kind of serialization error possible under "traditional" (non-MVCC) databases. These are much more rare in MVCC than update conflicts, but that doesn't mean they aren't serialization failures there, too. I think it is a violation of the standard for PostgreSQL not to report them with SQLSTATE '40001'. -Kevin
> Deadlocks like this are the only kind of serialization error possible > under "traditional" (non-MVCC) databases. These are much more rare in > MVCC than update conflicts, but that doesn't mean they aren't > serialization failures there, too. I think it is a violation of the > standard for PostgreSQL not to report them with SQLSTATE '40001'. I'm not sure that inductive reasoning applies to the SQL standard. And we'd break 100,000 existing Java applications if we changed the error. In my opinion, this falls under the heading of "it would be a nice thing to do if we were starting over, but not now." --Josh
>>> Josh Berkus <josh@agliodbs.com> wrote: > we'd break 100,000 existing Java applications if we changed the error. In what way would an application want to treat deadlocks and update conflicts differently? Both result from conflicts with concurrent transactions and can be retried automatically. It seems like an implementation detail with little chance of impact on applications to me. Can anyone provide a contrary example or argument? -Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: >>>> Josh Berkus <josh@agliodbs.com> wrote: >> we'd break 100,000 existing Java applications if we changed the > error. > > In what way would an application want to treat deadlocks and update > conflicts differently? Both result from conflicts with concurrent > transactions and can be retried automatically. It seems like an > implementation detail with little chance of impact on applications to > me. Can anyone provide a contrary example or argument? Well generally deadlocks are treated differently in that they are treated by rewriting the application to not cause deadlocks. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about EnterpriseDB'sPostgreSQL training!
>>> Gregory Stark <stark@enterprisedb.com> wrote: > "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: >> In what way would an application want to treat deadlocks and update >> conflicts differently? Both result from conflicts with concurrent >> transactions and can be retried automatically. It seems like an >> implementation detail with little chance of impact on applications to >> me. Can anyone provide a contrary example or argument? > > Well generally deadlocks are treated differently in that they are treated by > rewriting the application to not cause deadlocks. I certainly don't propose changing the PostgreSQL error number or the content of what is logged. Just the SQLSTATE. How would that make what you suggest harder? It would certainly allow applications and frameworks which are SQLSTATE-aware to automatically recover from these until the rewrite is complete, which can hardly be considered a bad thing. -Kevin
Jeff Davis wrote:
> On Mon, 2009-01-12 at 13:35 -0500, Tom Lane wrote:
> > I think the behavior Lee is expecting is only implementable with a
> > full-table write lock, which is exactly what FOR UPDATE is designed
> > to avoid.  There are certain properties you don't get with a partial
> > lock, and in the end I think we can't do much except document them.
> > We have LOCK TABLE for those who need the other behavior.
>
> Lee said specifically that he's not using LIMIT, and there's already a
> pretty visible warning in the docs for using LIMIT with FOR UPDATE.
> Also, using LIMIT + FOR UPDATE has a dangerous-looking quality to it (at
> least to me) that would cause me to do a little more investigation
> before relying on its behavior.
>
> I'm not pushing for FOR UPDATE + ORDER BY to be blocked outright, but I
> think it's strange enough that it should be considered some kind of
> defect worse than the cases involving LIMIT that you mention.

I have added the attached documentation mention to CVS HEAD and 8.3.X.
If people want a TODO entry or to issue a WARNING message on use, please
let me know.  This does seem similar to the FOR UPDATE / LIMIT issue so
I handled it similarly.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Index: doc/src/sgml/ref/select.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/ref/select.sgml,v
retrieving revision 1.117
diff -c -c -r1.117 select.sgml
*** doc/src/sgml/ref/select.sgml	12 Jan 2009 14:06:20 -0000	1.117
--- doc/src/sgml/ref/select.sgml	22 Jan 2009 22:50:20 -0000
***************
*** 1162,1177 ****
    <caution>
     <para>
      It is possible for a <command>SELECT</> command using both
!     <literal>LIMIT</literal> and <literal>FOR UPDATE/SHARE</literal>
      clauses to return fewer rows than specified by
      <literal>LIMIT</literal>.  This is because <literal>LIMIT</> is
      applied first.  The command selects the specified number of rows,
!     but might then block trying to obtain lock on one or more of them.
      Once the <literal>SELECT</> unblocks, the row might have been deleted
      or updated so that it does not meet the query <literal>WHERE</>
      condition anymore, in which case it will not be returned.
     </para>
    </caution>
   </refsect2>

   <refsect2 id="SQL-TABLE">
--- 1162,1192 ----
    <caution>
     <para>
      It is possible for a <command>SELECT</> command using both
!     <literal>LIMIT</literal> and <literal>FOR UPDATE/SHARE</literal>
      clauses to return fewer rows than specified by
      <literal>LIMIT</literal>.  This is because <literal>LIMIT</> is
      applied first.  The command selects the specified number of rows,
!     but might then block trying to obtain a lock on one or more of them.
      Once the <literal>SELECT</> unblocks, the row might have been deleted
      or updated so that it does not meet the query <literal>WHERE</>
      condition anymore, in which case it will not be returned.
     </para>
    </caution>
+ 
+   <caution>
+    <para>
+     Similarly, it is possible for a <command>SELECT</> command
+     using <literal>ORDER BY</literal> and <literal>FOR
+     UPDATE/SHARE</literal> to return rows out of order.  This is
+     because <literal>ORDER BY</> is applied first.  The command
+     orders the result, but might then block trying to obtain a lock
+     on one or more of the rows.  Once the <literal>SELECT</>
+     unblocks, one of the ordered columns might have been modified
+     and be returned out of order.  A workaround is to perform
+     <command>SELECT ... FOR UPDATE/SHARE</> and then <command>SELECT
+     ... ORDER BY</>.
+    </para>
+   </caution>
   </refsect2>

   <refsect2 id="SQL-TABLE">