Thread: UPDATE ... RETURNING atomicity
Hello, In this query: UPDATE foo SET allocated_to=? WHERE id=(SELECT MIN(id) FROM foo WHERE allocated_to IS NULL) AND allocated_to IS NULL RETURNING id Is it guaranteed in any way that there will only be one id allocated and returned even if multiple clients are executing this query concurrently? Or is there a possibility that some other client executing this query (or another query modifying allocated_to) might set allocated_to to non-NULL and commit right after the inner select finds it as NULL, so the outer "AND allocated_to IS NULL" will no longer be true, and the outer query will return nothing? Thanks.
every single query in postrgresql runs as a transaction, on top of it, some are atomic, like when you use RETURNING statement. This is because postgresql doesn't actually have to select these rows as separate query.
On 05/23/2010 03:15 PM, Grzegorz Jaśkiewicz wrote: > every single query in postrgresql runs as a transaction, on top of it, > some are atomic, like when you use RETURNING statement. This is > because postgresql doesn't actually have to select these rows as > separate query. Please note the cooperation of the UPDATE and the inner sub-SELECT query, which was my point.
by default query is wrapped in a transaction (if it is not run under a transaction). And this will be default transaction isolation level. some people think it works magic, but that's not true. find in docs part that talks about transaction isolation levels, and translate it to your problem.
=?UTF-8?Q?Grzegorz_Ja=C5=9Bkiewicz?= <gryzman@gmail.com> writes: > find in docs part that talks about transaction isolation levels, and > translate it to your problem. Yes, please read the fine manual: http://www.postgresql.org/docs/8.4/static/mvcc.html What I think will happen in your example is that all concurrent executions will locate the same row-to-be-updated. The first one to get to the row "wins" and updates the row. All the rest will fail, either updating no rows (if not serializable) or throwing an error (if serializable). regards, tom lane
On 05/23/2010 08:19 PM, Tom Lane wrote: > =?UTF-8?Q?Grzegorz_Ja=C5=9Bkiewicz?=<gryzman@gmail.com> writes: >> find in docs part that talks about transaction isolation levels, and >> translate it to your problem. > > Yes, please read the fine manual: > http://www.postgresql.org/docs/8.4/static/mvcc.html > > What I think will happen in your example is that all concurrent > executions will locate the same row-to-be-updated. The first one to get > to the row "wins" and updates the row. All the rest will fail, either > updating no rows (if not serializable) or throwing an error (if > serializable). > OK, thank you both, I had hoped that UPDATE would take a table level lock before running the inner select. But then I read that the type of locking done by UPDATE never conflicts with other such locks, so the queries would still run concurrently. We're running the default Read Commited mode. It's no problem for me to rewrite the Perl DBI query to check the return value and loop until it does get something. Which would have better performance: that, or an explicit LOCK on the table before the UPDATE ... SELECT? The transaction is committed shortly after, with no other queries in between. Thank you.
don't lock tables explicitly. That's a killer for (concurrent) performance. Just write queries properly, and use appropriate transaction level. And you are sorted.
On 05/24/2010 01:29 AM, Grzegorz Jaśkiewicz wrote: > don't lock tables explicitly. That's a killer for (concurrent) performance. > Just write queries properly, and use appropriate transaction level. > And you are sorted. Read Committed is fine, as long as I restart the UPDATE query RETURNING nothing. The specifics of our app allow retrying the said query a few times and if it still did not get the id (like during the improbable total ID exhaustion), then pass through, this is considered a tolerable soft error. I suspect retrying just a single query is less expensive than retrying the failed serializable transaction, which is more heavy-weight in nature (and in practice). BTW, regarding your comment on avoiding to use explicit LOCKs: in one place which wasn't speed-sensitive I had to use the strictest LOCK mode because otherwise deadlocks occurred from time to time.
On 05/23/2010 02:15 AM, rihad wrote: > In this query: > UPDATE foo > SET allocated_to=? > WHERE id=(SELECT MIN(id) FROM foo WHERE allocated_to IS NULL) > AND allocated_to IS NULL > RETURNING id Isn't the "AND allocated_to IS NULL" clause redundant? -- Lew