select from update from select? - Mailing list pgsql-sql

From Dave Gomboc
Subject select from update from select?
Date
Msg-id Pine.LNX.4.44.0302180139210.30967-100000@boyne.cs.ualberta.ca
Whole thread Raw
Responses Re: select from update from select?  (Richard Huxton <dev@archonet.com>)
List pgsql-sql
On Mon, 17 Feb 2003, Tambet Matiisen wrote:

> Why not just store the work_order_id in temporary variable? You need to do
> SELECT FOR UDPATE in this case, to avoid reserving the same work twice.

This doesn't work -- for the reason you give:
> While testing this, I discovered, that it doesn't work exactly as I
> expected. Suppose one transaction locks row with work_queue_id = 1. Now
> before the first transactions has finished, second comes in and tries to
> get lock for the same row. SELECT FOR UPDATE in second transaction
> blocks as expected. But when first transaction finishes, the SELECT in
> second transaction returns 0 rows, not next row that satisfies the
> conditions. I'm not sure if this should be considered bug or feature.
> But you have to test if the query returned any rows anyway, because the
> same happens when there are no unreserved works in queue.

It's definitely a bug to me.  The purpose of the work_queue is hold the
pool of tasks to be done, and to allow different machines to indicate that
they will do a certain task without any master program to tell machines
what work to do.  With the above process, concurrent attempts block
(except for one), then they don't even get a row!  I definitely want
concurrent requests to not haggle over the same row, there'd be way too
much contention between machines, and it defeats the whole purpose of more
than one job being available simultaneously.  If PostgreSQL had a SKIP
LOCKED feature, I'd be in business, but (AFAIK) it doesn't?  Using NOWAIT
and doing a large number of retries using some random back-off scheme
seems like a big hack rather than something that would actually work
acceptably.

> If it is required, that calling this function always reserves one row, then
> you should LOCK whole table before doing SELECT. You don't have to use FOR
> UPDATE in this case.
> 
> lock work_queue_table share row exclusive;
> 
> select into temp_queue_id work_queue_id from work_queue,
> optimization_task_table
>     where reservation_time is null
>     and concordance is null
>     order by priority descending limit 1;
> 
> if found then
>     update work_queue_table set worker_id = 5,
>         reservation_time = 'now()'
>         where work_queue_id = temp_queue_id;
> end if;
> 
> Table level lock is released automatically when transaction ends. Of course
> if you don't do concurrent access, then you can just leave out both LOCK and
> FOR UPDATE.

Hmm.  I didn't consider locking the entire table.  I think this would be a
better way to go than the NOWAIT and retry-like-crazy?  At least this way
each concurrent request will grab unique work until such time as there is
no work left.  It's too bad that everything will stall, because the
actions don't need to be serialized, but I guess you can't have
everything.

Dave



pgsql-sql by date:

Previous
From: Josh Berkus
Date:
Subject: Re: How to make sequence skip existing key in table?
Next
From: Hans-Jürgen Schönig
Date:
Subject: Re: Trigger that will execute external program