Re: Select for update with offset interferes with concurrent transactions - Mailing list pgsql-general

From Andy Colson
Subject Re: Select for update with offset interferes with concurrent transactions
Date
Msg-id 4D483EBB.2090307@squeakycode.net
Whole thread Raw
In response to Re: Select for update with offset interferes with concurrent transactions  ("Yngve Nysaeter Pettersen" <yngve@opera.com>)
Responses Re: Select for update with offset interferes with concurrent transactions
List pgsql-general
On 2/1/2011 10:59 AM, Yngve Nysaeter Pettersen wrote:
> Hi,
>
> Thanks for the quick answer, Andy.
>
> On Tue, 01 Feb 2011 16:19:17 +0100, Andy Colson <andy@squeakycode.net>
> wrote:
>
> <snip>
>> So, if I understand correctly, you:
>>
>> q = SELECT record_id FROM queue
>> WHERE project_id = my_project AND state = idle
>> LIMIT n OFFSET i FOR UPDATE
>> while not q.eof
>> update queue set state = started where record_id = x;
>> process record_id
>> update queue set state = finsihed where record_id = x;
>> q.next;
>
> Almost, the update to "started" is done for all selected elements first,
> releasing the lock, then the items are processed one at a time, marking
> each "finished" as they complete. (each processing step can take
> minutes, so keeping a lock the whole time is not an option)
>
>> Might I suggest and alternative:
>>
>> q = update queue set state = started
>> WHERE project_id = my_project AND state = idle
>> LIMIT n OFFSET i
>> RETURNING project_id;
>> idlist = @q;
>> commit;
>>
>> foreach x in idlist
>> process record_id
>> begin
>> update queue set state = finsihed where record_id = x;
>> commit;
>>
>> Forgive the part perl part python sudocode. Oh, and I've never done
>> this, no idea if it actually works. :-)
>
> Thanks for that suggestion, I'll take a look at it.
>
> While I hadn't caught on to the "RETURNING" part, I had been wondering
> if using a single step UPDATE might be a solution. One concern I have is
> how concurrent updates will affect the returned list (or if they will
> just be skipped, as SELECT would in normal transaction mode, if I
> understood correctly), or whether it might return with an error code (I
> know that the normal update return value is the number of updated items,
> just not sure if that applies for "RETURNING").
>
> Although, I will note that this process (if it works) will, sort of,
> make FOR UPDATE redundant. Or, if it doesn't, the current lock-policy
> might cause issues for concurrent updates for the use-cases where FOR
> UPDATE is relevant.
>

Yeah, I'd wondered the same thing.  It could be two updates hitting the
same row will deadlock, or maybe not, I'm not sure.  But I think its the
same as with the select, if you happen to have two limits that hit the
same range, you're in trouble.

I think the random limit thing is a race condition itself.  Whenever you
have multiple processes hitting the same rows you're going to run into
problems.  Have you thought of using a sequence instead of a random
limit?  Each process could get the next 100 record_id'd via  a sequence,
then there would be much less chance of deadlock.

-Andy

pgsql-general by date:

Previous
From: "Yngve Nysaeter Pettersen"
Date:
Subject: Re: Select for update with offset interferes with concurrent transactions
Next
From: Tom Lane
Date:
Subject: Re: Select for update with offset interferes with concurrent transactions