Re: Select for update with offset interferes with concurrent transactions - Mailing list pgsql-general

From: Andy Colson
Subject: Re: Select for update with offset interferes with concurrent transactions
Msg-id: 4D48593F.6050107@squeakycode.net
In response to: Re: Select for update with offset interferes with concurrent transactions ("Yngve Nysaeter Pettersen" <yngve@opera.com>)
Responses: Re: Select for update with offset interferes with concurrent transactions
List: pgsql-general
On 2/1/2011 12:51 PM, Yngve Nysaeter Pettersen wrote:
>
> Thanks Andy,
>
> On Tue, 01 Feb 2011 19:29:08 +0100, Andy Colson <andy@squeakycode.net>
> wrote:
>
>> On 2/1/2011 12:10 PM, Yngve Nysaeter Pettersen wrote:
>>> On Tue, 01 Feb 2011 18:11:23 +0100, Andy Colson <andy@squeakycode.net> wrote:
>>>> I think the random limit thing is a race condition itself. Whenever
>>>> you have multiple processes hitting the same rows you're going to run
>>>> into problems. Have you thought of using a sequence instead of a
>>>> random limit? Each process could get the next 100 record_ids via a
>>>> sequence; then there would be much less chance of deadlock.
>>>
>>> How would that work, in case you would like to provide an example?
>>>
>>> I am not really familiar with sequences, as I have only seen them used
>>> for the "id" field in Django generated tables.
>>>
>>> In case it is relevant, the processes do not (currently, at least)
>>> have a unique ID, though they do have a local sequence number on the
>>> machine they are running on.
>>>
>>>
>>
>> I have a really simple q table I use.
>>
>> create table q (id integer not null, msg integer, primary key(id));
>> create sequence q_add;
>> create sequence q_read;
>>
>> I insert via q_add:
>>
>> andy=# insert into q(id, msg) values(nextval('q_add'), 20);
>> INSERT 0 1
>> andy=# insert into q(id, msg) values(nextval('q_add'), 4);
>> INSERT 0 1
>> andy=# select * from q;
>>  id | msg
>> ----+-----
>>   1 |  20
>>   2 |   4
>> (2 rows)
>>
>>
>> Then I run multiple batch procs, each of which grabs its next job like this:
>>
>> andy=# select msg from q where id = (select nextval('q_read'));
>> msg
>> -----
>> 20
>> (1 row)
>>
>> andy=# select msg from q where id = (select nextval('q_read'));
>> msg
>> -----
>> 4
>> (1 row)
>>
>>
>> It works for me because I can empty the q table, reset the q_add and
>> q_read sequences and start over clean. Not sure if it would work for
>> your setup.
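>>
>> FWIW, the reset itself is just truncate-plus-restart, something like
>> this (only safe while no batch procs are running):
>>
>> andy=# truncate q;
>> TRUNCATE TABLE
>> andy=# alter sequence q_add restart with 1;
>> ALTER SEQUENCE
>> andy=# alter sequence q_read restart with 1;
>> ALTER SEQUENCE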
>
> I see how that would work (it is essentially how Django assigns row ids).
>
> My current setup can have multiple runs configured at a time (at one
> point several dozen were queued), each with its own priority, and runs
> can, at least in theory, be set up in parallel (even the tasks within a
> single run are set up in parallel). That means the ids belonging to one
> run would not be sequential (a single sequence feeds the id field of
> every row in the table), unless ids could somehow be allocated per
> individual run/project. Multiple sequence objects, one for each run,
> might be an option, but I don't like that possibility. And, as I
> mentioned elsewhere in the thread, I might make the queuing a bit more
> complex, which would complicate this system even further.
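>
> (Concretely, that option would mean a pair of sequences per run plus a
> run-local counter column, something like this, with all names invented:
>
>   create sequence q_read_run42;
>   select * from tasks
>   where run_id = 42 and run_seq = nextval('q_read_run42');
>
> and that proliferation of sequence objects is what I would rather avoid.)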
>
> So, AFAICT I am afraid it would not work in the general case for my
> project :( .
> However, it might be useful in somebody else's project :) .
>

No, I didn't think it would work for you; yours looks much more
complicated than mine.  Just out of curiosity, have you looked at PgQ?
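
If you do try PgQ, the basic flow (going from the Skytools docs; I
haven't used it myself, so double-check the function names, and it needs
the pgq schema installed plus the ticker daemon running; queue, consumer,
and event names below are just for illustration) is roughly:

-- producer
select pgq.create_queue('job_q');
select pgq.insert_event('job_q', 'job', 'record_id=42');

-- consumer
select pgq.register_consumer('job_q', 'worker_1');
select pgq.next_batch('job_q', 'worker_1');  -- NULL until the ticker closes a batch
select * from pgq.get_batch_events(1);       -- 1 = the batch id from next_batch
select pgq.finish_batch(1);

One caveat, as far as I know: every registered consumer sees every
event, so splitting a single stream of jobs across several workers takes
the cooperative-consumer add-on (pgq_coop) rather than plain pgq.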

-Andy
