Re: Select for update with offset interferes with concurrent transactions - Mailing list pgsql-general

From: Yngve Nysaeter Pettersen
Subject: Re: Select for update with offset interferes with concurrent transactions
Date:
Msg-id: op.vp8mntpsvqd7e2@killashandra.oslo.osa
In response to: Re: Select for update with offset interferes with concurrent transactions (Andy Colson <andy@squeakycode.net>)
List: pgsql-general
Thanks Andy,

On Tue, 01 Feb 2011 19:29:08 +0100, Andy Colson <andy@squeakycode.net>
wrote:

> On 2/1/2011 12:10 PM, Yngve Nysaeter Pettersen wrote:
>> On Tue, 01 Feb 2011 18:11:23 +0100, Andy Colson <andy@squeakycode.net>
>> wrote:
>>> I think the random limit thing is a race condition itself. Whenever
>>> you have multiple processes hitting the same rows you're going to run
>>> into problems. Have you thought of using a sequence instead of a
>>> random limit? Each process could get the next 100 record_ids via a
>>> sequence, then there would be much less chance of deadlock.
>>
>> How would that work, in case you would like to provide an example?
>>
>> I am not really familiar with sequences, as I have only seen them used
>> for the "id" field in Django generated tables.
>>
>> In case it is relevant, the processes do not (currently, at least)
>> have a unique ID; though they have a local sequence number for the
>> machine they are running on.
>>
>>
>
> I have a really simple q table I use.
>
>      create table q (id integer not null, msg integer, primary key(id));
>      create sequence q_add;
>      create sequence q_read;
>
> I insert via q_add:
>
> andy=# insert into q(id, msg) values(nextval('q_add'), 20);
> INSERT 0 1
> andy=# insert into q(id, msg) values(nextval('q_add'), 4);
> INSERT 0 1
> andy=# select * from q;
>   id | msg
> ----+-----
>    1 |  20
>    2 |   4
> (2 rows)
>
>
> Then I run multiple batch procs which get their next job like:
>
> andy=# select msg from q where id = (select nextval('q_read'));
>   msg
> -----
>    20
> (1 row)
>
> andy=# select msg from q where id = (select nextval('q_read'));
>   msg
> -----
>     4
> (1 row)
>
>
> It works for me because I can empty the q table, reset the q_add and
> q_read sequences and start over clean.  Not sure if it would work for
> your setup.
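
For the reset step, I assume you mean something along these lines
(untested):

   truncate q;
   -- with is_called = false, the next nextval() returns 1 again
   select setval('q_add', 1, false);
   select setval('q_read', 1, false);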

I see how that would work (it is essentially how Django assigns row ids).
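
For the batch-of-100 variant you suggested, I imagine a sequence created
with "increment by 100" would do it; something like this (untested, and
q_read_batch is a made-up name):

   create sequence q_read_batch increment by 100;

   -- each nextval() hands a worker the start of a fresh block of 100 ids,
   -- i.e. 1, 101, 201, ..., so the claimed block is [start_id, start_id + 99]
   with n as (select nextval('q_read_batch') as start_id)
   select q.id, q.msg
     from q, n
    where q.id >= n.start_id
      and q.id <  n.start_id + 100;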

My current setup can have multiple runs configured at a time (I have had
several dozen queued, in one case), each with its own priority, and runs
might, at least theoretically, be configured in parallel (even the tasks
within a single run are set up in parallel). That means the ids handed out
for a given run would not be contiguous, even though a sequence is used for
the id field of each row, unless ids could somehow be allocated per
run/project. Multiple sequence objects, one for each run, might be an
option (a rough sketch of what I mean is below), but I don't like that
possibility. And, as I mentioned elsewhere in the thread, I might make the
queuing a bit more complex, which would complicate such a scheme even
further.
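
To illustrate, the per-run variant would look roughly like this
(hypothetical; the table layout and all names are my own invention):

   -- ids are per-run, so the table is keyed on (run_id, seq_no)
   create table q (run_id integer not null,
                   seq_no integer not null,
                   msg    integer,
                   primary key (run_id, seq_no));

   -- one pair of sequences per run, e.g. for run 42:
   create sequence q_add_run_42;
   create sequence q_read_run_42;

   -- enqueue a job into run 42:
   insert into q(run_id, seq_no, msg)
        values (42, nextval('q_add_run_42'), 20);

   -- a worker on run 42 claims its next job:
   select msg from q
    where run_id = 42
      and seq_no = (select nextval('q_read_run_42'));

Every new run would need its own create sequence (and cleanup afterwards),
which is the part I am not fond of.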

So, AFAICT, I am afraid it would not work in the general case for my
project :(
However, it might be useful in somebody else's project :)

--
Sincerely,
Yngve N. Pettersen
********************************************************************
Senior Developer             Email: yngve@opera.com
Opera Software ASA                   http://www.opera.com/
Phone:  +47 23 69 32 60              Fax:    +47 23 69 24 01
********************************************************************
