Re: Assigning data-entry tasks to multiple concurrent clients - Mailing list pgsql-sql

From Scott Marlowe
Subject Re: Assigning data-entry tasks to multiple concurrent clients
Date
Msg-id dcc563d10906010255g63f48dds15ab4359c3a1be42@mail.gmail.com
Whole thread Raw
In response to Assigning data-entry tasks to multiple concurrent clients  (Jamie Tufnell <diesql@googlemail.com>)
List pgsql-sql
On Sun, May 31, 2009 at 9:54 PM, Jamie Tufnell <diesql@googlemail.com> wrote:
> BEGIN;
> SELECT * FROM records
> WHERE in_edit_queue AND id NOT IN (
>   SELECT record_id FROM locked_records
>   WHERE locked_since < now() + interval '5 minutes')
> LIMIT 1;
>
> INSERT INTO locked_records (record_id, locked_since) VALUES (?, now());
> COMMIT;

There's a race condition here but a unique constraint on record_id
will take care of that, as long as you catch the error and retry.

> Then to save (first-in wins is acceptable for this environment):
>
> BEGIN;
> UPDATE records SET in_edit_queue = false WHERE id = ? AND in_edit_queue =
> true;
> DELETE FROM locked_records WHERE record_id = ?;
> COMMIT;
>
> Is this a sane approach?  Is there a better way to do this with PostgreSQL?

It'll work.  The key to any kind of system like this is monitoring the
progress for things that get stuck / fail to be processed and running
them a second time if need be.  I had a system to process 1M rows at a
time from an 880M row db, and I used a secondary sequence and recid/1M
to partition it out.  So, the next job up grabs a sequence id from t
secondary sequence, which matches the record(or set) to be processed.
With that method there's no locking or anything needed, and no one
needs to "check out" the records, because incrementing the secindary
sequence is in fact checking them out.  Just check the finished table
to see if there's any holes and if there are put those jobs back in
the queue by simply updating their id to the next value for the
porimary id sequence.

Sequences can be an elegant way of assigning jobs to multiple threads
without locking issues.


pgsql-sql by date:

Previous
From: Jasen Betts
Date:
Subject: Re: Assigning data-entry tasks to multiple concurrent clients
Next
From: Bryce Nesbitt
Date:
Subject: Avoiding "will create implicit index" NOTICE