Re: Serialization, Locking...implement processing Queue with a table - Mailing list pgsql-general
| From | D. Dante Lorenso | 
|---|---|
| Subject | Re: Serialization, Locking...implement processing Queue with a table | 
| Date | |
| Msg-id | 08b801c319f4$7ff58fa0$1564a8c0@ROMULUS Whole thread Raw | 
| In response to | Serialization, Locking...implement processing Queue with a table ("D. Dante Lorenso" <dante@lorenso.com>) | 
| List | pgsql-general | 
Ah.  Very good, Manfred.  I see what you are saying now.
This will indeed work.  And it works without locking.
I like it.  Hmmm... So, to convert your SQL back into
pseudo-code of my own ... You are saying it works something
like this:
---------- 8< -------------------- 8< ----------
FUNCTION reserve_job
BEGIN
    SELECT and LOCK row_id matching our criteria. (SELECT FOR UPDATE)
    IF (row_id was found) THEN
        RESERVE row_id (UPDATE)
        RETURN (row_id) -- done
    ELSE
        Ask "Are you Sure?" there are no rows matching our criteria?
        IF (certainly no row_id exists) THEN
            RETURN (0) -- no row ID exists
        ELSE
            RETURN reserve_job -- recursive call
        END IF
    END IF
END
---------- 8< -------------------- 8< ----------
OK, I see this SHOULD INDEED work.  So, now my question is
about costs.  Assuming both approaches (test/retry VS lock/wait)
yield a functional result.  Which is best?  Which is most
efficient.  Here are some pros and cons:
LOCK/WAIT
--------------------------------
PROS
    - fewer queries made: only one lock and one select per call
    - easy to implement
    - access to function is granted in order requested (per
      PostgreSQL handling of locks - expectation is that no
      starvation should occur)
CONS
    - requires setting locks in application code
      which could be forgotten by app developers
    - locks are placed outside function blocking entire function
      which may have code that might be safely run concurrently
TEST/RETRY
--------------------------------
PROS
    - in theory, code SHOULD be entirely contained within a
      single stored procedure (less application coding needed)
    - no locks needed (faster execution?)
CONS
    - requires extra queries to determine empty queue
    - may starve if a single process continues to grab the same
      row as other processes
    - need recursive calls in PL/PGSQL?
I am going to try to implement your suggestion and see
what I get in comparison to what I am seeing now with my
LOCK/WAIT code.
Dante
D. Dante Lorenso
dante@direct2prospect.com
972-333-4139
> On Wed, 14 May 2003 01:10:39 -0500, "D. Dante Lorenso"
> <dante@lorenso.com> wrote:
> >> How do you distinguish between (i) and (ii)?  Just do
> >> SELECT job_id FROM job_table
> >>     WHERE processor_id = 0 LIMIT 1;
> >> If this returns 0 rows, you have (i).
> >> If it returns one row, you have (ii).
> >
> >You can't do it this way
>
> Oh.
>
> > because if this select IS successful,
> >you'll still need to LOCK the row for the update.
>
> That's exactly the reason why I told you ...
>
> |(ii)  The row has been reserved by another transaction running at the
> |same time.  In this case, restart at SELECT FOR UPDATE.
>                                    ^^
>                                  not after!
>
> If there is a row satisfying the WHERE clause but SELECT FOR UPDATE
> does not return it, this is the effect of a race condition:
>
> SELECT xmax,* FROM job;
>  xmax | id | pr
> ------+----+----
>     0 |  2 |  0
>     0 |  3 |  1
>     0 |  1 |  2
>     0 |  4 |  0
>
> Session 1                    Session 2
>
> BEGIN;
> SELECT xmax,* FROM job
>  WHERE pr = 0 FOR UPDATE LIMIT 1;
>  xmax | id | pr
> ------+----+----
>     0 |  2 |  0
> (1 row)
>
> select xmax,* FROM job
>  WHERE id = 2;
>   xmax | id | pr
> -------+----+----
>  58634 |  2 |  0
> (1 row)
>                              BEGIN;
>                              select xmax,* FROM job
>                               WHERE pr = 0 LIMIT 1;
>                               xmax  | id | pr
>                              -------+----+----
>                               58634 |  2 |  0
>                              (1 row)
>
>                              SELECT xmax,* FROM job
>                               WHERE pr = 0 FOR UPDATE LIMIT 1;
>                              -- waits, because the row with id=2
>                              -- satisfies the WHERE clause but
>                              -- is locked by transaction 58634 ...
> UPDATE job SET pr = 1
>  WHERE id = 2;
>                              -- The same would happen, if we did the
>                              -- SELECT FOR UPDATE here (after the
>                              -- UPDATE in the other session), because
>                              -- our *visible* version of the row
>                              -- still satisfies the WHERE clause.
>
> select xmax,* FROM job
>  WHERE id = 2;
>  xmax | id | pr
> ------+----+----
>     0 |  2 |  1
> (1 row)
> -- xmax = 0 because we see
> -- the newly inserted tuple
>
> COMMIT;
>                              -- continues ...
>                               xmax | id | pr
>                              ------+----+----
>                              (0 rows)
>                              -- because the row this SELECT was about
>                              -- to return does not satisfy pr = 0
>                              -- any more
>
>                              SELECT xmax,* FROM job
>                               WHERE pr = 0 LIMIT 1;
>                               xmax | id | pr
>                              ------+----+----
>                                  0 |  4 |  0
>                              (1 row)
>                              -- but there is another row, so ...
>
>                              SELECT xmax,* FROM job
>                               WHERE pr = 0 FOR UPDATE LIMIT 1;
>                               xmax | id | pr
>                              ------+----+----
>                                  0 |  4 |  0
>                              (1 row)
>                              -- does not necessarily return the same
>                              -- id as the previous SELECT
>
>                              UPDATE ...;
>                              COMMIT;
>
> HTH.
> Servus
>  Manfred
>
		
	pgsql-general by date: