Thread: Serialization, Locking...implement processing Queue with a table

From: "D. Dante Lorenso"
I want to implement a processing Queue with records
in a table.  This means that I'd like to have multiple
processors performing operations against a PostgreSQL
database but have those processors only operating on
one row each.  Ideally, I'd have a PL/PGSQL function
that uniquely updates (reserves a row for operation by
a given process or thread) and returns only one row
at a time from a table like this:

    int row_id = reserve_next_row(int processor_id);

I'm tripping all over myself with transaction isolation
modes, transactions, pl/pgsql and basic table locks.  The
behavior I expect is not happening.

How should I go about implementing a synchronized process id
queue that will select one unique row from a table at a time
and make the selection be safe with concurrent accesses?

To get more technical with my attempts, I am using Java/JDBC,
PostgreSQL 7.3.2, PL/PGSQL, and have the following code:

---------- 8< -------------------- 8< -------------------- 8< ----------
CREATE OR REPLACE FUNCTION reserve_next_import (bigint) RETURNS bigint AS'
DECLARE
    processor_id ALIAS FOR $1;
    my_import_id BIGINT;
    my_number INTEGER;
    my_import_state CHAR;
BEGIN
    -- Gotta get a handle on this semaphore before you can get in here
    LOCK TABLE import IN EXCLUSIVE MODE;

    my_import_id := -1;

    -- Find the import ID we wish to reserve and get a lock on that row
    SELECT import_id, import_state INTO my_import_id, my_import_state
    FROM import
    WHERE import_state = ''Q''
    AND import_processor_id IS NULL
    ORDER BY import_id
    LIMIT 1;
    --FOR UPDATE; -- set this, and processes hit the NOT FOUND below

    IF NOT FOUND THEN
        RAISE NOTICE ''No Items left in the Queue.'';
        RETURN (-1);
    END IF;

    -- now go reserve the right to process that record
    UPDATE import SET
        import_processor_id = processor_id,
        import_prc_start = NULL,
        import_prc_end = NULL,
        import_state = ''R''
    WHERE import_id = my_import_id;

    -- return the ID for us to process...
    RETURN (my_import_id);
END;
'LANGUAGE 'plpgsql';
---------- 8< -------------------- 8< -------------------- 8< ----------

Ideally, I could call this function from psql or JDBC
and have it block all access to other processes or threads
by calling 'LOCK TABLE import IN EXCLUSIVE MODE' and make
sure that only one process gets inside the function at a
time (like a Java 'synchronized' block).

Well, problem is that my psql instances are still seeing
different views of the data and even though one process
reserves an ID and updates the state to 'R', the next process
doesn't see the update (if it has already started the
function and is waiting at the lock) and so it will reserve
the same ID in the SELECT.

I attempted to fix this by using SELECT FOR UPDATE on my
select statement, but problem there is that then the SELECT
from the second process fails and a -1 is returned by my
function.

OK, so you see what I want?  I want a function that locks
all access to a table, reserves an ID and then releases the
lock.  I then want any other processes that are waiting for
that lock to immediately see the updated information as they
process the code inside the function.

If I set TRANSACTION ISOLATION LEVEL to SERIALIZABLE in JDBC, then
calls will fail if another process is inside the function already.
I don't want this either.  I want processes to WAIT at the lock
then get in and successfully get their own ID.

How is this done?  Anybody have an answer for me?  Am I going
about this the right way?

Much help appreciated!

Dante

D. Dante Lorenso
dante@lorenso.com
972-333-4139


"D. Dante Lorenso" <dante@lorenso.com> writes:
> How should I go about implementing a synchronized process id
> queue that will select one unique row from a table at a time
> and make the selection be safe with concurrent accesses?

You can find various discussions of this in the archives, but a
reasonable way to proceed is:

1. The table of pending or in-process jobs has a column "processor_id"
that is zero for pending jobs and equal to the (unique) processor number
for active jobs.  (Assume for the moment that completed jobs are removed
from the table entirely.)

2. When idle, you try to reserve a job like so:

    BEGIN;
    SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
    SELECT job_id, ... FROM job_table
        WHERE processor_id = 0 LIMIT 1 FOR UPDATE;

The SELECT has three possible outcomes:

2a: One row is returned.  You do

    UPDATE job_table SET processor_id = $me
        WHERE job_id = $jobid;
    COMMIT;

and then go about executing the job.  When done, delete the row from
job_table and try to get another one.

2b: No row is returned: no jobs are pending.  Commit your transaction,
sleep for an appropriate delay period, and try again.

2c: You get a "can't serialize" failure.  This will happen if two
processors try to reserve the same row at the same time.  In this case,
roll back your transaction, sleep for a short interval (maybe a few
msec) and try again.  You don't want to sleep as long as normal in this
case, since there might be another available job.

(Note that you *cannot* do this in a plpgsql function, since it cannot
start or commit a transaction; these commands have got to be directly
issued by the application.)
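Tom's three-outcome loop lives entirely on the application side. As a rough, self-contained Java sketch of that control flow (the `JobStore` interface, the `Outcome` enum, and the scripted queue are all stand-ins invented here; a real worker would be issuing the BEGIN / SELECT FOR UPDATE / UPDATE / COMMIT statements over JDBC):

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

public class ReserveLoop {
    // The three possible outcomes of the SELECT ... FOR UPDATE under SERIALIZABLE.
    enum Outcome { ROW, NO_ROW, SERIALIZATION_FAILURE }

    // Stand-in for one database round trip (BEGIN; SET TRANSACTION ISOLATION
    // LEVEL SERIALIZABLE; SELECT ... FOR UPDATE; then UPDATE/COMMIT or ROLLBACK).
    interface JobStore { Outcome tryReserve(int processorId); }

    // Runs the loop until the store reports an empty queue; returns the
    // sequence of events for inspection.
    static List<String> run(JobStore store, int processorId) {
        List<String> events = new ArrayList<>();
        while (true) {
            Outcome o = store.tryReserve(processorId);
            if (o == Outcome.ROW) {                          // 2a: UPDATE + COMMIT, run the job
                events.add("reserved");
            } else if (o == Outcome.SERIALIZATION_FAILURE) { // 2c: ROLLBACK, short sleep, retry
                events.add("retry-after-conflict");
            } else {                                         // 2b: COMMIT, long sleep
                events.add("empty");
                return events;
            }
        }
    }

    public static void main(String[] args) {
        Queue<Outcome> script = new ArrayDeque<>(List.of(
                Outcome.ROW, Outcome.SERIALIZATION_FAILURE, Outcome.ROW, Outcome.NO_ROW));
        List<String> events = run(pid -> script.poll(), 42);
        System.out.println(events); // [reserved, retry-after-conflict, reserved, empty]
    }
}
```

The sleeps are elided here; per 2b/2c the empty-queue sleep should be much longer than the conflict backoff.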


Assuming that there aren't a vast number of pending jobs at any time,
this should work pretty well without even bothering with an index on
job_table.  You will want to vacuum it often though (at least every few
hundred job completions, I'd think).

Now, what if you wanted to remember completed jobs?  I'd actually
recommend transferring the records of completed jobs to a different
table.  But if you really want to keep them in the same table, maybe
add a boolean "completed" field, and make the initial SELECT be

    SELECT job_id, ... FROM job_table
        WHERE processor_id = 0 AND NOT completed LIMIT 1 FOR UPDATE;

Now you *will* need an index to keep things speedy.  I'd try a partial
index on processor_id with condition "NOT completed".  You'll still
need frequent vacuums.

            regards, tom lane


Re: Serialization, Locking...implement processing Queue with a table

From: "D. Dante Lorenso"
Tom,

Thanks for the excellent reply.  I was tossing solutions
back and forth and came across this one, but I don't like
the idea of failing on a transaction and having to retry
it after a delay, so I've come up with this...

What do you think of my alternative solution?

In Java, I have a function like this which begins a transaction,
locks a dummy table exclusively, and then runs the stored
procedure to reserve the next record for processing:

---------- 8< -------------------- 8< --------------------
public int reserveQueuedImport(int pid) throws SQLException {
    Connection conn = LeadDBConnection.getConnection();

    // Reserve an import (for processing), and return its ID.
    PreparedStatement pstmt =
        conn.prepareStatement(
            ""
                + "BEGIN TRANSACTION; "
                + "LOCK TABLE import_lock IN EXCLUSIVE MODE; "
                + "SELECT reserve_next_import(?) AS import_id; "
                + "COMMIT; ");
    pstmt.setInt(1, pid);
    ResultSet rec = pstmt.executeQuery();

    // advance to the first row and read its single column
    rec.next();
    return (rec.getInt(1));
}


---------- 8< -------------------- 8< --------------------

Meanwhile, the PL/PGSQL stored procedure looks like this:

---------- 8< -------------------- 8< --------------------
CREATE OR REPLACE FUNCTION reserve_next_import (bigint)
RETURNS bigint AS'
DECLARE
    processor_id ALIAS FOR $1;
    my_import_id BIGINT;
BEGIN
    -- initialize the id
    my_import_id := -1;

    -- Find the import ID we wish to reserve and get a lock on that row
    SELECT import_id INTO my_import_id
    FROM import
    WHERE import_state = ''Q''
    AND import_processor_id IS NULL
    ORDER BY import_id
    LIMIT 1
    FOR UPDATE;

    -- abort if there are no queued rows
    IF NOT FOUND THEN
        RETURN (-1);
    END IF;

    -- now go reserve the record with our processor id
    UPDATE import SET
        import_processor_id = processor_id,
        import_prc_start = NULL,
        import_prc_end = NULL,
        import_state = ''R''
    WHERE import_id = my_import_id;

    -- this is the row we reserved...
    RETURN (my_import_id);
END;
'LANGUAGE 'plpgsql';
---------- 8< -------------------- 8< --------------------

What I've done is used the
'LOCK TABLE import_lock IN EXCLUSIVE MODE;' to create a
'synchronized' block around the code which reserves the
item in the queue.  This way, only one application or
thread can run the PL/PGSQL function at a given time.
There will be BLOCKING for applications that sit at the
LOCK call, but that's more desirable than the Fail/Retry
approach, eh?

Can you confirm that this solution will perform as I expect
while keeping the transaction isolation level at a
READ COMMITTED mode instead of SERIALIZABLE?

Oh, yeah, and as a note, the only purpose for the
'import_lock' table is to provide an object to LOCK on for
this code.  This table is empty and is not used for any other
purpose.  Is there any other 'lighter' objects I can create
or lock on in PostgreSQL than a table like this?
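A loose in-memory Java analogue of what this pattern buys (no database involved; the class name, `nextId`, and the thread counts are invented purely for illustration): with one exclusive lock around the select-and-mark step, concurrent workers can never reserve the same id, which is exactly the 'synchronized' guarantee the LOCK TABLE is emulating:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;

public class SynchronizedReserve {
    private long nextId = 1;  // stand-in for the lowest queued import_id

    // Plays the role of the PL/pgSQL function body: the monitor guarantees
    // only one caller at a time can select-and-mark the next queued row,
    // just as LOCK TABLE import_lock IN EXCLUSIVE MODE does server-side.
    synchronized long reserveNext() {
        return nextId++;      // SELECT next 'Q' row; UPDATE it to state 'R'
    }

    public static void main(String[] args) throws InterruptedException {
        SynchronizedReserve queue = new SynchronizedReserve();
        List<Long> reserved = Collections.synchronizedList(new ArrayList<>());
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < 4; t++) {
            Thread worker = new Thread(() -> {
                for (int i = 0; i < 250; i++) reserved.add(queue.reserveNext());
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) worker.join();
        // No id was handed out twice, no matter how the threads interleaved.
        System.out.println(new HashSet<>(reserved).size()); // 1000
    }
}
```

As for a lighter object to lock on: later PostgreSQL releases added advisory locks (the pg_advisory_lock family) for exactly this purpose; in the 7.3 era, I believe the closest equivalent was the contrib userlock module.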

Dante

D. Dante Lorenso
dante@lorenso.com
972-333-4139




Re: Serialization, Locking...implement processing Queue with a table

From: Manfred Koizar
On Mon, 12 May 2003 14:28:18 -0500, "D. Dante Lorenso"
<dante@lorenso.com> wrote:
>BLOCKING [is] more desirable than the Fail/Retry
>approach, eh?
>
>Can you confirm that this solution will perform as I expect
>while keeping the transaction isolation level at a
>READ COMMITTED mode instead of SERIALIZABLE?

Starting with Tom's suggestion I played around with READ COMMITTED.

>>2. When idle, you try to reserve a job like so:
>>
>>    BEGIN;
    -- TRANSACTION ISOLATION LEVEL is READ COMMITTED by default
>>    SELECT job_id, ... FROM job_table
>>        WHERE processor_id = 0 LIMIT 1 FOR UPDATE;

The SELECT may be blocked for a while and has two possible outcomes:

>>2a: One row is returned.  You do
>>
>>    UPDATE job_table SET processor_id = $me
>>        WHERE job_id = $jobid;
>>    COMMIT;
>>
>>and then go about executing the job.
>>
>>2b: No row is returned:
This can have one of two reasons:

(i)  no jobs are pending.  Commit your transaction,
>>sleep for an appropriate delay period, and try again.

(ii)  The row has been reserved by another transaction running at the
same time.  In this case, restart at SELECT FOR UPDATE.  You can stay
in the same transaction.  And you don't need to sleep (this has
already happened while SELECT FOR UPDATE was blocked), unless there
are lots of job processors in which case it might be advisable to
sleep for a short random time.

How do you distinguish between (i) and (ii)?  Just do

    SELECT job_id FROM job_table
        WHERE processor_id = 0 LIMIT 1;

If this returns 0 rows, you have (i).
If it returns one row, you have (ii).

I didn't try, but you should be able to do this in a function.  Your
function looks like a good start; you just have to add a retry loop
and a test for (i) vs. (ii).

>CREATE OR REPLACE FUNCTION reserve_next_import (bigint)
>RETURNS bigint AS'
>DECLARE
>    processor_id ALIAS FOR $1;
>    my_import_id BIGINT;
>BEGIN
>    -- initialize the id
>    my_import_id := -1;
>
-- start of loop here!
>    -- Find the import ID we wish to reserve and get a lock on that row
>    SELECT import_id INTO my_import_id
>    FROM import
>    WHERE import_state = ''Q''
>    AND import_processor_id IS NULL
>    ORDER BY import_id
>    LIMIT 1
>    FOR UPDATE;
>
>    -- abort if there are no queued rows
>    IF NOT FOUND THEN
SELECT ...;  -- without FOR UPDATE
IF NOT FOUND THEN
>        RETURN (-1);
ELSE
    continue at top of loop
END IF;
>    END IF;
>
>    -- now go reserve the record with our processor id
>    UPDATE import SET
>        import_processor_id = processor_id,
>        import_prc_start = NULL,
>        import_prc_end = NULL,
>        import_state = ''R''
>    WHERE import_id = my_import_id;
>
>    -- this is the row we reserved...
>    RETURN (my_import_id);
>END;
>'LANGUAGE 'plpgsql';

Servus
 Manfred


Re: Serialization, Locking...implement processing Queue with a table

From: Manfred Koizar
On Wed, 14 May 2003 04:36:35 -0500, "D. Dante Lorenso"
<dante@lorenso.com> wrote:
>> FUNCTION reserve_job
>> BEGIN
>>     SELECT and LOCK row_id matching our criteria. (SELECT FOR UPDATE)
>>
>>     IF (row_id was found) THEN
>>         RESERVE row_id (UPDATE)
>>         RETURN (row_id) -- done
>>     ELSE
>>         Ask "Are you Sure?" there are no rows matching our criteria?
>>         IF (certainly no row_id exists) THEN
>>             RETURN (0) -- no row ID exists
>>         ELSE
>>             RETURN reserve_job -- recursive call
>>         END IF
>>     END IF
>> END

What I had in mind was more like

FUNCTION reserve_job
BEGIN
    LOOP
        SELECT ... FOR UPDATE;
        IF (row_id was found) THEN
            RESERVE row_id (UPDATE);
            RETURN (row_id); -- done
        ELSE
            SELECT ...; -- without FOR UPDATE
            IF (certainly no row_id exists) THEN
                RETURN (0) -- no row ID exists
            -- ELSE
                -- continue loop
            END IF;
        END IF;
    END LOOP;
END;

>Well, DAMNIT, this doesn't work.
>
>The problem is that this function does not see the COMMITTED
>data from other functions.  I've updated my code to the following:

You're right :-(  I did some more tests and ended up with (note,
however, that I have different names and data types):

CREATE OR REPLACE FUNCTION get_next_job (int) RETURNS int AS '
DECLARE
    in_pid ALIAS FOR $1;
    my_reserved_id int;
BEGIN
    -- Find the ID we wish to reserve and get a lock on that row
    SELECT id INTO my_reserved_id
    FROM job
    WHERE pr = 0
    ORDER BY id
    LIMIT 1
    FOR UPDATE;

    -- abort if there are no queued rows
    IF NOT FOUND THEN
        -- check again ...
        SELECT id INTO my_reserved_id
        FROM job
        WHERE pr = 0
        ORDER BY id
        LIMIT 1;

        IF NOT FOUND THEN
            RETURN (-1);
        ELSE
            RAISE NOTICE ''GOT LOCKED IN RACE [%]'', my_reserved_id;
            RETURN (0);
            -- retry
        END IF;
    ELSE
        -- now go reserve the record with our processor id
        UPDATE job SET pr = in_pid
        WHERE id = my_reserved_id;

        -- this is the row we reserved...
        RETURN (my_reserved_id);
    END IF;
END;
' LANGUAGE 'plpgsql';

So the caller has to cooperate a little:

    while (...) {
        job_id = get_next_job(my_proc_id);
        if (job_id < 0) {
            sleep(...);
        } else if (job_id == 0) {
            /* retry */
        } else {
            process(job_id);
        }
    }

Make sure that get_next_job() and process() are not called within the
same transaction.
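The cooperation contract above (negative means sleep, zero means retry immediately, positive means process) can be sketched as a small Java driver. This is only an illustration: `drain` and the scripted queue stand in for the real JDBC call to get_next_job() and the processing step:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.function.IntSupplier;

public class JobDriver {
    // Drains jobs following Manfred's contract:
    //   < 0  -> queue empty: stop (a real worker would sleep and poll again)
    //   == 0 -> lost a race: retry immediately, no sleep needed
    //   > 0  -> a reserved job id: process it
    // Returns the ids that were processed, in order.
    static List<Integer> drain(IntSupplier nextJob) {
        List<Integer> processed = new ArrayList<>();
        while (true) {
            int jobId = nextJob.getAsInt();   // SELECT get_next_job(my_proc_id)
            if (jobId < 0) return processed;  // (i) nothing pending
            if (jobId == 0) continue;         // (ii) raced another worker: retry
            processed.add(jobId);             // run the job outside this transaction
        }
    }

    public static void main(String[] args) {
        // Scripted results: one job, a lost race, another job, then an empty queue.
        Queue<Integer> script = new ArrayDeque<>(List.of(126, 0, 129, -1));
        System.out.println(drain(script::poll)); // [126, 129]
    }
}
```

Note the heeded warning from above: each call to get_next_job() would be its own committed transaction, with the processing done outside it.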

>'LANGUAGE 'plpgsql' STABLE CALLED ON NULL INPUT SECURITY INVOKER;

BTW, I wouldn't call this function STABLE.

You said in your other message:
|PROS
|    - in theory, code SHOULD be entirely contained within a
|      single stored procedure (less application coding needed)
|    - no locks needed (faster execution?)
|CONS
|    - requires extra queries to determine empty queue
|    - may starve if a single process continues to grab the same
|      row as other processes

Unlikely, unless you have really lots of processes.  If two processes
see the same request, only one of them can grab it.  This process will
be busy for a while processing the request, while the other process
will immediately retry and grab the next open request.

|    - need recursive calls in PL/PGSQL?

Better use a loop (in the caller).

IMHO the biggest PRO is:
+ You can put additional conditions into the WHERE clause (e.g.
queue_no = 42) and processors looking for different kinds of requests
will not block each other.
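A toy Java rendering of that PRO (names invented; a real implementation keeps the partitioning in the SQL WHERE clause, not in application memory): workers drawing from different queue numbers operate on disjoint state, so they never wait on each other:

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

public class PartitionedQueues {
    // One independent id counter per queue_no; reserving from queue 42 never
    // touches (or blocks on) the state of queue 7.  This mirrors adding
    // "AND queue_no = ?" to the WHERE clause of the reserving SELECT.
    private final Map<Integer, AtomicLong> queues = new ConcurrentHashMap<>();

    long reserveNext(int queueNo) {
        return queues.computeIfAbsent(queueNo, q -> new AtomicLong()).incrementAndGet();
    }

    public static void main(String[] args) {
        PartitionedQueues pq = new PartitionedQueues();
        System.out.println(List.of(pq.reserveNext(42), pq.reserveNext(7), pq.reserveNext(42)));
        // [1, 1, 2] -- the two queues hand out ids independently
    }
}
```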

Servus
 Manfred

Re: Serialization, Locking...implement processing Queue with a table

From: "D. Dante Lorenso"
> ---------- 8< -------------------- 8< ----------
> FUNCTION reserve_job
> BEGIN
>     SELECT and LOCK row_id matching our criteria. (SELECT FOR UPDATE)
>
>     IF (row_id was found) THEN
>         RESERVE row_id (UPDATE)
>         RETURN (row_id) -- done
>     ELSE
>         Ask "Are you Sure?" there are no rows matching our criteria?
>         IF (certainly no row_id exists) THEN
>             RETURN (0) -- no row ID exists
>         ELSE
>             RETURN reserve_job -- recursive call
>         END IF
>     END IF
> END
> I am going to try to implement your suggestion and see
> what I get in comparison to what I am seeing now with my
> LOCK/WAIT code.

Well, DAMNIT, this doesn't work.  The problem goes back
to whatever the heck is happening inside the PL/PGSQL
function?!

The problem is that this function does not see the COMMITTED
data from other functions.  I've updated my code to the following:

---------- 8< -------------------- 8< -------------------- 8< ----------
CREATE OR REPLACE FUNCTION "public"."subs_get_next_queued" (bigint) RETURNS
bigint AS'
DECLARE
    in_pid ALIAS FOR $1;
    my_reserved_id BIGINT;
BEGIN
    -- initialize the id
    my_reserved_id := -1;

    -- Find the ID we wish to reserve and get a lock on that row
    SELECT subs_id INTO my_reserved_id
    FROM subscription
    WHERE subs_start_bill_date <= now()
    AND subs_next_bill_date <= now()
    AND subs_processor_id IS NULL
    ORDER BY subs_id ASC
    LIMIT 1
    FOR UPDATE;

    -- abort if there are no queued rows
    IF NOT FOUND THEN
        -- check again to see if there is something I need to process...
        SELECT subs_id INTO my_reserved_id
        FROM subscription
        WHERE subs_start_bill_date <= now()
        AND subs_next_bill_date <= now()
        AND subs_processor_id IS NULL
        ORDER BY subs_id ASC
        LIMIT 1;

        IF NOT FOUND THEN
            RETURN (-1);
        ELSE
            RAISE NOTICE ''GOT LOCKED IN RACE [%]'', my_reserved_id;
            SELECT subs_get_next_queued(in_pid) INTO my_reserved_id;
            RETURN (my_reserved_id);
        END IF;
    END IF;

    -- now go reserve the record with our processor id
    UPDATE subscription SET
        subs_processor_id = in_pid
    WHERE subs_id = my_reserved_id;

    -- this is the row we reserved...
    RETURN (my_reserved_id);
END;
'LANGUAGE 'plpgsql' STABLE CALLED ON NULL INPUT SECURITY INVOKER;
---------- 8< -------------------- 8< -------------------- 8< ----------

Here is what is happening... Process one is started and is humming along
just fine.  Then, I kick process 2 into gear and as soon as a collision
occurs, process 1 spins out of control and goes nuts in an infinite loop.
Apparently, it does not see the changes of process 2 even though they have
already been committed.

PROCESS 1                                  PROCESS 2
 subs_get_next_queued
----------------------
                  126
(1 row)

 subs_get_next_queued
----------------------
                  129
(1 row)

                                            subs_get_next_queued
                                           ----------------------
                                                             140
                                           (1 row)

NOTICE:  GOT LOCKED IN RACE [140]
NOTICE:  GOT LOCKED IN RACE [140]
NOTICE:  GOT LOCKED IN RACE [140]
NOTICE:  GOT LOCKED IN RACE [140]
NOTICE:  GOT LOCKED IN RACE [140]
NOTICE:  GOT LOCKED IN RACE [140]
NOTICE:  GOT LOCKED IN RACE [140]
...
(continues forever and takes down
the PostgreSQL server.  OUCH!)
                                            subs_get_next_queued
                                           ----------------------
                                                             176
                                           (1 row)

                                            subs_get_next_queued
                                           ----------------------
                                                             182
                                           (1 row)

                                           (Continues until PostgreSQL dies
                                           because of other processes' doings)


Can someone please explain what is happening to the PL/PGSQL function
in regards to commit visibility and transactions?

Dante

D. Dante Lorenso
dante@lorenso.com
972-333-4139




Re: Serialization, Locking...implement processing Queue with a table

From: "D. Dante Lorenso"
> (i)  no jobs are pending.  Commit your transaction,
> >>sleep for an appropriate delay period, and try again.
> (ii)  The row has been reserved by another transaction running at the
> same time.  In this case, restart at SELECT FOR UPDATE.  You can stay
> in the same transaction.  And you don't need to sleep (this has
> already happened while SELECT FOR UPDATE was blocked), unless there
> are lots of job processors in which case it might be advisable to
> sleep for a short random time.
>
> How do you distinguish between (i) and (ii)?  Just do
> SELECT job_id FROM job_table
>     WHERE processor_id = 0 LIMIT 1;
> If this returns 0 rows, you have (i).
> If it returns one row, you have (ii).

You can't do it this way because if this select IS successful,
you'll still need to LOCK the row for the update.  Any way you
try, you're in a race condition whereby another running thread
may lock and/or update that row after you select.  Remember, we
are in READ COMMITTED mode.  Without a lock on the row or a table,
your chances are that the row will be updated out from under you.

Here is a process that I've come up with that locks a bogus table
for me to guarantee that only one process has the right to enter
the stored procedure:

---------- 8< --------------------
BEGIN TRANSACTION;
LOCK TABLE proc_lock IN EXCLUSIVE MODE;
SELECT proc($process_id) AS RESULT;
COMMIT;
---------- 8< --------------------

If I wrap my stored procedure calls with a call like this, I can
guarantee that only one process gets INSIDE the procedure at a
time.  The "proc_lock" table can be any table and for my cases,
I use a table named the same as the table I'm going to be operating
on with _lock added to the end.  This way I don't have to actually
lock a table that is serving other selects by other apps that
don't care about concurrency locks for queue processing.

Now, my stored procedure can do any logic it wants and doesn't
have to worry about concurrency.  I tried to put the 'LOCK TABLE'
statement as the first line in the PL/PGSQL procedure, but it
didn't seem to work there.

    ex:

---------- 8< --------------------
CREATE OR REPLACE FUNCTION reserve_next_queued (bigint)
RETURNS bigint AS'
DECLARE
    processor_id ALIAS FOR $1;
    my_reserved_id BIGINT;
BEGIN
    LOCK TABLE proc_lock IN EXCLUSIVE MODE;
    ...
    RETURN (my_reserved_id);
END;
'LANGUAGE 'plpgsql';
---------- 8< --------------------

Apparently locks work in the stored proc but READ COMMITTED
does not?  For me, it looks like once the procedure has been
entered, postgres has taken a snapshot of the state of the database
BEFORE the procedure was called.  So, the lock blocks as it
should, but the subsequent select will see the OLD state and
not the updated COMMIT from the rows that were just updated by
the other thread that held the lock.

In other words, it looks like THIS is happening with
app (1) and app (2) trying the same stored proc call:

    (1) begin transaction
    (2) begin transaction
    (1) call PL/PGSQL function
    (2) call PL/PGSQL function
    (1) LOCK table <- no wait, lock granted immediately
    (2) LOCK table <- lock already held BLOCKED
    (1) process  PL/PGSQL function
    (1) exit PL/PGSQL function
    (1) UNLOCK table
    (2) LOCK table <- lock granted
    (2) process  PL/PGSQL function
    (1) COMMIT;
    (2) exit PL/PGSQL function
    (2) COMMIT;

Is that possible?  Can it be that the lock is being freed
before the COMMIT is called?  Or is the procedure somehow
not allowing my other apps to see the committed data
immediately?

Pulling this lock out of the PL/PGSQL function to the calling
transaction wrapper worked, though.

In the future, I'd like to see PL/PGSQL allow me to embed
transaction stuff like that.  For now, I think my current
solution is clean enough to use without having to write try/fail
routines.  This method tries, waits, then succeeds.  It'd have to
WAIT that long anyhow, but with this approach, the wait is handled
by PostgreSQL and not my app.

I'm interested in knowing if anyone sees a FLAW with this design.
From my tests, I have code that works exactly as designed but not
as cleanly as I'd like.  What are your thoughts?

Dante

D. Dante Lorenso
dante@direct2prospect.com
972-333-4139



Re: Serialization, Locking...implement processing Queue with a table

From: Manfred Koizar
On Wed, 14 May 2003 01:10:39 -0500, "D. Dante Lorenso"
<dante@lorenso.com> wrote:
>> How do you distinguish between (i) and (ii)?  Just do
>> SELECT job_id FROM job_table
>>     WHERE processor_id = 0 LIMIT 1;
>> If this returns 0 rows, you have (i).
>> If it returns one row, you have (ii).
>
>You can't do it this way

Oh.

> because if this select IS successful,
>you'll still need to LOCK the row for the update.

That's exactly the reason why I told you ...

|(ii)  The row has been reserved by another transaction running at the
|same time.  In this case, restart at SELECT FOR UPDATE.
                                   ^^
                                 not after!

If there is a row satisfying the WHERE clause but SELECT FOR UPDATE
does not return it, this is the effect of a race condition:

SELECT xmax,* FROM job;
 xmax | id | pr
------+----+----
    0 |  2 |  0
    0 |  3 |  1
    0 |  1 |  2
    0 |  4 |  0

Session 1                    Session 2

BEGIN;
SELECT xmax,* FROM job
 WHERE pr = 0 FOR UPDATE LIMIT 1;
 xmax | id | pr
------+----+----
    0 |  2 |  0
(1 row)

select xmax,* FROM job
 WHERE id = 2;
  xmax | id | pr
-------+----+----
 58634 |  2 |  0
(1 row)
                             BEGIN;
                             select xmax,* FROM job
                              WHERE pr = 0 LIMIT 1;
                              xmax  | id | pr
                             -------+----+----
                              58634 |  2 |  0
                             (1 row)

                             SELECT xmax,* FROM job
                              WHERE pr = 0 FOR UPDATE LIMIT 1;
                             -- waits, because the row with id=2
                             -- satisfies the WHERE clause but
                             -- is locked by transaction 58634 ...
UPDATE job SET pr = 1
 WHERE id = 2;
                             -- The same would happen, if we did the
                             -- SELECT FOR UPDATE here (after the
                             -- UPDATE in the other session), because
                             -- our *visible* version of the row
                             -- still satisfies the WHERE clause.

select xmax,* FROM job
 WHERE id = 2;
 xmax | id | pr
------+----+----
    0 |  2 |  1
(1 row)
-- xmax = 0 because we see
-- the newly inserted tuple

COMMIT;
                             -- continues ...
                              xmax | id | pr
                             ------+----+----
                             (0 rows)
                             -- because the row this SELECT was about
                             -- to return does not satisfy pr = 0
                             -- any more

                             SELECT xmax,* FROM job
                              WHERE pr = 0 LIMIT 1;
                              xmax | id | pr
                             ------+----+----
                                 0 |  4 |  0
                             (1 row)
                             -- but there is another row, so ...

                             SELECT xmax,* FROM job
                              WHERE pr = 0 FOR UPDATE LIMIT 1;
                              xmax | id | pr
                             ------+----+----
                                 0 |  4 |  0
                             (1 row)
                             -- does not necessarily return the same
                             -- id as the previous SELECT

                             UPDATE ...;
                             COMMIT;

HTH.
Servus
 Manfred

"D. Dante Lorenso" <dante@lorenso.com> writes:
> You can't do it this way because if this select IS successful,
> you'll still need to LOCK the row for the update.

That's why you should be using SELECT FOR UPDATE.

            regards, tom lane
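[Editorial sketch, using Manfred's `job` table and `pr` column from the session transcript below: the point of FOR UPDATE is that the test and the lock happen in a single atomic step.]

```sql
-- Test and lock in one step: any row this returns is already
-- locked, so no other processor can reserve it out from under us.
SELECT id FROM job
 WHERE pr = 0
 FOR UPDATE LIMIT 1;
```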

Re: Serialization, Locking...implement processing Queue with a table

From
"D. Dante Lorenso"
Date:
Ah.  Very good, Manfred.  I see what you are saying now.
This will indeed work.  And it works without locking.
I like it.  Hmmm... So, to convert your SQL back into
pseudo-code of my own ... You are saying it works something
like this:

---------- 8< -------------------- 8< ----------
FUNCTION reserve_job
BEGIN
    SELECT and LOCK row_id matching our criteria. (SELECT FOR UPDATE)

    IF (row_id was found) THEN
        RESERVE row_id (UPDATE)
        RETURN (row_id) -- done
    ELSE
        Double-check ("Are you sure?"): do any rows match our criteria?
        IF (certainly no row_id exists) THEN
            RETURN (0) -- no row ID exists
        ELSE
            RETURN reserve_job -- recursive call
        END IF
    END IF
END

---------- 8< -------------------- 8< ----------
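[Editorial sketch of one pass of that pseudo-code as plain SQL, using Manfred's `job` table and `pr` column; per Tom's later point about snapshots, on 7.3 the loop itself has to be driven from client code rather than recursing inside PL/pgSQL.]

```sql
-- One pass of the TEST/RETRY loop, issued from the client in
-- READ COMMITTED mode.  Repeat the whole block until it either
-- reserves a row or confirms the queue is empty.
BEGIN;

-- Try to grab and lock one unreserved row.
SELECT id FROM job
 WHERE pr = 0
 FOR UPDATE LIMIT 1;

-- If a row came back: reserve it and we are done.
--     UPDATE job SET pr = 1 WHERE id = <that id>;  then COMMIT.
-- If no row came back, ask "are you sure?" without FOR UPDATE:
--     SELECT id FROM job WHERE pr = 0 LIMIT 1;
--     0 rows -> queue really is empty: COMMIT and return 0.
--     1 row  -> we lost a race with another processor:
--               COMMIT and restart from the top (the recursive call).
COMMIT;
```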

OK, I see this SHOULD INDEED work.  So, now my question is
about costs.  Assuming both approaches (test/retry vs. lock/wait)
yield a functional result, which is best?  Which is most
efficient?  Here are some pros and cons:

LOCK/WAIT
--------------------------------
PROS
    - fewer queries made: only one lock and one select per call
    - easy to implement
    - access to the function is granted in the order requested
      (per PostgreSQL's handling of locks; the expectation is
      that no starvation should occur)
CONS
    - requires setting locks in application code
      which could be forgotten by app developers
    - locks are placed outside the function, blocking the entire
      function even though some of its code might safely run concurrently

TEST/RETRY
--------------------------------
PROS
    - in theory, code SHOULD be entirely contained within a
      single stored procedure (less application coding needed)
    - no locks needed (faster execution?)
CONS
    - requires extra queries to determine empty queue
    - a process may starve if it repeatedly loses the race for
      the same row to other processes
    - need recursive calls in PL/PGSQL?

I am going to try to implement your suggestion and see
what I get in comparison to what I am seeing now with my
LOCK/WAIT code.
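[Editorial sketch of the LOCK/WAIT variant for comparison, again using Manfred's `job` table and `pr` column: the exclusive table lock serializes every reserver, so only one of them runs the SELECT/UPDATE pair at a time.]

```sql
-- LOCK/WAIT: one lock, one select, one update per call.
-- Every reserver queues up behind the table lock, so rows are
-- handed out in the order the lock requests arrive.
BEGIN;
LOCK TABLE job IN EXCLUSIVE MODE;   -- held until COMMIT

SELECT id FROM job
 WHERE pr = 0
 ORDER BY id
 LIMIT 1;                           -- no FOR UPDATE needed under the lock

-- UPDATE job SET pr = 1 WHERE id = <that id>;
COMMIT;                             -- next waiter acquires the lock
```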

Dante

D. Dante Lorenso
dante@direct2prospect.com
972-333-4139



> On Wed, 14 May 2003 01:10:39 -0500, "D. Dante Lorenso"
> <dante@lorenso.com> wrote:
> >> How do you distinguish between (i) and (ii)?  Just do
> >> SELECT job_id FROM job_table
> >>     WHERE processor_id = 0 LIMIT 1;
> >> If this returns 0 rows, you have (i).
> >> If it returns one row, you have (ii).
> >
> >You can't do it this way
>
> Oh.
>
> > because if this select IS successful,
> >you'll still need to LOCK the row for the update.
>
> That's exactly the reason why I told you ...
>
> |(ii)  The row has been reserved by another transaction running at the
> |same time.  In this case, restart at SELECT FOR UPDATE.
>                                    ^^
>                                  not after!
>
> If there is a row satisfying the WHERE clause but SELECT FOR UPDATE
> does not return it, this is the effect of a race condition:
>
> SELECT xmax,* FROM job;
>  xmax | id | pr
> ------+----+----
>     0 |  2 |  0
>     0 |  3 |  1
>     0 |  1 |  2
>     0 |  4 |  0
>
> Session 1                    Session 2
>
> BEGIN;
> SELECT xmax,* FROM job
>  WHERE pr = 0 FOR UPDATE LIMIT 1;
>  xmax | id | pr
> ------+----+----
>     0 |  2 |  0
> (1 row)
>
> select xmax,* FROM job
>  WHERE id = 2;
>   xmax | id | pr
> -------+----+----
>  58634 |  2 |  0
> (1 row)
>                              BEGIN;
>                              select xmax,* FROM job
>                               WHERE pr = 0 LIMIT 1;
>                               xmax  | id | pr
>                              -------+----+----
>                               58634 |  2 |  0
>                              (1 row)
>
>                              SELECT xmax,* FROM job
>                               WHERE pr = 0 FOR UPDATE LIMIT 1;
>                              -- waits, because the row with id=2
>                              -- satisfies the WHERE clause but
>                              -- is locked by transaction 58634 ...
> UPDATE job SET pr = 1
>  WHERE id = 2;
>                              -- The same would happen, if we did the
>                              -- SELECT FOR UPDATE here (after the
>                              -- UPDATE in the other session), because
>                              -- our *visible* version of the row
>                              -- still satisfies the WHERE clause.
>
> select xmax,* FROM job
>  WHERE id = 2;
>  xmax | id | pr
> ------+----+----
>     0 |  2 |  1
> (1 row)
> -- xmax = 0 because we see
> -- the newly inserted tuple
>
> COMMIT;
>                              -- continues ...
>                               xmax | id | pr
>                              ------+----+----
>                              (0 rows)
>                              -- because the row this SELECT was about
>                              -- to return does not satisfy pr = 0
>                              -- any more
>
>                              SELECT xmax,* FROM job
>                               WHERE pr = 0 LIMIT 1;
>                               xmax | id | pr
>                              ------+----+----
>                                  0 |  4 |  0
>                              (1 row)
>                              -- but there is another row, so ...
>
>                              SELECT xmax,* FROM job
>                               WHERE pr = 0 FOR UPDATE LIMIT 1;
>                               xmax | id | pr
>                              ------+----+----
>                                  0 |  4 |  0
>                              (1 row)
>                              -- does not necessarily return the same
>                              -- id as the previous SELECT
>
>                              UPDATE ...;
>                              COMMIT;
>
> HTH.
> Servus
>  Manfred
>


"D. Dante Lorenso" <dante@lorenso.com> writes:
> Can someone please explain what is happening to the PL/PGSQL function
> in regards to commit visibility and transactions?

Since the function is executed as part of a single client SELECT
command, the whole thing runs with a single snapshot; its view of
other process' actions on the database is frozen at the start of that
outer SELECT, even in READ COMMITTED mode.

There have been past discussions about whether this is a good idea
or not, but so far no consensus to change it, IIRC.  (Look for
"SetQuerySnapshot" in the archives, if you care.)

But at any rate, because of that behavior it's really impossible to
encapsulate the locking behavior entirely in a plpgsql function; the
wait-and-retry loop *must* be on the client side.  That leaves me
doubting that there's much value in trying to encapsulate any of it.
As Manfred noted, you may eventually want to add additional conditions
to the SELECT, and that's just lots easier to do if the logic is all
in one place rather than split between client and server code.

I still like my serializable-mode solution better than any of the
alternatives proposed so far.  Manfred's latest try doesn't solve
the race-condition problem, because the second SELECT will never
find anything the first one didn't find.  (It could if it were on
the client side... but you don't need two selects at all with the
serializable-mode solution.)

            regards, tom lane
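[Editorial sketch of the serializable-mode solution Tom refers to, reconstructed from his description with Manfred's `job`/`pr` names: no FOR UPDATE and no table lock; a competing reservation surfaces as a serialization error that the client catches and retries.]

```sql
-- Serializable mode: if two processors pick the same row, the
-- second UPDATE fails with "could not serialize access due to
-- concurrent update"; the client rolls back and retries the
-- whole transaction from BEGIN.
BEGIN;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

SELECT id FROM job
 WHERE pr = 0
 ORDER BY id
 LIMIT 1;            -- 0 rows -> queue empty: COMMIT and stop

-- UPDATE job SET pr = 1 WHERE id = <that id>;
--   on serialization error: ROLLBACK and retry from BEGIN
COMMIT;
```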

Re: Serialization, Locking...implement processing Queue with a table

From
"D. Dante Lorenso"
Date:
> Date: Sun, 16 Jun 2002 19:53:15 -0400
> So I've come around to agree with the position that Tatsuo and Hiroshi
> put forward in the thread mentioned above: plpgsql (and the other PL
> languages) need to do SetQuerySnapshot not only CommandCounterIncrement
> between user-supplied queries.
>
> Is anyone still unconvinced?  If not, I'll try to fix it sometime soon.
>
> As that thread pointed out, there also seem to be some problems with
> plpgsql not doing enough CommandCounterIncrements when it's executing
> already-planned queries; I'll take a look at that issue at the same
> time.
> regards, tom lane

Does this mean that you will be making changes so that READ
COMMITTED mode actually behaves like READ COMMITTED when inside
PL/PGSQL procedures? ... or am I pulling up an archive out of place?

I feel that my PL/PGSQL procedures are acting like they are pinned
inside a REPEATABLE READ mode instead of READ COMMITTED.  I'd prefer
to always have the transaction isolation level set to what I ask for
rather than this default behavior.

It seriously cripples the power of the PL/PGSQL language to have it
NOT operate this way.  The neat thing about stored procedures is
being able to embed common logic inside the database, ensure consistent
implementation across multiple client apps and client languages, and
minimize client/server traffic.  If you still do not have consensus,
maybe someone just needs to ask for a new poll?

Dante

D. Dante Lorenso
dante@lorenso.com
972-333-4139


----- Original Message -----
From: "Tom Lane" <tgl@sss.pgh.pa.us>
To: "D. Dante Lorenso" <dante@lorenso.com>
Cc: "Manfred Koizar" <mkoi-pg@aon.at>; <pgsql-general@postgresql.org>
Sent: Wednesday, May 14, 2003 9:53 AM
Subject: Re: [GENERAL] Serialization, Locking...implement processing Queue
with a table


> "D. Dante Lorenso" <dante@lorenso.com> writes:
> > Can someone please explain what is happening to the PL/PGSQL function
> > in regards to commit visibility and transactions?
>
> Since the function is executed as part of a single client SELECT
> command, the whole thing runs with a single snapshot; its view of
> other process' actions on the database is frozen at the start of that
> outer SELECT, even in READ COMMITTED mode.
>
> There have been past discussions about whether this is a good idea
> or not, but so far no consensus to change it, IIRC.  (Look for
> "SetQuerySnapshot" in the archives, if you care.)
>
> But at any rate, because of that behavior it's really impossible to
> encapsulate the locking behavior entirely in a plpgsql function; the
> wait-and-retry loop *must* be on the client side.  That leaves me
> doubting that there's much value in trying to encapsulate any of it.
> As Manfred noted, you may eventually want to add additional conditions
> to the SELECT, and that's just lots easier to do if the logic is all
> in one place rather than split between client and server code.
>
> I still like my serializable-mode solution better than any of the
> alternatives proposed so far.  Manfred's latest try doesn't solve
> the race-condition problem, because the second SELECT will never
> find anything the first one didn't find.  (It could if it were on
> the client side... but you don't need two selects at all with the
> serializable-mode solution.)
>
> regards, tom lane
>



SetQuerySnapshot inside PL/PGSQL ?

From
"D. Dante Lorenso"
Date:
I really want to see newly committed data when I am INSIDE
a PL/PGSQL function.  This way, if I do a LOCK on a table
and subsequently perform a SELECT after acquiring the lock,
I know I'll see the data just updated by the previous
transactions (the ones whose commits released the lock to me)...

So, if you don't want to do the auto 'SetQuerySnapshot' calls
(as I'd argue they should be done), is there a way I can call this
from the PL/PGSQL function myself, directly?  Something like this:

-------------------- 8< --------------------
CREATE OR REPLACE...
...
BEGIN
    LOCK TABLE blah_blah... EXCLUSIVE MODE;
    SetQuerySnapshot();  <----------- I want something like this

    SELECT id
    FROM table
    WHERE blah blah blah ...
    FOR UPDATE;

    IF NOT FOUND THEN
        ...
    ELSE
        ...
    END IF;
END;
...
-------------------- 8< --------------------

Something like that might solve my problem by letting the
PL/pgSQL function work as if it were in READ COMMITTED
mode.

Dante

D. Dante Lorenso
dante@lorenso.com
972-333-4139