Thread: Is "WITH () UPDATE" Thread Safe ?

Is "WITH () UPDATE" Thread Safe ?

From
Paul GOERGLER
Date:
Hello,


I have a lot of tickets, i need to take a batch of tickets and process them.
So the process is :
SELECT ONLY 100 tickets
PROCESS ticket
MARK THEM AS « done »

I’m selecting the tickets with :

WITH t0 AS (
    SELECT t.id, 
RANDOM() AS rank,
EXTRACT(EPOCH FROM NOW() - b.created_at) as elapsed
    FROM tickets AS t
    LEFT JOIN batch as b ON b.id = t.batch_id
    WHERE (
        t.status = 'waiting' OR
        (t.status = 'processing' AND t.locked_until IS NOT NULL AND t.locked_until <= NOW())
    ) AND t.send_at < NOW()
    AND (t.send_before IS NULL OR t.send_before > NOW())
    ORDER BY 
        t.priority DESC,
        rank ASC
    LIMIT 100
    FOR UPDATE OF t
)
UPDATE tickets AS t1
SET status = 'processing',
    locked_until = NOW() + '1 HOUR’,
    extra = t1.extra || hstore('elapsed', t0.elapsed || '') || hstore('rank', rank || '')
FROM t0
WHERE t1.id = t0.id
RETURNING t1.*;


I wonder if this query is thread safe, Can a ticket be updated between the SELECT part (t0) and the UPDATE part ?
If this query is not « thread safe » how can i do this ?

-- 
Paul

Re: Is "WITH () UPDATE" Thread Safe ?

From
Albe Laurenz
Date:
Paul GOERGLER wrote:
> I have a lot of tickets, i need to take a batch of tickets and process them.
> So the process is :
> SELECT ONLY 100 tickets
> PROCESS ticket
> MARK THEM AS « done »
> 
> I’m selecting the tickets with :
> 
> WITH t0 AS (
>     SELECT t.id,
> RANDOM() AS rank,
> EXTRACT(EPOCH FROM NOW() - b.created_at) as elapsed
>     FROM tickets AS t
>     LEFT JOIN batch as b ON b.id = t.batch_id
>     WHERE (
>         t.status = 'waiting' OR
>         (t.status = 'processing' AND t.locked_until IS NOT NULL AND t.locked_until <= NOW())
>     ) AND t.send_at < NOW()
>     AND (t.send_before IS NULL OR t.send_before > NOW())
>     ORDER BY
>         t.priority DESC,
>         rank ASC
>     LIMIT 100
>     FOR UPDATE OF t
> )
> UPDATE tickets AS t1
> SET status = 'processing',
>     locked_until = NOW() + '1 HOUR’,
>     extra = t1.extra || hstore('elapsed', t0.elapsed || '') || hstore('rank', rank || '')
> FROM t0
> WHERE t1.id = t0.id
> RETURNING t1.*;
> 
> 
> I wonder if this query is thread safe, Can a ticket be updated between the SELECT part (t0) and the
> UPDATE part ?
> If this query is not « thread safe » how can i do this ?

There is no race condition in your query because you used SELECT ... FOR UPDATE.

That causes the rows found in the WITH clause to be locked against concurrent modification.

So you should be fine.

Yours,
Laurenz Albe

Re: Is "WITH () UPDATE" Thread Safe ?

From
Paul GOERGLER
Date:
Ok thanks.
-- 
Paul GOERGLER

De: Albe Laurenz <laurenz.albe@wien.gv.at>
Répondre: Albe Laurenz <laurenz.albe@wien.gv.at>>
Date: 1 décembre 2014 at 13:21:07
À: Paul GOERGLER *EXTERN* <pgoergler@gmail.com>>, pgsql-general@postgresql.org <pgsql-general@postgresql.org>>
Sujet:  RE: [GENERAL] Is "WITH () UPDATE" Thread Safe ?

Paul GOERGLER wrote:
> I have a lot of tickets, i need to take a batch of tickets and process them.
> So the process is :
> SELECT ONLY 100 tickets
> PROCESS ticket
> MARK THEM AS « done »
>
> I’m selecting the tickets with :
>
> WITH t0 AS (
> SELECT t.id,
> RANDOM() AS rank,
> EXTRACT(EPOCH FROM NOW() - b.created_at) as elapsed
> FROM tickets AS t
> LEFT JOIN batch as b ON b.id = t.batch_id
> WHERE (
> t.status = 'waiting' OR
> (t.status = 'processing' AND t.locked_until IS NOT NULL AND t.locked_until <= NOW())
> ) AND t.send_at < NOW()
> AND (t.send_before IS NULL OR t.send_before > NOW())
> ORDER BY
> t.priority DESC,
> rank ASC
> LIMIT 100
> FOR UPDATE OF t
> )
> UPDATE tickets AS t1
> SET status = 'processing',
> locked_until = NOW() + '1 HOUR’,
> extra = t1.extra || hstore('elapsed', t0.elapsed || '') || hstore('rank', rank || '')
> FROM t0
> WHERE t1.id = t0.id
> RETURNING t1.*;
>
>
> I wonder if this query is thread safe, Can a ticket be updated between the SELECT part (t0) and the
> UPDATE part ?
> If this query is not « thread safe » how can i do this ?

There is no race condition in your query because you used SELECT ... FOR UPDATE.

That causes the rows found in the WITH clause to be locked against concurrent modification.

So you should be fine.

Yours,
Laurenz Albe

Re: Is "WITH () UPDATE" Thread Safe ?

From
David G Johnston
Date:
Albe Laurenz *EXTERN* wrote
> Paul GOERGLER wrote:
>> I have a lot of tickets, i need to take a batch of tickets and process
>> them.
>> So the process is :
>> SELECT ONLY 100 tickets
>> PROCESS ticket
>> MARK THEM AS « done »
>>
>> I’m selecting the tickets with :
>>
>> WITH t0 AS (
>>     SELECT t.id,
>> RANDOM() AS rank,
>> EXTRACT(EPOCH FROM NOW() - b.created_at) as elapsed
>>     FROM tickets AS t
>>     LEFT JOIN batch as b ON b.id = t.batch_id
>>     WHERE (
>>         t.status = 'waiting' OR
>>         (t.status = 'processing' AND t.locked_until IS NOT NULL AND
>> t.locked_until <= NOW())
>>     ) AND t.send_at < NOW()
>>     AND (t.send_before IS NULL OR t.send_before > NOW())
>>     ORDER BY
>>         t.priority DESC,
>>         rank ASC
>>     LIMIT 100
>>     FOR UPDATE OF t
>> )
>> UPDATE tickets AS t1
>> SET status = 'processing',
>>     locked_until = NOW() + '1 HOUR’,
>>     extra = t1.extra || hstore('elapsed', t0.elapsed || '') ||
>> hstore('rank', rank || '')
>> FROM t0
>> WHERE t1.id = t0.id
>> RETURNING t1.*;
>>
>>
>> I wonder if this query is thread safe, Can a ticket be updated between
>> the SELECT part (t0) and the
>> UPDATE part ?
>> If this query is not « thread safe » how can i do this ?
>
> There is no race condition in your query because you used SELECT ... FOR
> UPDATE.
>
> That causes the rows found in the WITH clause to be locked against
> concurrent modification.
>
> So you should be fine.
>
> Yours,
> Laurenz Albe

I was under the impression that the presence of FOR UPDATE in this situation
was unnecessary since the execution of the update occurs in the same
statement as the select and thus the relevant data will be locked at
execution.

The FOR UPDATE is for situations where other code needs to intervene between
the select and a subsequent update.

The documentation is silent on this distinction, but...

Note that the use of a CTE in this example is a convenience and that the top
level command is still UPDATE, not SELECT.

It may be worthwhile to update the UPDATE page's WITH commentary to note
that (if correct) there is no need for a FOR UPDATE clause on the contained
subquery (yes, that was quite a mouthful...)

David J.





--
View this message in context: http://postgresql.nabble.com/Is-WITH-UPDATE-Thread-Safe-tp5828738p5829038.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


Re: Is "WITH () UPDATE" Thread Safe ?

From
Albe Laurenz
Date:
David G Johnston wrote:
> Albe Laurenz *EXTERN* wrote
>> Paul GOERGLER wrote:
>>> I have a lot of tickets, i need to take a batch of tickets and process
>>> them.
>>> So the process is :
>>> SELECT ONLY 100 tickets
>>> PROCESS ticket
>>> MARK THEM AS « done »
>>>
>>> I’m selecting the tickets with :
>>>
>>> WITH t0 AS (
>>>     SELECT t.id,
[...]
>>>     FROM tickets AS t
[...]
>>>     FOR UPDATE OF t
>>> )
>>> UPDATE tickets AS t1
[...]
>>> FROM t0
[...]
>>>
>>>
>>> I wonder if this query is thread safe, Can a ticket be updated between
>>> the SELECT part (t0) and the
>>> UPDATE part ?
>>> If this query is not « thread safe » how can i do this ?

>> There is no race condition in your query because you used SELECT ... FOR
>> UPDATE.
>>
>> That causes the rows found in the WITH clause to be locked against
>> concurrent modification.
>>
>> So you should be fine.

> I was under the impression that the presence of FOR UPDATE in this situation
> was unnecessary since the execution of the update occurs in the same
> statement as the select and thus the relevant data will be locked at
> execution.
> 
> The FOR UPDATE is for situations where other code needs to intervene between
> the select and a subsequent update.
> 
> The documentation is silent on this distinction, but...
> 
> Note that the use of a CTE in this example is a convenience and that the top
> level command is still UPDATE, not SELECT.
> 
> It may be worthwhile to update the UPDATE page's WITH commentary to note
> that (if correct) there is no need for a FOR UPDATE clause on the contained
> subquery (yes, that was quite a mouthful...)

Well, the SELECT and the UPDATE are part of the same statement, so they
run in the same transaction and with the same snapshot.

But it could happen that (for example) two of the above queries are running
concurrently.  Then both could find the same row in the SELECT and modify
it (because of the exclusive row lock, one of them will have to wait until
the other is done).  Then this ticket will be processed twice.

With FOR UPDATE, no two SELECT statements can find the same row, because
by the time the lock taken by the first SELECT is released, the status of the
ticket will have changed and the second SELECT will not find it.

Yours,
Laurenz Albe