Thread: Actual row order in UPDATE and SELECT FOR UPDATE

Actual row order in UPDATE and SELECT FOR UPDATE

From

Nikolai Zhubr

Date:

15 February 2016, 22:18:01

Hello all,

I can't find any clear description of how to reliably figure and/or
enforce specific row (locking) order within UPDATE and SELECT FOR UPDATE
statements dealing with multiple rows.
I'd like to get rid of some deadlocks (caused by share locks). While the
manual explains locks and deadlocks themselves pretty fine (in e.g.
http://www.postgresql.org/docs/9.5/static/explicit-locking.html
) it somehow avoids discussing multi-row updates there. On the other
hand, the UPDATE section of the manual somehow avoids discussing actual
update order and respective locking too.
So is it defined anywhere explicitely? Or do I rather have to convert
all multi-row UPDATE statements into single-row updates and then wrap
them into e.g. plpgsql loops? That would look quite strange...
Any hints?


Thank you,
Nikolai

Re: Actual row order in UPDATE and SELECT FOR UPDATE

From

Adrian Klaver

Date:

15 February 2016, 22:28:23

On 02/15/2016 02:17 PM, Nikolai Zhubr wrote:
> Hello all,
>
> I can't find any clear description of how to reliably figure and/or
> enforce specific row (locking) order within UPDATE and SELECT FOR UPDATE
> statements dealing with multiple rows.
> I'd like to get rid of some deadlocks (caused by share locks). While the
> manual explains locks and deadlocks themselves pretty fine (in e.g.
> http://www.postgresql.org/docs/9.5/static/explicit-locking.html
> ) it somehow avoids discussing multi-row updates there. On the other
> hand, the UPDATE section of the manual somehow avoids discussing actual
> update order and respective locking too.
> So is it defined anywhere explicitely? Or do I rather have to convert
> all multi-row UPDATE statements into single-row updates and then wrap
> them into e.g. plpgsql loops? That would look quite strange...
> Any hints?

http://www.postgresql.org/docs/9.4/interactive/transaction-iso.html

Might also help if you give a code example of what you are trying to do?

>
>
> Thank you,
> Nikolai
>
>


--
Adrian Klaver
adrian.klaver@aklaver.com

Re: Actual row order in UPDATE and SELECT FOR UPDATE

From

"David G. Johnston"

Date:

15 February 2016, 23:20:16

On Mon, Feb 15, 2016 at 3:17 PM, Nikolai Zhubr <n-a-zhubr@yandex.ru> wrote:

Hello all,

I can't find any clear description of how to reliably figure and/or enforce specific row (locking) order within UPDATE and SELECT FOR UPDATE statements dealing with multiple rows.

SQL is a set-oriented language. Sets do not have order. Therefore I don't understand your goal. That said subqueries and ORDER BY may be solution.

David J.

Re: Actual row order in UPDATE and SELECT FOR UPDATE

From

Vitaly Burovoy

Date:

15 February 2016, 23:41:27

On 2/15/16, Nikolai Zhubr <n-a-zhubr@yandex.ru> wrote:
> Hello all,
>
> I can't find any clear description of how to reliably figure and/or
> enforce specific row (locking) order within UPDATE and SELECT FOR UPDATE
> statements dealing with multiple rows.
> I'd like to get rid of some deadlocks (caused by share locks). While the
> manual explains locks and deadlocks themselves pretty fine (in e.g.
> http://www.postgresql.org/docs/9.5/static/explicit-locking.html
> ) it somehow avoids discussing multi-row updates there. On the other
> hand, the UPDATE section of the manual somehow avoids discussing actual
> update order and respective locking too.

UPDATE deals with rows in order they are fetched from a heap. In a
common case it can be considered as unordered.

> So is it defined anywhere explicitely? Or do I rather have to convert
> all multi-row UPDATE statements into single-row updates and then wrap
> them into e.g. plpgsql loops?

Not a good thought: it'll ruin performance at all.

> That would look quite strange...
> Any hints?

However SELECT can fetch rows in a specific order and locking by FOR
UPDATE clause is applied _after_ sorting. Then you can use CTE[1] with
SELECT ... ORDER BY ... FOR UPDATE to pass all necessary data to the
UPDATE statement. Since UPDATE still deals with unordered rows they
have already locked, and parallel queries are waiting in SELECT
statement rather than in UPDATE:

WITH lck AS (
    SELECT
       id,  -- for WHERE clause in UPDATE

-- you can do calculations here or in-place (see "field2" below)
       field1 + 1 as field1,
       ...
    FROM your_table
    WHERE ...
    ORDER BY id  -- for example
    FOR UPDATE
)
UPDATE your_table t
SET
  field1=lck.field1,  -- lhs is always field of updatable table;
-- rhs must be pointed by a "table" if they are the same in both "tables"

  field2=field2 + 2, -- or if you make changes in-place and it doesn't appear
--in a table mentioned in "FROM" clause, you can avoid table/alias name
...
FROM lck
WHERE
    t.id=lck.id

>
> Thank you,
> Nikolai

[1]http://www.postgresql.org/docs/current/static/queries-with.html
--
Best regards,
Vitaly Burovoy

Re: Actual row order in UPDATE and SELECT FOR UPDATE

From

Nikolai Zhubr

Date:

16 February 2016, 00:52:56

Hello all,
16.02.2016 2:41, Vitaly Burovoy:
[...]
> UPDATE deals with rows in order they are fetched from a heap. In a
> common case it can be considered as unordered.
[...]
> However SELECT can fetch rows in a specific order and locking by FOR
> UPDATE clause is applied _after_ sorting. Then you can use CTE[1] with
Ok. So if I get it correctly, plain UPDATE statement by itself can not
be constructed in such way that reliably avoids deadlocks in case of
possibly overlapping concurrent updates. So in order to be safe, UPDATE
statements will need to always be 'protected' by respective SELECT FOR
UPDATE first. I'd suppose this fact deserves some more explicit mention
in the manual, as it is not so obvious...

Thanks a lot for your code example and comprehensive advice.
I think I'm able to fix my deadlocks now.


Regards,
Nikolai

> SELECT ... ORDER BY ... FOR UPDATE to pass all necessary data to the
> UPDATE statement. Since UPDATE still deals with unordered rows they
> have already locked, and parallel queries are waiting in SELECT
> statement rather than in UPDATE:
>
> WITH lck AS (
>      SELECT
>         id,  -- for WHERE clause in UPDATE
>
> -- you can do calculations here or in-place (see "field2" below)
>         field1 + 1 as field1,
>         ...
>      FROM your_table
>      WHERE ...
>      ORDER BY id  -- for example
>      FOR UPDATE
> )
> UPDATE your_table t
> SET
>    field1=lck.field1,  -- lhs is always field of updatable table;
> -- rhs must be pointed by a "table" if they are the same in both "tables"
>
>    field2=field2 + 2, -- or if you make changes in-place and it doesn't appear
> --in a table mentioned in "FROM" clause, you can avoid table/alias name
> ...
> FROM lck
> WHERE
>      t.id=lck.id
>
>>
>> Thank you,
>> Nikolai
>
> [1]http://www.postgresql.org/docs/current/static/queries-with.html