Re: Undetected Deadlock - Mailing list pgsql-general

From Alvaro Herrera
Subject Re: Undetected Deadlock
Date
Msg-id 202201252311.4jc4gpdzwwu7@alvherre.pgsql
Whole thread Raw
In response to Undetected Deadlock  (Michael Harris <harmic@gmail.com>)
Responses Re: Undetected Deadlock  (Michael Harris <harmic@gmail.com>)
List pgsql-general
On 2022-Jan-25, Michael Harris wrote:

> We've recently updated our application to PG 14.1, and in the test instance we
> have started to see some alarming undetected deadlocks.

This is indeed suspicious / worrisome / curious.

What version were you using previously?

I reformatted the result sets:

> An example of what we have seen is:
> 
>  locktype | database |  relation  | page | tuple | virtualxid | transactionid | classid | objid | objsubid |
virtualtransaction|   pid   |      mode       | granted | fastpath |           waitstart           |
relation
>
----------+----------+------------+------+-------+------------+---------------+---------+-------+----------+--------------------+---------+-----------------+---------+----------+-------------------------------+--------------------------------
>  relation |   529986 | 1842228045 |      |       |            |               |         |       |          |
165/1941408       | 2130531 | AccessShareLock | f       | f        | 2022-01-19 00:32:32.626152+01 |
st.ctr_table_efr_oa
> (1 row)
> 
>  locktype | database |  relation  | page | tuple | virtualxid | transactionid | classid | objid | objsubid |
virtualtransaction|   pid   |        mode         | granted | fastpath | waitstart | relation
 
>
----------+----------+------------+------+-------+------------+---------------+---------+-------+----------+--------------------+---------+---------------------+---------+----------+-----------+--------------------------------
>  relation |   529986 | 1842228045 |      |       |            |               |         |       |          |
75/2193719        | 2128603 | AccessExclusiveLock | t       | f        |           | st.ctr_table_efr_oa
 
> (1 row)
> 
>  locktype | database |  relation  | page | tuple | virtualxid | transactionid | classid | objid | objsubid |
virtualtransaction|   pid   |        mode         | granted | fastpath |           waitstart           | relation
 
>
----------+----------+------------+------+-------+------------+---------------+---------+-------+----------+--------------------+---------+---------------------+---------+----------+-------------------------------+-----------
>  relation |   529986 | 1842231489 |      |       |            |               |         |       |          |
75/2193719        | 2128603 | AccessExclusiveLock | f       | f        | 2022-01-19 00:32:32.924694+01 | st.tpd_oa
 
> (1 row)
> 
>    locktype    | database |  relation  | page | tuple |  virtualxid  | transactionid | classid |   objid   | objsubid
|virtualtransaction |   pid   |         mode          | granted | fastpath |           waitstart           | relation
 
>
---------------+----------+------------+------+-------+--------------+---------------+---------+-----------+----------+--------------------+---------+-----------------------+---------+----------+-------------------------------+-----------
>  relation      |   529986 | 1842231489 |      |       |              |               |         |           |
|165/1941408        | 2130531 | AccessShareLock       | t       | f        |                               | st.tpd_oa
 
> 
> So:
>   pid 2130531 waits for an AccessShareLock on relation 1842228045, blocked by pid 2128603 which holds an
AccessExclusiveLock
>   pid 2128603 waits for an AccessExclusiveLock on relation 1842231489, blocked by pid 2130531 which holds an
AccessShareLock
> 
> The queries being executed by these backends are:
> 
>    pid   |          query_start          |         state_change          | wait_event_type | wait_event | state  |
query
>
---------+-------------------------------+-------------------------------+-----------------+------------+--------+-------------------------------------------------------------------------
>  2128603 | 2022-01-19 00:32:32.924413+01 | 2022-01-19 00:32:32.924413+01 | Lock            | relation   | active |
DROPTABLE st.tpd_oa_18929
 
>  2130531 | 2022-01-19 00:32:32.625706+01 | 2022-01-19 00:32:32.625708+01 | Lock            | relation   | active |
DELETEFROM st.ctr_table_efr_oa_19010 WHERE ropid = 44788868
 
> (2 rows)

I know of no cases in which we fail to detect a deadlock.  Perhaps you
have indeed hit a bug.

> Note that there were a lot of other processes also waiting on relation
> 1842231489 - could that be confusing the deadlock detection routine?

It shouldn't.

> I am also confused about the locks which are being taken out by the
> DELETE query.

Maybe the lock is already taken before the DELETE is run; do you have
any triggers, rules, constraints, or anything?  If you have seen this
several times already, maybe a way to investigate deeper is an
exhaustive log capture of everything that these transactions do, from
the point they begin until they become blocked (log_statement=all).

Perhaps you need to involve other concurrent transactions in order to
cause the problem.

-- 
Álvaro Herrera           39°49'30"S 73°17'W  —  https://www.EnterpriseDB.com/
"Tiene valor aquel que admite que es un cobarde" (Fernandel)



pgsql-general by date:

Previous
From: Ivan Panchenko
Date:
Subject: Re: Counting the number of repeated phrases in a column
Next
From: Jian He
Date:
Subject: Re: Counting the number of repeated phrases in a column