7.5. Distributed Deadlock Detection

Distributed deadlocks may occur during the processing of distributed transactions. Let us consider the following example:

    create table players(id int, username text, pass text) with (distributed_by='id');
    insert into players select id, 'user_' || id, 'pass_' || id from generate_series(1,1000) id;

Assume that the record with id=2 belongs to node1 and the record with id=3 belongs to node2.

Let us execute the following commands on different nodes:

    node1=# begin;
    node1=# update players set pass='someval' where id=3;

    node2=# begin;
    node2=# update players set pass='someval' where id=2;

    -- this command hangs because the transaction on node1 has locked the record with id=3
    node2=# update players set pass='someval2' where id=3;

    -- this command hangs because the transaction on node2 has locked the record with id=2
    node1=# update players set pass='someval2' where id=2;

A distributed deadlock arises when transactions block each other across nodes. PostgreSQL has an internal deadlock detection mechanism, but it only detects mutual blocking between backends (child processes) of a single PostgreSQL instance and resolves it there. This mechanism cannot handle the situation above because the blocking is distributed, i.e., backends from different nodes are involved. From the point of view of the PostgreSQL lock manager, there is no deadlock, since no two processes of the same instance are blocking each other. Therefore, Shardman provides its own mechanism for distributed deadlock resolution.

We can represent the interaction between processes in the entire cluster as a graph. A graph vertex represents a process (backend), identified by the pair of attributes {rgid; vxid}, where rgid is the replication group ID and vxid is the virtual transaction ID of the currently executing transaction. Graph edges are directed connections between vertices: each edge points from a blocked process to the process that blocks it.

A process can be blocked by at most one other process: if a backend is waiting, it waits for exactly one specific lock. A blocking process, on the other hand, can hold multiple locks and thus block several backends simultaneously.

Consequently, each vertex of the lock graph has at most one outgoing edge, so the graph behaves like a set of singly linked lists. If such a list contains a closed loop, there is a deadlock. To detect a deadlock, it is therefore enough to build this list and look for closed loops in it.
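The loop search described above can be sketched as follows. This is a minimal illustration, not Shardman's actual implementation: vertices are modeled as (rgid, vxid) pairs, and the wait-for relation as a mapping from each waiting vertex to the single vertex it waits on.

```python
# Sketch of deadlock detection in a wait-for graph where every vertex
# has at most one outgoing edge ("blocked process -> blocking process").
# Vertices are (rgid, vxid) pairs; all names here are illustrative,
# not Shardman internals.

def find_cycle(waits_for):
    """Return the list of vertices forming a closed loop, or None.

    waits_for maps each waiting vertex to the single vertex it waits on.
    """
    visited = set()                      # vertices already proven cycle-free
    for start in waits_for:
        if start in visited:
            continue
        path = []
        seen_on_path = {}                # vertex -> position in current path
        v = start
        while v is not None and v not in visited:
            if v in seen_on_path:
                return path[seen_on_path[v]:]   # closed loop found
            seen_on_path[v] = len(path)
            path.append(v)
            v = waits_for.get(v)         # follow the single outgoing edge
        visited.update(path)
    return None

# The example session above: the backend on node1 waits for the backend
# on node2 and vice versa (vxid values are made up for illustration).
edges = {
    (1, "8/4"): (2, "9/2"),
    (2, "9/2"): (1, "8/4"),
}
cycle = find_cycle(edges)
```

Because every vertex has at most one successor, a simple walk with a per-path position map suffices; no general graph search is needed.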

The distributed deadlock detector in Shardman is implemented as a separate task inside the Shardman monitor. If a process cannot acquire a lock within the specified amount of time (one second by default, adjustable via the deadlock_timeout configuration parameter), the internal PostgreSQL deadlock detector first attempts to detect a local deadlock. If no local deadlock is found, the distributed deadlock detector is activated.
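For example, the wait threshold can be raised in postgresql.conf (deadlock_timeout is the standard PostgreSQL parameter; the value 2s here is an arbitrary illustration):

```
# postgresql.conf: time a backend waits on a lock before deadlock
# detection kicks in (local check first, then distributed if nothing
# is found locally)
deadlock_timeout = 2s
```

Raising the value reduces detection overhead on busy systems at the cost of leaving genuine deadlocks undetected for longer.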

The distributed deadlock detector builds the cluster-wide graph (list) of locks by querying the pg_locks and pg_stat_activity views on the local node and on each remote cluster node.

Building the list of locks involves querying the cluster nodes sequentially; the operation is not atomic, so the resulting list is not consistent. This means the distributed deadlock detector may produce false positives: a lock recorded while the list is being built may disappear before the list is complete. To guarantee reliable detection, once a closed loop is found, the nodes involved in that loop must be re-queried to confirm that the loop still exists.
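The confirmation step can be sketched like this. It is a simplified illustration under the same (rgid, vxid) vertex model as above; fetch_edges is a hypothetical helper standing in for re-querying pg_locks and pg_stat_activity on the affected nodes.

```python
# Sketch of the confirmation step: because the cluster-wide snapshot is
# taken node by node, an apparent loop may contain an edge that has
# already disappeared. Re-query only the nodes on the loop and check
# that every edge is still present. fetch_edges() is a hypothetical
# helper, not part of Shardman's API.

def confirm_cycle(cycle, fetch_edges):
    """Re-check each edge of a suspected loop against a fresh snapshot."""
    nodes = {rgid for rgid, _ in cycle}
    fresh = fetch_edges(nodes)        # fresh wait-for edges, those nodes only
    n = len(cycle)
    for i, v in enumerate(cycle):
        if fresh.get(v) != cycle[(i + 1) % n]:
            return False              # an edge vanished: false positive
    return True
```

Only the nodes participating in the suspected loop need to be re-queried, which keeps the confirmation cheap compared to rebuilding the whole list.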

After the closed loop is confirmed, the distributed deadlock detector chooses a process belonging to the local node and cancels it. The user session served by the cancelled backend receives the message:

    canceling statement due distributed deadlock was found
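The victim-selection step can be sketched as follows. It is a minimal illustration under the same vertex model as above; choose_victim and the (rgid, vxid) tuples are hypothetical names, not Shardman internals.

```python
# Sketch of victim selection: once a closed loop is confirmed, the
# detector cancels a process that belongs to the local node, so the
# cancellation can be performed without contacting other nodes.

def choose_victim(cycle, local_rgid):
    """Pick the first vertex of the loop served by the local node."""
    for rgid, vxid in cycle:
        if rgid == local_rgid:
            return (rgid, vxid)
    return None  # no local participant; a detector on another node acts
```

Cancelling a local backend is sufficient to break the loop, since removing any one vertex from a closed loop unblocks the rest.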

A verbose message about the detected deadlock will be recorded in the server logs:

    LOG:  distributed deadlock detected
    DETAIL:  repgroup 1, PID 95264 (application 'psql'), executed query 'update players set pass='qqq' where id=2;' is blocked by repgroup 1, PID 95283 (application 'pgfdw:2:95278:9/2'), executed query 'UPDATE public.players_0 SET pass = 'qqq'::text WHERE ((id = 2))'
    repgroup 1, PID 95283 (application 'pgfdw:2:95278:9/2'), executed query 'UPDATE public.players_0 SET pass = 'qqq'::text WHERE ((id = 2))' is blocked by repgroup 2, PID 95278 (application 'psql'), executed query 'update players set pass='qqq' where id=3;'
    repgroup 2, PID 95278 (application 'psql'), executed query 'update players set pass='qqq' where id=3;' is blocked by repgroup 2, PID 95267 (application 'pgfdw:1:95264:8/4'), executed query 'UPDATE public.players_1 SET pass = 'qqq'::text WHERE ((id = 3))'
    repgroup 2, PID 95267 (application 'pgfdw:1:95264:8/4'), executed query 'UPDATE public.players_1 SET pass = 'qqq'::text WHERE ((id = 3))' is blocked by repgroup 1, PID 95264 (application 'psql'), executed query 'update players set pass='qqq' where id=2;'
