Home > mailing lists

Re: [HACKERS] Two pass CheckDeadlock in contentent case - Mailing list pgsql-hackers

From	Masahiko Sawada
Subject	Re: [HACKERS] Two pass CheckDeadlock in contentent case
Date	November 6, 2018 07:16:00
Msg-id	CAD21AoBWpgDNWj_fF5J4xF6prXDgfhx2hS8aWvQPYhYiCZQQ1Q@mail.gmail.com Whole thread Raw
In response to	Re: [HACKERS] Two pass CheckDeadlock in contentent case (Sokolov Yura <y.sokolov@postgrespro.ru>)
Responses	Re: [HACKERS] Two pass CheckDeadlock in contentent case
List	pgsql-hackers

Tree view

On Wed, Oct 4, 2017 at 12:07 AM Sokolov Yura <y.sokolov@postgrespro.ru> wrote:
>
> On 2017-10-03 17:30, Sokolov Yura wrote:
> > Good day, hackers.
> >
> > During hard workload sometimes process reaches deadlock timeout
> > even if no real deadlock occurred. It is easily reproducible with
> > pg_xact_advisory_lock on same value + some time consuming
> > operation (update) and many clients.
> >
> > When backend reaches deadlock timeout, it calls CheckDeadlock,
> > which locks all partitions of lock hash in exclusive mode to
> > walk through graph and search for deadlock.
> >
> > If hundreds of backends reaches this timeout trying to acquire
> > advisory lock on a same value, it leads to hard-stuck for many
> > seconds, cause they all traverse same huge lock graph under
> > exclusive lock.
> > During this stuck there is no possibility to do any meaningful
> > operations (no new transaction can begin).
> >
> > Attached patch makes CheckDeadlock to do two passes:
> > - first pass uses LW_SHARED on partitions of lock hash.
> >   DeadLockCheck is called with flag "readonly", so it doesn't
> >   modify anything.
> > - If there is possibility of "soft" or "hard" deadlock detected,
> >   ie if there is need to modify lock graph, then partitions
> >   relocked with LW_EXCLUSIVE, and DeadLockCheck is called again.
> >
> > It fixes hard-stuck, cause backends walk lock graph under shared
> > lock, and found that there is no real deadlock.
> >
> > Test on 4 socket xeon machine:
> > pgbench_zipf -s 10 -c 800  -j 100 -M prepared -T 450 -f
> > ./ycsb_read_zipf.sql@50 -f ./ycsb_update_lock2_zipf.sql@50 -P 5
> >
> > ycsb_read_zipf.sql:
> >     \set i random_zipfian(1, 100000 * :scale, 1.01)
> >     SELECT abalance FROM pgbench_accounts WHERE aid = :i;
> > ycsb_update_lock2_zipf.sql:
> >     \set i random_zipfian(1, 100000 * :scale, 1.01)
> >     select lock_and_update( :i );
> >
> >     CREATE OR REPLACE FUNCTION public.lock_and_update(i integer)
> >      RETURNS void
> >      LANGUAGE sql
> >     AS $function$
> >         SELECT pg_advisory_xact_lock( $1 );
> >         UPDATE pgbench_accounts SET abalance = 1 WHERE aid = $1;
> >     $function$
> >
> > Without attached patch:
> >
> > progress: 5.0 s, 45707.0 tps, lat 15.599 ms stddev 83.757
> > progress: 10.0 s, 51124.4 tps, lat 15.681 ms stddev 78.353
> > progress: 15.0 s, 52293.8 tps, lat 15.327 ms stddev 77.017
> > progress: 20.0 s, 51280.4 tps, lat 15.603 ms stddev 78.199
> > progress: 25.0 s, 47278.6 tps, lat 16.795 ms stddev 83.570
> > progress: 30.0 s, 41792.9 tps, lat 18.535 ms stddev 93.697
> > progress: 35.0 s, 12393.7 tps, lat 33.757 ms stddev 169.116
> > progress: 40.0 s, 0.0 tps, lat -nan ms stddev -nan
> > progress: 45.0 s, 0.0 tps, lat -nan ms stddev -nan
> > progress: 50.0 s, 1.2 tps, lat 2497.734 ms stddev 5393.166
> > progress: 55.0 s, 0.0 tps, lat -nan ms stddev -nan
> > progress: 60.0 s, 27357.9 tps, lat 160.622 ms stddev 1866.625
> > progress: 65.0 s, 38770.8 tps, lat 20.829 ms stddev 104.453
> > progress: 70.0 s, 40553.2 tps, lat 19.809 ms stddev 99.741
> >
> > (autovacuum led to trigger deadlock timeout,
> >  and therefore, to stuck)
> >
> > Patched:
> >
> > progress: 5.0 s, 56264.7 tps, lat 12.847 ms stddev 66.980
> > progress: 10.0 s, 55389.3 tps, lat 14.329 ms stddev 71.997
> > progress: 15.0 s, 50757.7 tps, lat 15.730 ms stddev 78.222
> > progress: 20.0 s, 50797.3 tps, lat 15.736 ms stddev 79.296
> > progress: 25.0 s, 48485.3 tps, lat 16.432 ms stddev 82.720
> > progress: 30.0 s, 45202.1 tps, lat 17.733 ms stddev 88.554
> > progress: 35.0 s, 40035.8 tps, lat 19.343 ms stddev 97.787
> > progress: 40.0 s, 14240.1 tps, lat 47.625 ms stddev 265.465
> > progress: 45.0 s, 30150.6 tps, lat 31.140 ms stddev 196.114
> > progress: 50.0 s, 38975.0 tps, lat 20.543 ms stddev 104.281
> > progress: 55.0 s, 40573.9 tps, lat 19.770 ms stddev 99.487
> > progress: 60.0 s, 38361.1 tps, lat 20.693 ms stddev 103.141
> > progress: 65.0 s, 39793.3 tps, lat 20.216 ms stddev 101.080
> > progress: 70.0 s, 38387.9 tps, lat 20.886 ms stddev 104.482
> >
> > (autovacuum led to trigger deadlock timeout,
> >  but postgresql did not stuck)
> >
> > I believe, patch will positively affect other heavy workloads
> > as well, although I have not collect benchmarks.
> >
> > `make check-world` passes with configured `--enable-tap-tests
> > --enable-casserts`
>
> Excuse me, corrected version is in attach.
>

The idea of this patch seems reasonable.

I've looked at this patch. This patch still can be applied cleanly to
the current HEAD and passed regression tests.

Here is review comments.

+       if (readonly)
+       {
+               if (nWaitOrders > 0)
+                       return DS_HARD_DEADLOCK;
+               else if (blocking_autovacuum_proc != NULL)
+                       return DS_BLOCKED_BY_AUTOVACUUM;
+               else
+                       return DS_NO_DEADLOCK;
+       }

WIth this patch DeadLockCheck could return DS_HARD_DEADLOCK in
readonly = true case and we then retry checking deadlock but
DS_HARD_DEADLOCK originally means that we've detected a deadlock that
is un-resolvable (see definition of DedadLockState). So I think it's
better to return DS_SOFT_DEADLOCK. Or at least need to update the
description of DeadLockState.

------
        if (deadlock_state == DS_HARD_DEADLOCK)
        {
+               if (shared)
+               {
+                       shared = false;
+                       for (i = NUM_LOCK_PARTITIONS; --i >= 0;)
+                               LWLockRelease(LockHashPartitionLockByIndex(i));
+                       for (i = 0; i < NUM_LOCK_PARTITIONS; i++)
+
LWLockAcquire(LockHashPartitionLockByIndex(i), LW_EXCLUSIVE);
+                       goto retry;
+               }
+

This code forces us to check deadlocks exactly twice. IIUC if a hard
deadlock is found even while holding a shared lock we can abort.

-----
The comment of DeadLockCheck needs to be updated.

 * DeadLockCheck -- Checks for deadlocks for a given process
 *
 * This code looks for deadlocks involving the given process.  If any
 * are found, it tries to rearrange lock wait queues to resolve the
 * deadlock.  If resolution is impossible, return DS_HARD_DEADLOCK ---
 * the caller is then expected to abort the given proc's transaction.
 *

We might not rearrange if the readonly is true even if we found a deadlock.

-----
The first sentence of the following comment also needs to be updated.

    /*
     * Acquire exclusive lock on the entire shared lock data structures. Must
     * grab LWLocks in partition-number order to avoid LWLock deadlock.
     *

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

pgsql-hackers by date:

From: Amit Langote
Date: 06 November 2018, 07:10:25
Subject: Re: First-draft release notes for back-branch releases

From: "Yotsunaga, Naoki"
Date: 06 November 2018, 07:26:03
Subject: RE: [Proposal] Add accumulated statistics

Re: [HACKERS] Two pass CheckDeadlock in contentent case - Mailing list pgsql-hackers

Previous

Next