Re: [PATCH] Identify LWLocks in tracepoints - Mailing list pgsql-hackers

From Craig Ringer
Subject Re: [PATCH] Identify LWLocks in tracepoints
Msg-id CAGRY4nxuKRJCJkwkrYNm9Ta8iWtM=7qS2OxYT4is26wif+mPQg@mail.gmail.com
In response to [PATCH] Identify LWLocks in tracepoints  (Craig Ringer <craig.ringer@enterprisedb.com>)
List pgsql-hackers
On Sat, 19 Dec 2020 at 13:00, Craig Ringer <craig.ringer@enterprisedb.com> wrote:

> Hi all
>
> The attached patch set follows on from the discussion in [1] "Add LWLock blocker(s) information" by adding the actual LWLock* and the numeric tranche ID to each LWLock-related TRACE_POSTGRESQL_foo tracepoint.


I've attached a systemtap script that makes use of the information exported by the enhanced LWLock tracepoints. It offers something akin to dynamic -DLWLOCK_STATS with automatic statistical aggregation and some selective LWLOCK_DEBUG output.

The script isn't super pretty. I didn't try to separate event-data collection from results output, and there's some duplication in places. But it gives you an idea what's possible when we report lock pointers and tranche IDs to tracepoints and add entry/exit tracepoints.
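For a taste of what consuming the enhanced tracepoints looks like, here's a minimal probe. The argument order is an assumption for the sketch: the stock lwlock probes pass the tranche name and mode, and I'm assuming here that the LWLock* and tranche ID are appended after them.

```systemtap
/* Sketch only: assumes the patched lwlock__wait__start probe passes
 * (tranche name, mode, LWLock*, tranche id) in that order, and that
 * "postgres" is resolvable on $PATH. */
probe process("postgres").mark("lwlock__wait__start")
{
    tranche   = user_string($arg1)  /* tranche name */
    mode      = $arg2               /* LW_SHARED / LW_EXCLUSIVE */
    lockptr   = $arg3               /* new: the LWLock* itself */
    trancheid = $arg4               /* new: numeric tranche ID */
    printf("pid %d waiting on %s lock 0x%x (tranche %d), mode %d\n",
           pid(), tranche, lockptr, trancheid, mode)
}
```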

Key features:

* Collect statistical aggregates on lwlock hold and wait durations across all processes. Stats are grouped by lockmode (shared or exclusive) and by tranche name, as well as rollup stats across all tranches.
* Report lock wait and hold durations for each process when that process exits. Again, reported by mode and tranche name.
* For long lock waits, print the waiter pid and the blocker pid, along with each process's backend type and application_name (if known), the acquire mode, and the acquire function
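The stats collection underlying the first two features boils down to SystemTap's statistical aggregates. A stripped-down sketch of the wait-time side (argument positions assumed to match the stock probes: tranche name, then mode):

```systemtap
global wait_start, wait_us

probe process("postgres").mark("lwlock__wait__start")
{
    wait_start[tid()] = gettimeofday_us()
}

probe process("postgres").mark("lwlock__wait__done")
{
    t = wait_start[tid()]
    if (t) {
        elapsed = gettimeofday_us() - t
        /* bucket by tranche name and mode, plus an "(all)" rollup */
        wait_us[user_string($arg1), $arg2] <<< elapsed
        wait_us["(all)", $arg2] <<< elapsed
        delete wait_start[tid()]
    }
}

probe end
{
    printf("%-20s %4s %8s %10s %6s %6s %6s\n",
           "tranche", "mode", "count", "total", "avg", "min", "max")
    foreach ([tranche, mode] in wait_us)
        printf("%-20s %4d %8d %10d %6d %6d %6d\n", tranche, mode,
               @count(wait_us[tranche, mode]), @sum(wait_us[tranche, mode]),
               @avg(wait_us[tranche, mode]), @min(wait_us[tranche, mode]),
               @max(wait_us[tranche, mode]))
}
```

Newer systemtap also has @variance, which is where a variance column like the one in the rollup below can come from.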

The output is intended to be human-readable, but it'd be quite simple to convert it into raw tsv-style output suitable for ingestion into statistical postprocessing or graphing tools.
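For example, the end-of-run report becomes machine-ingestible just by swapping the final printf for tab-separated output. Self-contained sketch with a trivial one-sample-per-wait fill; the real script feeds measured durations into the aggregate instead:

```systemtap
global wait_us

/* trivial fill: one sample per wait, keyed by tranche and mode */
probe process("postgres").mark("lwlock__wait__done")
{
    wait_us[user_string($arg1), $arg2] <<< 1
}

/* tab-separated rows: paste straight into R, gnuplot, a spreadsheet... */
probe end
{
    foreach ([tranche, mode] in wait_us)
        printf("%s\t%d\t%d\t%d\n", tranche, mode,
               @count(wait_us[tranche, mode]),
               @sum(wait_us[tranche, mode]))
}
```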

It should be fairly easy to break down the stats by acquire function if desired, so that LWLockAcquire(), LWLockWaitForVar(), and LWLockAcquireOrWait() would be reported separately; they're combined in the current output.
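That breakdown would just mean adding the probe name as another aggregate key; pn() gives the probe point that fired. The probe names below are illustrative (taken from the stock lwlock tracepoint set):

```systemtap
global acquires

/* one handler for several acquire paths, keyed by which probe fired */
probe process("postgres").mark("lwlock__acquire"),
      process("postgres").mark("lwlock__acquire__or__wait"),
      process("postgres").mark("lwlock__condacquire")
{
    acquires[user_string($arg1), pn()] <<< 1
}

probe end
{
    foreach ([tranche, func] in acquires)
        printf("%-20s %-55s %8d\n",
               tranche, func, @count(acquires[tranche, func]))
}
```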

Capturing the current query string is pretty simple if needed, but I didn't think it was likely to be especially useful.

Sample output for a pg_regress run is attached; an abridged version follows. The !!W!! lines are "waited a long time", the !!H!! lines are "held a long time". The format is: [pid]:MyBackendType tranche_name wait_time_us (wait_time) in wait_func (application_name) => [blocker_pid] (blocker_application_name). If the blocker pid wasn't identified it isn't reported - I know how to fix that and will do so soon.

!!W!! [ 93030]:3          BufferContent    12993 (0m0.012993s) in lwlock__acquire__start (pg_regress/text)
!!W!! [ 93036]:3            LockManager    14540 (0m0.014540s) in lwlock__acquire__start (pg_regress/float8) => [ 93045] (pg_regress/regproc)
!!W!! [ 93035]:3          BufferContent    12608 (0m0.012608s) in lwlock__acquire__start (pg_regress/float4) => [ 93034] (pg_regress/oid)
!!W!! [ 93036]:3            LockManager    10301 (0m0.010301s) in lwlock__acquire__start (pg_regress/float8)
!!W!! [ 93043]:3            LockManager    10356 (0m0.010356s) in lwlock__acquire__start (pg_regress/pg_lsn)
!!H!! [ 93033]:3          BufferContent    20579 (0m0.020579s) (pg_regress/int8)
!!W!! [ 93027]:3          BufferContent    10766 (0m0.010766s) in lwlock__acquire__start (pg_regress/char) => [ 93037] (pg_regress/bit)
!!W!! [ 93036]:3                 OidGen    12876 (0m0.012876s) in lwlock__acquire__start (pg_regress/float8)
...

Then the summary rollup at the end of the run. This can also be output periodically during the run. Abbreviated for highlights:

wait locks: all procs                        tranche mode    count        total      avg variance      min      max
  W LW_EXCLUSIVE                              (all)    E    54185     14062734      259  1850265        1    44177
  W LW_SHARED                                 (all)    S     3668      1116022      304  1527261        2    18642

held locks: all procs                        tranche mode    count        total      avg variance      min      max
  H LW_EXCLUSIVE                              (all)    E 10438060    153077259       14    37035        1   195043
  H LW_SHARED                                 (all)    S 14199902     65466934        4     5318        1    44030

all procs by tranche                        tranche mode    count        total      avg variance      min      max
  W tranche                                   (all)    S     3668      1116022      304  1527261        2    18642
  W tranche                                   (all)    E    54185     14062734      259  1850265        1    44177
  W tranche                               WALInsert    E     9839      2393229      243  1180294        2    14209
  W tranche                           BufferContent    E     3012      1726543      573  3869409        2    28186
  W tranche                           BufferContent    S     1664       657855      395  2185694        2    18642
  W tranche                            LockFastPath    E    28314      6327801      223  1278053        1    26133
  W tranche                            LockFastPath    S       87        59401      682  3703217       19     9454
  W tranche                             LockManager    E     7223      2764392      382  2514863        2    44177
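Emitting that rollup periodically during the run is just a timer probe. Sketch, again with dummy samples standing in for the measured durations:

```systemtap
global wait_us

probe process("postgres").mark("lwlock__wait__done")
{
    wait_us[user_string($arg1), $arg2] <<< 1   /* dummy sample */
}

/* dump and reset the rollup every 10 seconds */
probe timer.s(10)
{
    foreach ([tranche, mode] in wait_us)
        printf("%-20s %4d %8d\n",
               tranche, mode, @count(wait_us[tranche, mode]))
    delete wait_us
}
```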


Hope this is interesting to someone.
