Re: [PATCH] Identify LWLocks in tracepoints - Mailing list pgsql-hackers

From: Dmitry Dolgov
Subject: Re: [PATCH] Identify LWLocks in tracepoints
Date:
Msg-id: 20210113112134.pjclrkahpi6bgicp@localhost
In response to: [PATCH] Identify LWLocks in tracepoints (Craig Ringer <craig.ringer@enterprisedb.com>)
Responses: Re: [PATCH] Identify LWLocks in tracepoints
List: pgsql-hackers
> On Sat, Dec 19, 2020 at 01:00:01PM +0800, Craig Ringer wrote:
>
> The attached patch set follows on from the discussion in [1] "Add LWLock
> blocker(s) information" by adding the actual LWLock* and the numeric
> tranche ID to each LWLock related TRACE_POSTGRESQL_foo tracepoint.
>
> This does not provide complete information on blockers, because it's not
> necessarily valid to compare any two LWLock* pointers between two process
> address spaces. The locks could be in DSM segments, and those DSM segments
> could be mapped at different addresses.
>
> I wasn't able to work out a sensible way to map a LWLock* to any sort of
> (tranche-id, lock-index) because there's no requirement that locks in a
> tranche be contiguous or known individually to the lmgr.
>
> Despite that, the patches improve the information available for LWLock
> analysis significantly.

Thanks for the patches, this could indeed be useful. I've looked through
them and haven't noticed any issues with either the tracepoint extensions
or the comments, except that it's not entirely clear to me how tranche_id
indicates a re-initialization here:

    /* Re-initialization of individual LWLocks is not permitted */
    Assert(tranche_id >= NUM_INDIVIDUAL_LWLOCKS || !IsUnderPostmaster);
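
I'm assuming this assertion is added to LWLockInitialize(), i.e. roughly
as below; the placement is my reading of the patch, so correct me if I'm
looking at the wrong spot:

    void
    LWLockInitialize(LWLock *lock, int tranche_id)
    {
        /* Re-initialization of individual LWLocks is not permitted */
        Assert(tranche_id >= NUM_INDIVIDUAL_LWLOCKS || !IsUnderPostmaster);

        pg_atomic_init_u32(&lock->state, LW_FLAG_RELEASE_OK);
    #ifdef LOCK_DEBUG
        pg_atomic_init_u32(&lock->nwaiters, 0);
    #endif
        lock->tranche = tranche_id;
        proclist_init(&lock->waiters);
    }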

> Patch 2 adds the tranche id and lock pointer for each trace hit. This makes
> it possible to differentiate between individual locks within a tranche, and
> (so long as they aren't tranches in a DSM segment) compare locks between
> processes. That means you can do lock-order analysis etc, which was not
> previously especially feasible.
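
Just to make sure I understand what this means at the tracepoint call
sites: I read it as extending the existing probe calls in lwlock.c
roughly like this (the exact argument order below is my guess, not
copied from the patch):

    /* current master */
    TRACE_POSTGRESQL_LWLOCK_WAIT_START(T_NAME(lock), mode);

    /* with patch 2 applied, as I understand it */
    TRACE_POSTGRESQL_LWLOCK_WAIT_START(T_NAME(lock), mode, lock,
                                       lock->tranche);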

I'm curious: in what kinds of situations could lock-order analysis be
helpful?

> Traces also don't have to do userspace reads for the tranche name all
> the time, so the trace can run with lower overhead.

This one is also interesting. Just to clarify for myself: wouldn't there
be a bit of overhead anyway (due to switching between user space and
kernel context when a tracepoint is hit) that would mask the overhead of
reading the name? Or are there any numbers available on this?


