Re: [PATCH] Identify LWLocks in tracepoints - Mailing list pgsql-hackers

From Robert Haas
Subject Re: [PATCH] Identify LWLocks in tracepoints
Date
Msg-id CA+TgmoY2GtP_HV8DfX1GY-PayosZa1GZ_87O6S61ONzA1HEqvQ@mail.gmail.com
In response to Re: [PATCH] Identify LWLocks in tracepoints  (Andres Freund <andres@anarazel.de>)
Responses Re: [PATCH] Identify LWLocks in tracepoints  (Andres Freund <andres@anarazel.de>)
Re: [PATCH] Identify LWLocks in tracepoints  (Craig Ringer <craig.ringer@enterprisedb.com>)
List pgsql-hackers
On Mon, Apr 12, 2021 at 11:06 PM Andres Freund <andres@anarazel.de> wrote:
> No, they have to be the same in each. Note how the tranche ID is part of
> struct LWLock. Which is why LWLockNewTrancheId() has to acquire a lock
> etc.

More precisely, if a tranche ID is defined in multiple backends, it
needs to be defined the same way in all of them. But it is possible to
have an extension loaded into some backends and not others and have it
define a tranche ID that other backends know nothing about.

Another point to note is that, originally, I had an idea that each
tranche of lwlocks was situated in a single array somewhere in
memory. Perhaps that was an array of something else, like buffer
descriptors, and the lwlocks were just one element of the struct, or
maybe it was an array specifically of LWLocks, but one way or the
other, there was definitely one array that had all the LWLocks from
that tranche in it. So before the commit in question --
3761fe3c20bb040b15f0e8da58d824631da00caa -- T_ID() used to compute an
offset for a lock within the tranche that was supposed to uniquely
identify the lock. However, the whole idea of an array per tranche
turns out to be broken by design.

Consider parallel query. You could, perhaps, arrange for all the
LWLocks that a particular query needs to be in one tranche. And that's
all fine. But what if there are multiple parallel contexts in
existence at the same time? I think right now that may be impossible
as a practical matter, since for example an SQL function that is
called by a parallel query is supposed to run any SQL statements
inside of it without parallelism. But, that's basically a policy
decision. There's nothing in the parallel context machinery itself
which prevents multiple parallel contexts from being active at the
same time. And if that happens, then you'd have multiple arrays with
the same tranche ID, so how do you identify the locks then? The
pre-3761fe3c20bb040b15f0e8da58d824631da00caa data structure doesn't
work because it has only one place to store an array base, but having
multiple places to store an array base doesn't fix it either because
now you've just given the same identifier to multiple locks.

You could maybe fix it by putting a limit on how many parallel
contexts can be open at the same time, and then having N copies of
each parallelism-related tranche. But that seems ugly and messy and a
burden on extension authors and not really what anybody wants.

You could try to identify locks by pointer addresses, but that's got
security hazards and the addresses aren't portable across all the
backends involved in the parallel query because of how DSM works, so
it's not really that helpful in terms of matching stuff up.

You could identify every lock by a tranche ID + an array offset + a
"tranche instance ID". But where would you store the tranche instance
ID to make it readily accessible, other than in the lock itself?
Andres wasn't thrilled about using even 2 bytes to identify the
LWLock, so he'll probably like having more bytes in there for that
kind of thing even less. And to be honest I wouldn't blame him. We
only need 12 bytes to implement the lock itself -- we can't justify
having more than a couple of additional bytes for debugging purposes.

On a broader level, I agree that being able to find out what the
system is doing is really important. But I'm also not entirely
convinced that having really fine-grained information here to
distinguish between one lock and another is the way to get there.
Personally, I've never run into a problem where I really needed to
know anything more than the tranche name. Like, I've seen problems
where, for example, we can see that there's a lot of contention on
SubtransSLRULock, or there are problems with WALInsertLock. But I can't
really see why I'd need to know which WALInsertLock was experiencing
contention. If we were speaking of buffer content locks, I suppose I
can imagine wanting more details, but it's not really the buffer
number I'd want to know. I'd want to know the database OID, the
relfilenode, the fork number, and the block number. You can argue that
we should just expose the buffer number and let the user sort out the
rest with dtrace/systemtap magic, but that makes it useless in
practice to an awful lot of people, including me.

-- 
Robert Haas
EDB: http://www.enterprisedb.com


