Re: [PATCH] Identify LWLocks in tracepoints - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: [PATCH] Identify LWLocks in tracepoints |
Date | |
Msg-id | CA+TgmoY2GtP_HV8DfX1GY-PayosZa1GZ_87O6S61ONzA1HEqvQ@mail.gmail.com |
In response to | Re: [PATCH] Identify LWLocks in tracepoints (Andres Freund <andres@anarazel.de>) |
Responses | Re: [PATCH] Identify LWLocks in tracepoints; Re: [PATCH] Identify LWLocks in tracepoints |
List | pgsql-hackers |
On Mon, Apr 12, 2021 at 11:06 PM Andres Freund <andres@anarazel.de> wrote:
> No, they have to be the same in each. Note how the tranche ID is part of
> struct LWLock. Which is why LWLockNewTrancheId() has to acquire a lock
> etc.

More precisely, if a tranche ID is defined in multiple backends, it needs to be defined the same way in all of them. But it is possible to have an extension loaded into some backends and not others, and have it define a tranche ID that other backends know nothing about.

Another point to note is that, originally, I had an idea that each tranche of lwlocks was situated in a single array somewhere in memory. Perhaps that was an array of something else, like buffer descriptors, and the lwlocks were just one element of the struct, or maybe it was an array specifically of LWLocks; but one way or the other, there was definitely one array that had all the LWLocks from that tranche in it. So before the commit in question -- 3761fe3c20bb040b15f0e8da58d824631da00caa -- T_ID() used to compute an offset for a lock within its tranche that was supposed to uniquely identify the lock.

However, the whole idea of an array per tranche turns out to be broken by design. Consider parallel query. You could, perhaps, arrange for all the LWLocks that a particular query needs to be in one tranche, and that's all fine. But what if there are multiple parallel contexts in existence at the same time? I think right now that may be impossible as a practical matter, since, for example, an SQL function that is called by a parallel query is supposed to run any SQL statements inside of it without parallelism. But that's basically a policy decision; there's nothing in the parallel context machinery itself which prevents multiple parallel contexts from being active at the same time. And if that happens, you'd have multiple arrays with the same tranche ID, so how do you identify the locks then? The pre-3761fe3c20bb040b15f0e8da58d824631da00caa data structure doesn't work, because it has only one place to store an array base; but having multiple places to store an array base doesn't fix it either, because then you've just given the same identifier to multiple locks.

You could maybe fix it by putting a limit on how many parallel contexts can be open at the same time, and then having N copies of each parallelism-related tranche. But that seems ugly and messy and a burden on extension authors, and not really what anybody wants. You could try to identify locks by pointer addresses, but that's got security hazards, and the addresses aren't portable across all the backends involved in the parallel query because of how DSM works, so it's not really that helpful in terms of matching stuff up. You could identify every lock by a tranche ID + an array offset + a "tranche instance ID". But where would you store the tranche instance ID to make it readily accessible, other than in the lock itself? Andres wasn't thrilled about using even 2 bytes to identify the LWLock, so he'll probably like having more bytes in there for that kind of thing even less. And to be honest, I wouldn't blame him. We only need 12 bytes to implement the lock itself -- we can't justify having more than a couple of additional bytes for debugging purposes.
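For concreteness, the shared part of the lock is currently just this (quoting roughly from src/include/storage/lwlock.h):

    typedef struct LWLock
    {
        uint16      tranche;        /* tranche ID */
        pg_atomic_uint32 state;     /* state of exclusive/nonexclusive lockers */
        proclist_head waiters;      /* list of waiting PGPROCs */
    #ifdef LOCK_DEBUG
        pg_atomic_uint32 nwaiters;  /* number of waiters */
        struct PGPROC *owner;       /* last exclusive owner of the lock */
    #endif
    } LWLock;

The state word and the wait list are the 12 bytes that actually implement the lock; the tranche field is the 2 bytes of identification we already pay for in every lock in the system.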
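And to spell out the registration point at the top of this mail, the usual dance for an extension looks roughly like the sketch below. The API calls are the real ones from lwlock.h and shmem.h, but all the "my_*" names are made up for illustration:

    #include "postgres.h"
    #include "storage/lwlock.h"
    #include "storage/shmem.h"

    /* Hypothetical extension state; all "my_*" names are invented. */
    typedef struct MySharedState
    {
        int     tranche_id;     /* must mean the same thing in every backend */
        LWLock  lock;
    } MySharedState;

    static MySharedState *my_state;

    /* Typically installed via shmem_startup_hook in _PG_init(). */
    static void
    my_shmem_startup(void)
    {
        bool    found;

        my_state = ShmemInitStruct("my_ext_state", sizeof(MySharedState), &found);
        if (!found)
        {
            /* First backend: allocate an ID (this takes a lock internally). */
            my_state->tranche_id = LWLockNewTrancheId();
            LWLockInitialize(&my_state->lock, my_state->tranche_id);
        }

        /*
         * Every backend that has the extension loaded associates the same ID
         * with the same name. Backends that never load the extension simply
         * know nothing about this tranche.
         */
        LWLockRegisterTranche(my_state->tranche_id, "my_ext_tranche");
    }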
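For comparison, the one-array-per-tranche assumption that 3761fe3c20bb040b15f0e8da58d824631da00caa removed looked roughly like this -- I'm reconstructing from memory, so don't hold me to the details:

    /* Pre-3761fe3c, each registered tranche carried a base and a stride... */
    typedef struct LWLockTranche
    {
        const char *name;
        void       *array_base;     /* start of THE array for this tranche */
        Size        array_stride;   /* spacing of locks within that array */
    } LWLockTranche;

    /* ...and T_ID() turned a lock pointer into an offset in that array. */
    #define T_ID(lock) \
        ((int) ((((char *) (lock)) - \
                 ((char *) LWLockTrancheArray[(lock)->tranche]->array_base)) / \
                LWLockTrancheArray[(lock)->tranche]->array_stride))

With two parallel contexts each mapping their own copy of a tranche's array, there is no single array_base that makes that computation well-defined, which is exactly the problem described above.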
On a broader level, I agree that being able to find out what the system is doing is really important. But I'm also not entirely convinced that having really fine-grained information here to distinguish between one lock and another is the way to get there.

Personally, I've never run into a problem where I really needed to know anything more than the tranche name. Like, I've seen problems where, for example, there's a lot of contention on SubtransSLRULock, or problems with WALInsertLock. But I can't really see why I'd need to know which WALInsertLock was experiencing contention. If we were speaking of buffer content locks, I suppose I can imagine wanting more details, but it's not really the buffer number I'd want to know. I'd want to know the database OID, the relfilenode, the fork number, and the block number. You can argue that we should just expose the buffer number and let the user sort out the rest with dtrace/systemtap magic, but that makes it useless in practice to an awful lot of people, including me.
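Just to make that concrete: if you somehow know that a given LWLock is a buffer content lock, the kind of resolution I mean is roughly the sketch below. The field names are as in src/include/storage/buf_internals.h, but the helper function itself is made up:

    #include "postgres.h"
    #include "storage/buf_internals.h"

    /* Hypothetical helper: report what a buffer content lock protects. */
    static void
    describe_buffer_content_lock(LWLock *lock)
    {
        BufferDesc *buf = (BufferDesc *)
            ((char *) lock - offsetof(BufferDesc, content_lock));

        elog(LOG, "content lock for db %u, relfilenode %u, fork %d, block %u",
             buf->tag.rnode.dbNode,
             buf->tag.rnode.relNode,
             buf->tag.forkNum,
             buf->tag.blockNum);
    }

That's the level of detail that would actually let somebody connect contention back to a relation, without any dtrace/systemtap gymnastics.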
--
Robert Haas
EDB: http://www.enterprisedb.com