Re: Adding wait events statistics - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Adding wait events statistics
Date
Msg-id CA+TgmobptuUWo7X5zcQrWKh22qeAn4eL+=wtb8_ajCOR+7_tcw@mail.gmail.com
In response to Re: Adding wait events statistics  (Bertrand Drouvot <bertranddrouvot.pg@gmail.com>)
Responses Re: Adding wait events statistics
List pgsql-hackers
On Tue, Jul 22, 2025 at 8:24 AM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
> So based on the cycles metric I think it looks pretty safe to implement for the
> whole majority of classes.

I'm not convinced that this is cheap enough to implement, and I
don't understand the value proposition, either. I see the first couple
of messages in the thread say that this is important for
troubleshooting problems in "our user base" (not sure about the
antecedent of "our") but the description of what those problems are
seems pretty vague to me. I've tried counting wait events of a
particular type and the resulting data was extremely misleading and
therefore useless.

I think it's probably a mistake to even be thinking about this in
terms of wait events. It seems very reasonable to want more data about
what PostgreSQL backends are doing, but I don't really see any reason
why that should happen to line up with whatever wait events do. For
instance, if you asked "what information is useful to gather about
heavyweight locks?" you might say "well, we'd like to know how many
times we tried to acquire one, and how many of those times we had to
wait, and how many of those times we waited for longer than
deadlock_timeout". And then you could design a facility to answer
those questions. Or you might say "we'd like a history of all the
different heavyweight locks that a certain backend has tried to
acquire," and then you could design a tracing facility to give you
that. Either of those hypothetical facilities would involve providing more
information than you would get from just counting wait events, or
counting+timing wait events, or recording a complete history of all
wait events.

And I would say that, for more or less that exact reason, there's a
real use case for either of them. Maybe either or both are valuable
enough to add and maybe they are not, and maybe the overhead is
acceptable or maybe it isn't, but I think the arguments are much
better than for a facility that just counts and times all the wait
events. For instance, the former facility would let you answer the
question "what percentage of lock acquisitions are contended?" whereas
a pure wait event count just lets you answer the question "how many
contended lock acquisitions were there?". The latter isn't useless,
but the former is a lot better, particularly since, as I proposed it,
you can also judge how many of those contention events involved a
brief wait vs. a long one.
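
To make the comparison concrete, here is a minimal Python sketch of the
three counters described above (attempts, waits, and waits longer than
deadlock_timeout). It is an illustration only, not PostgreSQL code: the
class LockAcquireStats and its methods are hypothetical names, and the
wait durations fed into it are made up. The one real-world detail is the
1s value, which is the default for deadlock_timeout.

```python
from dataclasses import dataclass


@dataclass
class LockAcquireStats:
    """Hypothetical per-backend counters for heavyweight lock acquisition."""
    deadlock_timeout: float = 1.0  # seconds; mirrors the default setting
    attempts: int = 0              # every acquisition attempt
    waits: int = 0                 # attempts that had to wait at all
    long_waits: int = 0            # waits longer than deadlock_timeout

    def record(self, waited_seconds: float) -> None:
        """Record one acquisition; waited_seconds == 0 means uncontended."""
        self.attempts += 1
        if waited_seconds > 0:
            self.waits += 1
            if waited_seconds > self.deadlock_timeout:
                self.long_waits += 1

    def contended_fraction(self) -> float:
        """Share of acquisitions that had to wait. This needs `attempts`,
        which a pure wait event count does not provide."""
        return self.waits / self.attempts if self.attempts else 0.0


stats = LockAcquireStats()
for waited in (0.0, 0.0, 0.0, 0.2, 3.5, 0.0, 0.0, 0.0):  # made-up durations
    stats.record(waited)

print(f"attempts={stats.attempts} waits={stats.waits} "
      f"long_waits={stats.long_waits} "
      f"contended={stats.contended_fraction():.0%}")
```

With only a wait event count, the same workload would yield the number
of waits and nothing to divide it by, so the contended percentage and the
brief vs. long split would both be unavailable.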

But if, on the other hand, you look at LWLocks, it's a totally
different situation, IMHO. I'm almost sure that measuring LWLock wait
times is going to be too expensive to be practical, and I don't really
see why you'd want to: the right approach is sampling, which is cheap
and in my experience highly effective. Measuring counts doesn't seem
very useful either: knowing the number of times that somebody tried to
acquire a relation lock or a tuple lock arguably tells you something
about your workload that you might want to know, whereas I would argue
that knowing the number of times that somebody tried to acquire a
buffer lock doesn't really tell you anything at all. What you might
want to know is how many buffers you accessed, which is why we've
already got a system for tracking that. That's not to say that there's
nothing at all that somebody could want to know about LWLocks that you
can't already find out today: for example, a system that identifies
which buffers are experiencing significant buffer lock contention, by
relation OID and block number, sounds quite handy. But just counting
wait events, or counting and timing them, will not give you that.
Knowing which SLRUs are being heavily used could also be useful, but I
think there's a good argument that having some instrumentation that
cuts across SLRU types and exposes a bunch of useful numbers for each
could be more useful than just hoping you can figure it out from
LWLock wait event counts.
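
As a rough illustration of the sampling approach mentioned above, the
following Python sketch aggregates periodic samples into a wait profile.
Everything in it is an assumption made for illustration: the samples are
hard-coded stand-ins for what one might collect by polling the
wait_event_type and wait_event columns of pg_stat_activity at a fixed
interval, the event names are just example values, and wait_profile is a
hypothetical helper.

```python
from collections import Counter

# Hypothetical samples: each inner list is one polling round, each tuple
# one active backend's (wait_event_type, wait_event); None means on-CPU.
samples = [
    [("LWLock", "BufferContent"), (None, None), ("IO", "DataFileRead")],
    [("LWLock", "BufferContent"), ("LWLock", "WALWrite"), (None, None)],
    [(None, None), ("LWLock", "BufferContent"), ("Lock", "relation")],
    [("IO", "DataFileRead"), (None, None), (None, None)],
]


def wait_profile(rounds):
    """Return {event: fraction of all backend samples seen in that state}."""
    counts = Counter(obs for rnd in rounds for obs in rnd)
    total = sum(counts.values())
    return {event: n / total for event, n in counts.items()}


profile = wait_profile(samples)
# Print the most frequently observed states first.
for event, share in sorted(profile.items(), key=lambda kv: -kv[1]):
    label = "on-CPU" if event == (None, None) else f"{event[0]}:{event[1]}"
    print(f"{label:25s} {share:.0%}")
```

The fraction of samples in which an event shows up approximates the
share of backend time spent in it, with no per-acquisition timing code
in the hot path. It says nothing about how many acquisitions there were.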

In short, I feel like just counting, or counting+timing, all the wait
events is painting with too broad a brush. Wait events get inserted
for a specific purpose: so you can know why a certain backend is
off-CPU without having to use debugging tools. They serve that purpose
pretty well, but that doesn't mean that they serve other purposes
well, and I find it kind of hard to see the argument that just
sticking a bunch of counters or timers in the same places where we put
the wait event calls would be the right thing in general.
Introspection is an important topic and, IMHO, deserves much more
specific and nuanced thought about what we're trying to accomplish and
how we're going about it.

--
Robert Haas
EDB: http://www.enterprisedb.com


