Re: Dynamic LWLock tracing via pg_stat_lwlock (proof of concept) - Mailing list pgsql-hackers

From: Bruce Momjian
Subject: Re: Dynamic LWLock tracing via pg_stat_lwlock (proof of concept)
Date: 2014-10-03 15:33:18
Msg-id: 20141003153318.GA1561@momjian.us
In response to: Re: Dynamic LWLock tracing via pg_stat_lwlock (proof of concept) (Andres Freund <andres@2ndquadrant.com>)
List: pgsql-hackers
On Thu, Oct  2, 2014 at 11:50:14AM +0200, Andres Freund wrote:
> The first problem that comes to my mind about collecting enough data is
> that we have a very large number of lwlocks (fixed_number + 2 *
> shared_buffers). One 'trivial' way of implementing this is to have a per
> backend array collecting the information, and then a shared one
> accumulating data from it over time. But I'm afraid that's not going to
> fly :(. Hm. With the above sets of stats that'd be ~50MB per backend...
> 
> Perhaps we should somehow encode this different for individual lwlock
> tranches? It's far less problematic to collect all this information for
> all but the buffer lwlocks...

First, I think this could be a major Postgres feature, and I am excited
someone is working on this.

As far as gathering data goes, I don't think we are going to do any
better in terms of performance/simplicity/reliability than to have a
single PGPROC entry that records when we enter/exit a lock, and a
secondary process that scans the PGPROC array periodically.
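
To make the mechanics concrete, here is a minimal standalone sketch
(my own illustration, not PostgreSQL source; SamplerProc, cur_lwlock,
and the function names are made up).  The backend's only job is a
single atomic store on acquire and on release:

    #include <stdatomic.h>

    #define NO_LWLOCK (-1)          /* sentinel: no lock currently held */

    typedef struct SamplerProc
    {
        atomic_int cur_lwlock;      /* id of the lwlock this backend holds */
    } SamplerProc;

    /* Acquire side: publish the lock id with one relaxed atomic store. */
    static inline void
    record_lwlock_acquire(SamplerProc *proc, int lockid)
    {
        atomic_store_explicit(&proc->cur_lwlock, lockid,
                              memory_order_relaxed);
    }

    /* Release side: clear the slot, again a single store. */
    static inline void
    record_lwlock_release(SamplerProc *proc)
    {
        atomic_store_explicit(&proc->cur_lwlock, NO_LWLOCK,
                              memory_order_relaxed);
    }

A single slot can only advertise one lock at a time; tracking nested
acquisitions would need a small per-backend array, but one slot is the
simplest form of the single-PGPROC-entry idea above.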

That gives us almost zero overhead on the backends, high reliability,
and the ability of the scan daemon to give higher weight to locks that
are held longer.  Basically, if we just recorded the locks we acquired
and released, we would either have to add timing overhead to the
backends or collect no timing information at all.  By sampling active
locks instead, a short-lived lock might not be seen at all, while a
longer-held lock might be seen by multiple scans, so the sample counts
weight each lock by its hold time at almost zero cost.  If we want
finer-grained lock statistics, we just increase the number of scans per
second.
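
Continuing the sketch from above, the scanning side could look roughly
like this (again purely illustrative; the type is repeated so the
fragment stands alone, and MAX_BACKENDS/NUM_LWLOCKS are arbitrary
placeholders).  A lock held for T seconds is expected to be sampled
about T * scans_per_second times, which is where the duration
weighting comes from:

    #include <stdatomic.h>
    #include <time.h>

    #define NO_LWLOCK    (-1)
    #define MAX_BACKENDS 128
    #define NUM_LWLOCKS  1024

    typedef struct SamplerProc { atomic_int cur_lwlock; } SamplerProc;

    SamplerProc procs[MAX_BACKENDS];  /* shared; every slot must be set
                                       * to NO_LWLOCK at startup */
    static long sample_counts[NUM_LWLOCKS];  /* daemon-private */

    static void
    sample_once(void)
    {
        for (int i = 0; i < MAX_BACKENDS; i++)
        {
            int lockid = atomic_load_explicit(&procs[i].cur_lwlock,
                                              memory_order_relaxed);

            /* one sample ~= one scan interval of hold time */
            if (lockid != NO_LWLOCK)
                sample_counts[lockid]++;
        }
    }

    static void
    sampler_loop(int scans_per_second)   /* assumes value > 1 */
    {
        struct timespec ts = {0, 1000000000L / scans_per_second};

        for (;;)
        {
            sample_once();
            nanosleep(&ts, NULL);
        }
    }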

I am assuming almost no one cares about the raw number of lock
acquisitions; what they care about is cumulative lock durations.
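
To put rough numbers on that (my example, assuming 100 scans per
second): a lock whose acquisitions total 2 seconds of hold time in an
interval should show up in about 200 samples, so samples /
scans_per_second recovers the cumulative duration, while a lock
acquired thousands of times but held only microseconds each time would
barely register, which is exactly the weighting we want.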

I am having trouble seeing any other option that has such a good
cost/benefit profile.

-- 
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + Everyone has their own god. +


