pg_stat_lwlocks view - lwlocks statistics - Mailing list pgsql-hackers

From Satoshi Nagayasu
Subject pg_stat_lwlocks view - lwlocks statistics
Date
Msg-id 4FE8CA2C.3030809@uptime.jp
Whole thread Raw
Responses Re: pg_stat_lwlocks view - lwlocks statistics
Re: pg_stat_lwlocks view - lwlocks statistics
List pgsql-hackers
Hi all,

I've been working on a new system view, pg_stat_lwlocks, to observe
LWLock, and just completed my 'proof-of-concept' code that can work
with version 9.1.

Now, I'd like to know the possibility of this feature for future
release.

With this patch, DBA can easily determine a bottleneck around lwlocks.
--------------------------------------------------
postgres=# SELECT * FROM pg_stat_lwlocks ORDER BY time_ms DESC LIMIT 10;
 lwlockid | calls  | waits | time_ms
----------+--------+-------+---------
       49 | 193326 | 32096 |   23688
        8 |   3305 |   133 |    1335
        2 |     21 |     0 |       0
        4 | 135188 |     0 |       0
        5 |  57935 |     0 |       0
        6 |    141 |     0 |       0
        7 |  24580 |     1 |       0
        3 |   3282 |     0 |       0
        1 |     41 |     0 |       0
        9 |      3 |     0 |       0
(10 rows)

postgres=#
--------------------------------------------------

In this view,
  'lwlockid' column represents LWLockId used in the backends.
  'calls' represents how many times LWLockAcquire() was called.
  'waits' represents how many times LWLockAcquire() needed to wait
  within it before lock acquisition.
  'time_ms' represents how long LWLockAcquire() totally waited on
  a lwlock.

And lwlocks that use a LWLockId range, such as BufMappingLock or
LockMgrLock, would be grouped and summed up in a single record.
For example, lwlockid 49 in the above view represents LockMgrLock
statistics.

Now, I know there are some considerations.

(1) Performance

  I've measured LWLock performance both with and without the patch,
  and confirmed that this patch does not affect the LWLock perfomance
  at all.

  pgbench scores with the patch:
    tps = 900.906658 (excluding connections establishing)
    tps = 908.528422 (excluding connections establishing)
    tps = 903.900977 (excluding connections establishing)
    tps = 910.470595 (excluding connections establishing)
    tps = 909.685396 (excluding connections establishing)

  pgbench scores without the patch:
    tps = 909.096785 (excluding connections establishing)
    tps = 894.868712 (excluding connections establishing)
    tps = 910.074669 (excluding connections establishing)
    tps = 904.022770 (excluding connections establishing)
    tps = 895.673830 (excluding connections establishing)

  Of course, this experiment was not I/O bound, and the cache hit ratio
  was >99.9%.

(2) Memory space

  In this patch, I added three new members to LWLock structure
  as uint64 to collect statistics.

  It means that those members must be held in the shared memory,
  but I'm not sure whether it's appropriate.

  I think another possible option is holding those statistics
  values in local (backend) process memory, and send them through
  the stat collector process (like other statistics values).

(3) LWLock names (or labels)

  Now, pg_stat_lwlocks view shows LWLockId itself. But LWLockId is
  not easy for DBA to determine actual lock type.

  So, I want to show LWLock names (or labels), like 'WALWriteLock'
  or 'LockMgrLock', but how should I implement it?

Any comments?

Regards,
--
Satoshi Nagayasu <snaga@uptime.jp>
Uptime Technologies, LLC. http://www.uptime.jp

Attachment

pgsql-hackers by date:

Previous
From: "ktm@rice.edu"
Date:
Subject: Re: libpq compression
Next
From: Euler Taveira
Date:
Subject: Re: libpq compression