Sampling Profler for Postgres - Mailing list pgsql-hackers

From ITAGAKI Takahiro
Subject Sampling Profler for Postgres
Date
Msg-id 20090309125146.913C.52131E4D@oss.ntt.co.jp
Whole thread Raw
Responses Re: Sampling Profler for Postgres  ("Dickson S. Guedes" <listas@guedesoft.net>)
Re: Sampling Profler for Postgres  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Hello,

I think we need two types of profilers: SQL-based and resource-based.
We have some SQL-based profilers like slow-query logs
(log_min_duration_statement) and contrib/pg_stat_statements in 8.4.
For resource-based profilers, we have DTrace probes[1] and continue to
extend them[2], but unfortunately DTrace only works on Solaris and limited
platforms. Also, it is not so easy for typical users to write profilers
using DTrace without performance degradation.

[1] http://developer.postgresql.org/pgdocs/postgres/dynamic-trace.html
[2] http://archives.postgresql.org/pgsql-hackers/2009-03/msg00226.php


Therefore, I'd like to propose an profiler with sampling approach in 8.5.
The attached patch is an experimental model of the profiler.
Each backends reports its condtion in PgBackendStatus.st_condition
and the stats collector process does polling them every seconds.
This is an extension of the st_waiting field, which reports locking
condition in pg_stat_activity. There are some advantages in portability
and less overhead.

Consideration is needed about how to coexist with DTrace. I added codes to
push/pop conditions just on the same place as TRACE_POSTGRESQL_*_START/DONE().
So, we could merge the codes of DTrace and the profiler, or implement one of
them with another.

I would emphasize that an offical profler is required in this area
because it enables users to share knowledge and documentaions;
information-sharing would be difficult if they use home-made profilers.

Comments welcome.

----
Here is a sample output of the profiler with pgbench on Windows:

$ pgbench -i -s3
$ psql -c "SELECT pg_save_profiles()"
$ pgbench -c4 -T60 -n
transaction type: TPC-B (sort of)
tps = 401.510694

$ psql -c "SELECT * FROM pg_diff_profiles"
 profid |      profname      | percent
--------+--------------------+---------
     19 | XLog:Write         |   23.04    <- means wal contension
     46 | LWLock:WALWrite    |   23.04    <- same as the above
     32 | Lock:Transaction   |   22.61    <- confliction on row locks
     15 | Network:Recv       |    7.83
     21 | Data:Stat          |    4.35    <- lseek() is slow on Windows
      7 | CPU:Execute        |    3.91
      3 | CPU                |    3.91
      1 | Idle:InTransaction |    2.61
      5 | CPU:Rewrite        |    1.74
     16 | Network:Send       |    1.74
      6 | CPU:Plan           |    1.74
     31 | Lock:Tuple         |    1.74
      4 | CPU:Parse          |    0.87
     11 | CPU:Commit         |    0.87
(14 rows)

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

Attachment

pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: small parallel restore optimization
Next
From: Ryan Bradetich
Date:
Subject: Re: Out parameters handling