Buffer usage in EXPLAIN and pg_stat_statements - Mailing list pgsql-hackers

From Itagaki Takahiro
Subject Buffer usage in EXPLAIN and pg_stat_statements
Date
Msg-id 20090817155032.9A28.52131E4D@oss.ntt.co.jp
Whole thread Raw
Responses Re: Buffer usage in EXPLAIN and pg_stat_statements  (Pavel Stehule <pavel.stehule@gmail.com>)
List pgsql-hackers
Here is a proposal to add buffer usage information to EXPLAIN and
contrib/pg_stat_statements. We can retrieve new values 'gets',
'reads' and 'temp':

    - gets  : total number of buffer pool access
    - reads : total number of data file access
    - temp  : total number of temp file access (sort)

In EXPLAIN, we can use "EXPLAIN (ANALYZE, BUFFER) ..." syntax.
Each executor node shows buffer usage only in it; parent nodes
don't contain buffer usages in their sub nodes.

In pg_stat_statements, new 3 columns are added to the view.
We can determine queries that consume I/O bandwidth using
"SELECT * FROM pg_stat_statements ORDER BY reads DESC".
We will find out bad queries easily using those buffer and
disk access information in addition to duration statistics.


I implementd this feature using an instrumentation stack. A global
variable CurrentInstrument points top of the stack, and each
Instrumentation are linked with newly added 'prev' field.
The stack must be reset even on error because each innstrumentation
might have been deallocated already. I added codes to reset stack
in main loop of backend for the purpose.

TopInstrument is a special node that sums up all of the child nodes.
It tracks QueryDesc.totaltime and used in pg_stat_statements. There
might be another idea that walking around on planstate tree and
gathering all counters in the module, but very complex codes are
needed. I chose a simple way.


I'll write documentations if this design is accepted.
Comments welcome.


Output samples are below:

[EXPLAIN]
=# EXPLAIN (ANALYZE, BUFFER) SELECT * FROM pgbench_accounts ORDER BY bid;
                                 QUERY PLAN
----------------------------------------------------------
 Sort  (cost=...) (actual ...) (gets=0 reads=0 temp=1309)
   Sort Key: bid
   Sort Method:  external sort  Disk: 10472kB
   ->  Seq Scan on pgbench_accounts  (cost=...) (actual ...) (gets=1798 reads=1644 temp=0)
 Total runtime: 75.867 ms

[contrib/pg_stat_statements]
=# SELECT query, gets, reads FROM pg_stat_statements ORDER BY gets DESC LIMIT 4;
                                query                                 | gets  | reads
----------------------------------------------------------------------+-------+-------
 UPDATE pgbench_branches SET bbalance = bbalance + $1 WHERE bid = $2; | 58628 |     1
 UPDATE pgbench_accounts SET abalance = abalance + $1 WHERE aid = $2; | 26999 |  1929
 UPDATE pgbench_tellers SET tbalance = tbalance + $1 WHERE tid = $2;  | 25474 |     1
 SELECT abalance FROM pgbench_accounts WHERE aid = $1;                | 19950 |     0
(4 rows)

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

Attachment

pgsql-hackers by date:

Previous
From: Michael Meskes
Date:
Subject: Re: Split-up ECPG patches
Next
From: Heikki Linnakangas
Date:
Subject: Re: hot standby - merged up to CVS HEAD