Re: Improvement of pg_stat_statement usage about buffer hit ratio - Mailing list pgsql-hackers

From Jeff Janes
Subject Re: Improvement of pg_stat_statement usage about buffer hit ratio
Date
Msg-id CAMkU=1xCuKvYzbT_bX-L+_Xe-Bo6C7bXXcne+=1LfCSc9maBHg@mail.gmail.com
Whole thread Raw
In response to Re: Improvement of pg_stat_statement usage about buffer hit ratio  (Peter Geoghegan <pg@heroku.com>)
Responses Re: Improvement of pg_stat_statement usage about buffer hit ratio  (Peter Geoghegan <pg@heroku.com>)
List pgsql-hackers
On Mon, Nov 18, 2013 at 10:56 AM, Peter Geoghegan <pg@heroku.com> wrote:
On Mon, Nov 18, 2013 at 10:49 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
> The same idea was proposed before but not committed because
> Itagaki thought that pg_stat_statements view should report only raw values.
> Please read the following thread. I have the same feeling with him.
> Anyway we should listen to more opinions from other people, though.
> http://www.postgresql.org/message-id/20091222172719.8B86.52131E4D@oss.ntt.co.jp

+1 from me.

That's +1 for *not* including this?  If so, I agree as well.  It would be easy enough to create a sql view that computes this if one wanted (unlike the min and max execution time, which can't be done externally).  I also have a theological opposition to exposing the buffer hit ratio.  There are two ways to improve the buffer hit ratio.  One is to have fewer misses, which is worthwhile but you can more easily do it by looking directly at the number of misses or misses per execution, rather than a buffer hit ratio. Or you can dilute out the misses by gratuitously increasing the "hits" by uselessly reading cached buffers over and over again, which is counter-productive and evil and perverse.  

Take a small to medium look-up table, drop all the indexes so it has to be full scanned all the time, and maybe rebuild it with a lower fillfactor (but not so low that it becomes too big to cache), and watch that buffer hit ratio go through the roof, while true performance goes the other way.

 
I think that a higher level tool needs to be built on
pg_stat_statements to make things easy for those that want a slick,
pre-packaged solution. As I said on the min/max thread, if we're not
doing enough to help people who would like to build such a tool, we
should discuss how we can do better.

I'd like to be able to separate queries by the application_name and/or client_addr which issued them.

Cheers,

Jeff

pgsql-hackers by date:

Previous
From: Merlin Moncure
Date:
Subject: Re: additional json functionality
Next
From: Bruce Momjian
Date:
Subject: Re: LISTEN / NOTIFY enhancement request for Postgresql