Re: contrib/pg_stat_statements 1202 - Mailing list pgsql-hackers

From Vladimir Sitnikov
Subject Re: contrib/pg_stat_statements 1202
Date
Msg-id 1d709ecc0812082356n6f7d0c95k180deeaa39f47dd@mail.gmail.com
Whole thread Raw
In response to Re: contrib/pg_stat_statements 1202  (ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp>)
List pgsql-hackers

> > 2. EXPLAIN ANALYZE VERBOSE shows buffer statistics in 'actual' section.
> >
> I do not get the point of "VERBOSE".
> As far as I understand, explain analyze (without verbose) will anyway add
> overhead for calculation of "gets/hits/cpu". Why discard that information in
> "non verbose" mode? Just to make the investigation more complex?

I thought that output of new counters are too wide and it brakes
compatibility of EXPLAIN ANALYZE. On the other hand, we don't have to
think of compatibility in EXPLAIN ANALYZE VERBOSE because it is newly
added in 8.4. However, overheads should be avoided. We could have
two kinds of instrumentations, time-only or all-stats.
I am not sure if this impact (in terms of compatibility) is really that big. As far as I could understand, pgAdmin parses modified explain analyze output well. However, pgAdmin does not support "verbose".

The impact in terms of "computational" overhead could be measured. I guess it depends only on the number of calls to the instrumentation (that is either rows or loops). Thus, after explaining some basic select from generate_series, one could adjust the timings. We could even incorporate this to EXPLAIN ANALYZE, so it would display "cpu is 0.5s, while instrumentation cpu is 0.3s".
 

Another idea is to have buffer and cpu counters not in Instrumentation
but in QueryDesc (i.e, only one per query). A new field for the counter
will be initialized in pg_stat_statements module. EXPLAIN ANALYZE could
also use the field but the result is printed only the end of query plan,
as the follows. We can avoid too-wide-line problem with the approach.
Single number per query is sufficient only for pg_stat_statements purposes. That will give an insight of what the top consuming queries are (by cpu time, by gets, etc).
However, single "gets=... reads=..." is not sufficient to pinpoint the problem especially in case of complex query (that is comparable to "query returned N rows" vs "this plan node returned N rows") .

 
=# EXPLAIN ANALYZE SELECT ...;
       QUERY PLAN
---------------------------
 ...
 Total runtime: 24.273 ms
 CPU usage: user=20.2 sys=2.4               #
 Buffer usage: gets=100 reads=10 writes=20  # added lines
 Tempfile i/o: reads=50 writes=50           #
(10 rows)
I wish pgAdmin (or whatever client) had an option to fetch that counters for each and every SQL query and display the consumed resources at a separate tab. I mean, even before/after plain "select" (without any explain). That will show you how the query would behave without any instrumentation.

Regards,
Vladimir Sitnikov

pgsql-hackers by date:

Previous
From: ITAGAKI Takahiro
Date:
Subject: Re: contrib/pg_stat_statements 1202
Next
From: "Fujii Masao"
Date:
Subject: Re: Sync Rep: First Thoughts on Code