Thread: stored procedure stats in collector

stored procedure stats in collector

From

Martin Pihlak

Date:

13 July 2007, 11:18:15

Howdy,

I'm working on a patch to extend the stats collector to handle stored procedure
statistics (call counts, duration etc). The goal is to make this information
visible via pg_stat functions/views. The collection would be controllable via
"stats_function_level" GUC and will have minimal overhead when turned off. At
our company we depend heavily on stored procedures and such a tool would be of
great help. Perhaps others could also find it somewhat useful.

So far I have a "working" patch against 8.2.4 (attached) that places some
instrumentation in the executor (ExecMakeFunctionResult,
ExecMakeFunctionResultNoSets and ExecMakeTableFunctionResult) and send the
stats to the collector (handled similarly to table stats).  The results are
visible through pg_stat_user_functions:


martinp=# set stats_function_level = on;
SET
Time: 0.172 ms

martinp=# select count(f3()) from generate_series(1, 1000);
  count
-------
   1000
(1 row)

Time: 40059.713 ms

martinp=# select * from pg_stat_user_functions ;
  procid | schemaname | procname | nargs | calls | total_time | total_cpu | self_time | self_cpu
--------+------------+----------+-------+-------+------------+-----------+-----------+----------
   16388 | public     | f1       |     0 |  4000 |      14978 |      8352 |     14978 |     8352
   16389 | public     | f2       |     0 |  2000 |      40044 |      8364 |     25066 |       12
   16390 | public     | f3       |     0 |  1000 |      40054 |      8364 |         9 |        0
(3 rows)

Time units are in milliseconds. Only functions with oid >= FirstNormalObjectId
are accounted.

This is of course still very experimental - some work is required to get this
into production shape. I was hoping to get some feedback before I continue
though. Maybe there are some obvious flaws that I'm not seeing, maybe something
needs to be changed to make this more generally useful. Feedback is greatly
appreciated.

Regards,
Martin

Attachment

8.2.4-procstat.patch.gz

Re: stored procedure stats in collector

From

Neil Conway

Date:

13 July 2007, 18:13:38

On Fri, 2007-07-13 at 14:11 +0300, Martin Pihlak wrote:
> I'm working on a patch to extend the stats collector to handle stored procedure
> statistics (call counts, duration etc). The goal is to make this information
> visible via pg_stat functions/views. The collection would be controllable via
> "stats_function_level" GUC and will have minimal overhead when turned off. At
> our company we depend heavily on stored procedures and such a tool would be of
> great help. Perhaps others could also find it somewhat useful.

Very cool, certainly sounds like a useful feature to me.

> martinp=# select * from pg_stat_user_functions ;
>   procid | schemaname | procname | nargs | calls | total_time | total_cpu | self_time | self_cpu
> --------+------------+----------+-------+-------+------------+-----------+-----------+----------
>    16388 | public     | f1       |     0 |  4000 |      14978 |      8352 |     14978 |     8352
>    16389 | public     | f2       |     0 |  2000 |      40044 |      8364 |     25066 |       12
>    16390 | public     | f3       |     0 |  1000 |      40054 |      8364 |         9 |        0
> (3 rows)

(schemaname, procname, nargs) is still ambiguous in the face of function
overloading. Although the presence of procid uniquely identifies each
function anyway, if you're going to include the name and argument
information, it might be worth including the argument types as well (as
an array of regtype, perhaps).

> Only functions with oid >= FirstNormalObjectId are accounted.

Sounds reasonable to me; adding profiling to every DirectFunctionCall
invocation is likely to be too expensive anyway.

>From looking quickly at the patch, I don't think the current coding
handles set-returning functions (ExecMakeTableFunctionResult).

-Neil

Re: stored procedure stats in collector

From

Tom Lane

Date:

16 July 2007, 01:54:16

Martin Pihlak <martin.pihlak@gmail.com> writes:
> I'm working on a patch to extend the stats collector to handle stored procedure
> statistics (call counts, duration etc).
> ...
> Only functions with oid >= FirstNormalObjectId are accounted.

I really dislike that approach to deciding which functions to count.
The main problem with it is that it will try to count C-language
functions that are added after initdb, such as contrib stuff and
third-party add-ons like postgis.  The percentage overhead for a
typical short C function will be large, and I'm not sure anything
much is to be gained by counting these.

I think a more reasonable approach would be to count PL-language
functions; which in turn suggests that the instrumentation hooks
should be in the PL call handlers, not in ExecMakeFunctionResult
and friends where they will drag down performance of all functions.

BTW, I dunno if you've thought about the implications of inlining
of SQL functions ... it's hard to see how to count those reasonably.
        regards, tom lane

Re: stored procedure stats in collector

From

Martin Pihlak

Date:

16 July 2007, 12:31:24

Neil Conway wrote:
> 
> (schemaname, procname, nargs) is still ambiguous in the face of function
> overloading. Although the presence of procid uniquely identifies each
> function anyway, if you're going to include the name and argument
> information, it might be worth including the argument types as well (as
> an array of regtype, perhaps).
> 

This is true. I was being a bit selfish here - in our environment we don't
use overloading that much - schema, name and nargs are usually sufficient.
It is also convinient to be able to do "select * from pg..." and have the
output not to wrap around.

>>From looking quickly at the patch, I don't think the current coding
> handles set-returning functions (ExecMakeTableFunctionResult).
> 

Hmm, should be handled. The number of tuples returned is not counted though,
perhaps this is another thing to add ...

Regards,
Martin

Re: stored procedure stats in collector

From

Martin Pihlak

Date:

16 July 2007, 16:10:53

Tom Lane wrote:
> I really dislike that approach to deciding which functions to count.
> The main problem with it is that it will try to count C-language
> functions that are added after initdb, such as contrib stuff and
> third-party add-ons like postgis.  The percentage overhead for a
> typical short C function will be large, and I'm not sure anything
> much is to be gained by counting these.
> 
Agreed, it is a bit ugly. In the initial version there was no such
exclusion. This came only after I "accidentally" activated the
collection on a busy server. Checked cpu usage some half an hour later
and immediately disabled it. System mode consumption was unacceptably
high, but the results indicated that something funny was going
on inside application as well - some internal functions such as texteq()
and now() had unreasonably high call counts. I'm still investigating
those issues, but maybe it might be useful to have stats for internal
and C language functions as well?

Perhaps it is possible to make the collection configurable per language.
So that for instance, default is to only collect pl languages, with the
option to add C or internal. Not sure how to approach this though.

> I think a more reasonable approach would be to count PL-language
> functions; which in turn suggests that the instrumentation hooks
> should be in the PL call handlers, not in ExecMakeFunctionResult
> and friends where they will drag down performance of all functions.
> 
It would be great if all the procedural languages would go through a
single entry point. Right now the more exotic PL-s would need separate
patching. But indeed, this approach would be less intrusive.

> BTW, I dunno if you've thought about the implications of inlining
> of SQL functions ... it's hard to see how to count those reasonably.
> 
Yes, this creates some inconsistencies in what is collected and what not.
Unless, of course, only PL functions are counted :)

Regards,
Martin

Re: stored procedure stats in collector

From

Martin Pihlak

Date:

20 September 2007, 13:09:26

Howdy,

Here's an updated version of the function stats patch. The biggest change is
that by default only procedural language functions are counted. The GUC variable
stats_function_level now takes 3 values: on, off and all. The latter also
counts SQL and C language functions, "on" means only to count procedural
language functions. The decision is now made in fmgr_info_cxt_security()
Trigger functions are now also counted.

Sample output:

select procname, calls, total_time, total_cpu, self_time, self_cpu
from pg_stat_user_functions order by self_cpu desc limit 5;

       procname      | calls | total_time | total_cpu | self_time | self_cpu
--------------------+-------+------------+-----------+-----------+----------
  next_batch         | 32765 |      27139 |      8574 |     27139 |     8574
  fetch_batch_events |  3636 |       9252 |      5622 |      3771 |     2717
  batch_event_sql    |  3636 |       5454 |      2888 |      3910 |     1962
  finish_batch       |  3420 |       3215 |      1475 |      3215 |     1475
  batch_event_tables |  3636 |       1448 |       865 |      1434 |      858
(5 rows)

This is still 8.2 only (tested on 8.2.4 and 8.2.5), has seen some production
usage here at Skype (about a month on reasonably busy boxes). So far so good.

Couple of issues:
- sometimes self_cpu > self_time - currently blaming it on Linux version of
gettimeofday().
- dropped functions are not purged from stats, might bloat the stats file for
some usage patterns.

PS. Would something like this be a canditate for 8.4 inclusion (if polished up)?

Regards,
Martin

Attachment

8.2-procstat.patch.gz

Re: stored procedure stats in collector

From

Neil Conway

Date:

21 September 2007, 00:34:48

On Thu, 2007-09-20 at 16:08 +0300, Martin Pihlak wrote:
> The GUC variable stats_function_level now takes 3 values: on, off and all.

That seems a confusing set of values. Perhaps "off", "pl", and "all"
would be clearer?

I'm curious if you've measured the performance overhead of enabling this
functionality.

> PS. Would something like this be a canditate for 8.4 inclusion (if polished up)?

It sounds like a useful feature to me.

-Neil

Re: stored procedure stats in collector

From

Martin Pihlak

Date:

24 September 2007, 13:10:48

Neil Conway wrote:
> On Thu, 2007-09-20 at 16:08 +0300, Martin Pihlak wrote:
>> The GUC variable stats_function_level now takes 3 values: on, off and all.
> 
> That seems a confusing set of values. Perhaps "off", "pl", and "all"
> would be clearer?

Makes sense. It appears that the stats_ prefixed GUC names are deprecated now.
Will rename to "track_functions" and change values to "off", "pl" and
"all". Or should I use "none" instead of "off"?

> I'm curious if you've measured the performance overhead of enabling this
> functionality.
> 

Just finished a simple benchmark comparing unpatched, "off" and "all":

unpatched usr    sys    total
dummy     21.36  0.68   22.04
rfact     51.52  0.07   51.59
ifact     20.40  0.06   20.45

"off"     usr    sys    total  diff x
dummy     21.59  0.39   21.97  1.00
rfact     52.40  0.06   52.46  1.02
ifact     20.78  0.02   20.8   1.02

"all"     usr    sys    total  diff x
dummy     21.69  35.18  56.87  2.58
rfact     47.83  38.99  86.83  1.68
ifact     20.65  0.36   21.01  1.03

dummy is blank pl/pgsql function (ran 10000000 times). rfact is recursive factorial,
ifact is iterative (both ran 1000000 times). Comparing median of 3 runs. The usr and
sys values are from log_executor_stats.

It seems that the overhead is unnoticeable if disabled, very visible for lightweight
functions and heavy callers. Almost unnoticeable for more compute intensive functions.

>> PS. Would something like this be a canditate for 8.4 inclusion (if polished up)?
> 
> It sounds like a useful feature to me.
> 
Good. I'll bring the patch up to date with HEAD.

Regards,
Martin

Re: stored procedure stats in collector

From

Tom Lane

Date:

24 September 2007, 15:37:08

Martin Pihlak <martin.pihlak@gmail.com> writes:
> Neil Conway wrote:
>> That seems a confusing set of values. Perhaps "off", "pl", and "all"
>> would be clearer?

> Makes sense. It appears that the stats_ prefixed GUC names are deprecated now.
> Will rename to "track_functions" and change values to "off", "pl" and
> "all". Or should I use "none" instead of "off"?

"None" seems good, by analogy with log_statement's values.

> It seems that the overhead is unnoticeable if disabled, very visible
> for lightweight functions and heavy callers. Almost unnoticeable for
> more compute intensive functions.

The very high system times seem odd.  Maybe you have a machine with
slow gettimeofday()?
        regards, tom lane

Re: stored procedure stats in collector

From

Martin Pihlak

Date:

24 September 2007, 17:08:21

>> It seems that the overhead is unnoticeable if disabled, very visible
>> for lightweight functions and heavy callers. Almost unnoticeable for
>> more compute intensive functions.
> 
> The very high system times seem odd.  Maybe you have a machine with
> slow gettimeofday()?
> 

So it seems (on Ubuntu 7.04). Reran the benchmark on another system and
got more reasonable results:

"none"  usr      sys    total
dummy   15.9     0.68   16.58
rfact   39.37    0.07   39.44
ifact   12.89    0.01   12.9

"all"   usr      sys    total    diff x
dummy   20.82    4.44   25.26    1.52
rfact   45.17    4.17   49.34    1.25
ifact   12.92    0.03   12.96    1.00

Regards,
Martin

Re: stored procedure stats in collector

From

Bruce Momjian

Date:

26 September 2007, 10:42:26

This has been saved for the 8.4 release:
http://momjian.postgresql.org/cgi-bin/pgpatches_hold

---------------------------------------------------------------------------

Martin Pihlak wrote:
> Howdy,
> 
> Here's an updated version of the function stats patch. The biggest change is
> that by default only procedural language functions are counted. The GUC variable
> stats_function_level now takes 3 values: on, off and all. The latter also
> counts SQL and C language functions, "on" means only to count procedural
> language functions. The decision is now made in fmgr_info_cxt_security()
> Trigger functions are now also counted.
> 
> Sample output:
> 
> select procname, calls, total_time, total_cpu, self_time, self_cpu
> from pg_stat_user_functions order by self_cpu desc limit 5;
> 
>        procname      | calls | total_time | total_cpu | self_time | self_cpu
> --------------------+-------+------------+-----------+-----------+----------
>   next_batch         | 32765 |      27139 |      8574 |     27139 |     8574
>   fetch_batch_events |  3636 |       9252 |      5622 |      3771 |     2717
>   batch_event_sql    |  3636 |       5454 |      2888 |      3910 |     1962
>   finish_batch       |  3420 |       3215 |      1475 |      3215 |     1475
>   batch_event_tables |  3636 |       1448 |       865 |      1434 |      858
> (5 rows)
> 
> This is still 8.2 only (tested on 8.2.4 and 8.2.5), has seen some production
> usage here at Skype (about a month on reasonably busy boxes). So far so good.
> 
> Couple of issues:
> - sometimes self_cpu > self_time - currently blaming it on Linux version of
> gettimeofday().
> - dropped functions are not purged from stats, might bloat the stats file for
> some usage patterns.
> 
> PS. Would something like this be a canditate for 8.4 inclusion (if polished up)?
> 
> Regards,
> Martin
> 

[ application/x-gzip is not supported, skipping... ]

> 
> ---------------------------(end of broadcast)---------------------------
> TIP 7: You can help support the PostgreSQL project by donating at
> 
>                 http://www.postgresql.org/about/donate

--  Bruce Momjian  <bruce@momjian.us>          http://momjian.us EnterpriseDB
http://www.enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +

Re: stored procedure stats in collector

From

hubert depesz lubaczewski

Date:

26 September 2007, 12:24:21

On Thu, Sep 20, 2007 at 05:34:32PM -0700, Neil Conway wrote:
> That seems a confusing set of values. Perhaps "off", "pl", and "all"
> would be clearer?
> I'm curious if you've measured the performance overhead of enabling this
> functionality.

i'm quite worried about "all" setting.
all operators are functions as well, so tracking literally all function
calls might be too heavy.

on the other hand i would like to be able to track some non-pl functions
like RI%.

depesz

-- 
quicksil1er: "postgres is excellent, but like any DB it requires a
highly paid DBA.  here's my CV!" :)
http://www.depesz.com/ - blog dla ciebie (i moje CV)

Re: stored procedure stats in collector

From

Martin Pihlak

Date:

26 September 2007, 13:09:09

hubert depesz lubaczewski wrote:
> On Thu, Sep 20, 2007 at 05:34:32PM -0700, Neil Conway wrote:
>> That seems a confusing set of values. Perhaps "off", "pl", and "all"
>> would be clearer?
>> I'm curious if you've measured the performance overhead of enabling this
>> functionality.
> 
> i'm quite worried about "all" setting.
> all operators are functions as well, so tracking literally all function
> calls might be too heavy.
> 
"all" would additionally count only C and SQL functions, builtins (language
"internal") are skipped. Maybe there should be option to count these as well,
perhaps with a really scare name (track_functions = all_plus_internal).

> on the other hand i would like to be able to track some non-pl functions
> like RI%.
> 
These are builtin.

Regards,
Martin

Re: stored procedure stats in collector

From

Bruce Momjian

Date:

26 March 2008, 01:42:38

This has been saved for the next commit-fest:
http://momjian.postgresql.org/cgi-bin/pgpatches_hold

---------------------------------------------------------------------------

Martin Pihlak wrote:
> Howdy,
> 
> Here's an updated version of the function stats patch. The biggest change is
> that by default only procedural language functions are counted. The GUC variable
> stats_function_level now takes 3 values: on, off and all. The latter also
> counts SQL and C language functions, "on" means only to count procedural
> language functions. The decision is now made in fmgr_info_cxt_security()
> Trigger functions are now also counted.
> 
> Sample output:
> 
> select procname, calls, total_time, total_cpu, self_time, self_cpu
> from pg_stat_user_functions order by self_cpu desc limit 5;
> 
>        procname      | calls | total_time | total_cpu | self_time | self_cpu
> --------------------+-------+------------+-----------+-----------+----------
>   next_batch         | 32765 |      27139 |      8574 |     27139 |     8574
>   fetch_batch_events |  3636 |       9252 |      5622 |      3771 |     2717
>   batch_event_sql    |  3636 |       5454 |      2888 |      3910 |     1962
>   finish_batch       |  3420 |       3215 |      1475 |      3215 |     1475
>   batch_event_tables |  3636 |       1448 |       865 |      1434 |      858
> (5 rows)
> 
> This is still 8.2 only (tested on 8.2.4 and 8.2.5), has seen some production
> usage here at Skype (about a month on reasonably busy boxes). So far so good.
> 
> Couple of issues:
> - sometimes self_cpu > self_time - currently blaming it on Linux version of
> gettimeofday().
> - dropped functions are not purged from stats, might bloat the stats file for
> some usage patterns.
> 
> PS. Would something like this be a canditate for 8.4 inclusion (if polished up)?
> 
> Regards,
> Martin
> 

[ application/x-gzip is not supported, skipping... ]

> 
> ---------------------------(end of broadcast)---------------------------
> TIP 7: You can help support the PostgreSQL project by donating at
> 
>                 http://www.postgresql.org/about/donate

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://postgres.enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +