Re: Generate call graphs in run-time - Mailing list pgsql-hackers

From Joel Jacobson
Subject Re: Generate call graphs in run-time
Date
Msg-id CAASwCXdHB74xRwnCgFD3v+_=Uj57t80ow4xEq0Rn-Tpr_8ggSQ@mail.gmail.com
In response to Re: Generate call graphs in run-time  (Martin Pihlak <martin.pihlak@gmail.com>)
Responses Re: Generate call graphs in run-time
List pgsql-hackers
On Mon, Jan 16, 2012 at 2:23 PM, Martin Pihlak <martin.pihlak@gmail.com> wrote:
> My approach was to add parent oid to the per-backend function stats
> structure - PgStat_BackendFunctionEntry. Also, I changed the hash key
> for that structure to (oid, parent) pair. This means that within the
> backend the function usage is always tracked with the context of
> calling function. This has the nice property that you get the per-parent
> usage stats as well. Also the additional lists for parent tracking are
> avoided.
>
> During pgstat_report_stat() the call graph (with stats) is output
> to logs and the statistics uploaded to collector -- with the parent oid
> removed.

Since you only track the parentfuncid one level up, it looks like
you will only be able to get one global call graph covering all
function calls, and not the unique call graph of each transaction.
If you have two separate transactions:
T1: a->b, b->c
T2: b->d
You would have two unique call graphs {a->b, b->c} and {b->d}.
The global call graph, where you only keep track of all unique
parentfuncid->funcid pairs, would be {a->b, b->c, b->d}, which lacks
the information on what different call graphs are actually being
executed per transaction.
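
To make the difference concrete, here is a minimal Python sketch. It only
models the T1/T2 example above as sets of (parent, child) edges; the
function names global_graph and unique_graphs are made up for
illustration and are not part of any patch.

```python
# Hypothetical model: each transaction is a set of (parent, child) edges.
t1 = {("a", "b"), ("b", "c")}
t2 = {("b", "d")}


def global_graph(transactions):
    """Union of all edges: all that one-level parent tracking can recover."""
    edges = set()
    for tx in transactions:
        edges |= tx
    return edges


def unique_graphs(transactions):
    """Distinct per-transaction call graphs, each kept as a whole."""
    return {frozenset(tx) for tx in transactions}


merged = global_graph([t1, t2])
distinct = unique_graphs([t1, t2])

# A single transaction doing a->b, b->c, b->d yields the same global graph,
# so the global graph cannot tell these two workloads apart.
t3 = {("a", "b"), ("b", "c"), ("b", "d")}
same_global = global_graph([t3]) == merged

print(sorted(merged))  # [('a', 'b'), ('b', 'c'), ('b', 'd')]
print(len(distinct))   # 2
print(same_global)     # True
```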

Also, why remove the parent oid when uploading the statistics to the collector?
It would be nice to have the statistics for each function per parent,
to see where you have a bottleneck which might only be occurring in a
function when called from a specific parent.
Even more fine-grained would be to have the statistics per unique
call graph, i.e. the entire tree of functions called in the
transaction.
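
As a rough illustration of why the per-parent numbers matter, here is a
small Python sketch. The rows and timings are invented sample data, and
per_function / avg_per_parent are hypothetical helper names; this is not
how the collector stores anything.

```python
from collections import defaultdict

# Invented sample data: (parentfuncid, funcid, calls, total_time_ms)
rows = [
    (1, 3, 100, 50),   # function 3 called from parent 1: cheap
    (2, 3, 10, 900),   # function 3 called from parent 2: expensive
]


def per_function(rows):
    """Aggregate with the parent removed, as in the uploaded statistics."""
    totals = defaultdict(lambda: [0, 0])
    for _parent, func, calls, total_time in rows:
        totals[func][0] += calls
        totals[func][1] += total_time
    return {func: tuple(t) for func, t in totals.items()}


def avg_per_parent(rows):
    """Average time per call, keyed by (parentfuncid, funcid)."""
    return {
        (parent, func): total_time / calls
        for parent, func, calls, total_time in rows
    }


print(per_function(rows))    # {3: (110, 950)}
print(avg_per_parent(rows))  # {(1, 3): 0.5, (2, 3): 90.0}
```

With the parent removed you only see 110 calls taking 950 ms in total;
with the parent kept it is obvious that the slow calls all come from
parent 2.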

> There is a patch for this and we do use it in production for occasional
> troubleshooting and dependency analysis. Can't attach immediately
> though -- it has some extra cruft in it that needs to be cleaned up.

I would highly appreciate a patch. Don't worry about cleaning it up; I
can do that, unless it contains code you can't share for other reasons.

> Indeed. Something like a pg_stat_user_function_details view would be
> very useful. Something along the lines of:
>
>   Column     |  Type  |
> --------------+--------+
>  funcid       | oid    |
>  parentfuncid | oid    | <-- new
>  schemaname   | name   |
>  funcname     | name   |
>  calls        | bigint |
>  total_time   | bigint |
>  self_time    | bigint |

parentfuncid->funcid might be sufficient for performance
optimizations, but to automatically generate directed graphs of all
unique call graphs in run-time, you would need all the unique
parentfuncid->funcid pairs as a single column, probably a sorted
two-dimensional array of oids, for example [[1,2],[1,3],[2,4],[2,5]]
if the call graph is {1->2, 1->3, 2->4, 2->5}.
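
A minimal Python sketch of that encoding, assuming the edges are already
available as (parent, child) pairs of oids. The helper names
encode_call_graph and to_dot are hypothetical, and the Graphviz DOT
output is just one possible way to draw the result.

```python
def encode_call_graph(edges):
    """Deduplicate and sort the edges into a list of [parent, child] pairs."""
    return [list(edge) for edge in sorted(set(edges))]


def to_dot(pairs):
    """Render the sorted pairs as a Graphviz DOT digraph."""
    lines = ["digraph callgraph {"]
    lines += [f"    {parent} -> {child};" for parent, child in pairs]
    lines.append("}")
    return "\n".join(lines)


# Edges observed in arbitrary order, with one duplicate.
edges = [(2, 5), (1, 3), (2, 4), (1, 2), (1, 2)]
pairs = encode_call_graph(edges)

print(pairs)  # [[1, 2], [1, 3], [2, 4], [2, 5]]
print(to_dot(pairs))
```

Because the array is sorted and deduplicated, two transactions that call
the same functions in a different order produce the same value, which
makes it usable as a key for a unique call graph.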

>
> And then rewrite pg_stat_user_functions by aggregating the detailed
> view. That'd make the individual pg_stat_get_function* functions a
> bit slower, but that is probably a non-issue - at least not if the
> pg_stat_user_functions view is rewritten to use a SRF.
>
> regards,
> Martin

