On 08/22/2015 06:25 AM, Tomas Vondra wrote:
> On 08/21/2015 08:37 PM, Tom Lane wrote:
>> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>>>
>>> I also don't think logging just subset of the stats is a lost case.
>>> Sure, we can't know which of the lines are important, but for example
>>> logging just the top-level contexts with a summary of the child contexts
>>> would be OK.
>>
>> I thought a bit more about this. Generally, what you want to know about
>> a given situation is which contexts have a whole lot of allocations
>> and/or a whole lot of child contexts. What you suggest above won't work
>> very well if the problem is buried more than about two levels down in
>> the context tree. But suppose we add a parameter to memory context stats
>> collection that is the maximum number of child contexts to print *per
>> parent context*. If there are more than that, summarize the rest as per
>> your suggestion. So any given recursion level might look like
>>
>> FooContext: m total in n blocks ...
>> ChildContext1: m total in n blocks ...
>> possible grandchildren...
>> ChildContext2: m total in n blocks ...
>> possible grandchildren...
>> ChildContext3: m total in n blocks ...
>> possible grandchildren...
>> k more child contexts containing m total in n blocks ...
>>
>> This would require a fixed amount of extra state per recursion level,
>> so it could be done with a few more parameters/local variables in
>> MemoryContextStats and no need to risk a malloc().
>>
>> The case where you would lose important data is where the serious
>> bloat is in some specific child context that is after the first N
>> children of its direct parent. I don't believe I've ever seen a case
>> where that was critical information as long as N isn't too tiny.
>
> Couldn't we make it a bit smarter to handle even cases like this? For
> example we might first count/sum the child contexts, and then print
> either all child contexts (if there are only a few of them) or just
> those that are >5% of the total, 2x the average or something like that.
While having that kind of logic would be nice, i dont think it is
required. For the case I had the proposed patch from tom seems perfectly
fine to me - not sure we would want a GUC. From a DBA perspective I dont
think anybody needs millions of lines of almost duplicated memory
context dumps and also not sure we need it from a developer perspective
either (other than the information: "there were more than those I printed")
regards
Stefan