Re: (full) Memory context dump considered harmful - Mailing list pgsql-hackers

From Tom Lane
Subject Re: (full) Memory context dump considered harmful
Date
Msg-id 28765.1440182256@sss.pgh.pa.us
In response to Re: (full) Memory context dump considered harmful  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: (full) Memory context dump considered harmful  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: (full) Memory context dump considered harmful  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List pgsql-hackers
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> On 08/20/2015 11:04 PM, Stefan Kaltenbrunner wrote:
>> On 08/20/2015 06:09 PM, Tom Lane wrote:
>>> (The reason I say "lobotomize" is that there's no particularly
>>> good reason to assume that the first N lines will tell you what you
>>> need to know. And the filter rule would have to be *very* stupid,
>>> because we can't risk trying to allocate any additional memory to
>>> track what we're doing here.)

> IMHO the first thing we might do is provide a log_memory_stats GUC
> controlling that. I'm not a big fan of adding a GUC for everything, but in
> this case it seems appropriate, just like the other log_ options.

I don't think this is advisable.  You would have to either keep it turned
off and risk not being able to debug OOM situations, or keep it turned on
and risk log-volume problems; neither is very satisfactory.

> I also don't think logging just a subset of the stats is a lost cause.
> Sure, we can't know which of the lines are important, but for example
> logging just the top-level contexts with a summary of the child contexts
> would be OK.

I thought a bit more about this.  Generally, what you want to know about
a given situation is which contexts have a whole lot of allocations
and/or a whole lot of child contexts.  What you suggest above won't work
very well if the problem is buried more than about two levels down in
the context tree.  But suppose we add a parameter to memory context stats
collection that is the maximum number of child contexts to print *per
parent context*.  If there are more than that, summarize the rest as per
your suggestion.  So any given recursion level might look like
    FooContext: m total in n blocks ...
      ChildContext1: m total in n blocks ...
        possible grandchildren...
      ChildContext2: m total in n blocks ...
        possible grandchildren...
      ChildContext3: m total in n blocks ...
        possible grandchildren...
      k more child contexts containing m total in n blocks ...

This would require a fixed amount of extra state per recursion level,
so it could be done with a few more parameters/local variables in
MemoryContextStats and no need to risk a malloc().
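
A minimal sketch of that per-parent limit, written as standalone C rather
than against the real PostgreSQL code. Everything named here (DemoContext,
DemoTotals, demo_sum_subtree, demo_context_stats) is hypothetical and
invented for illustration. It also assumes that "k" counts the skipped
direct children while the summary totals cover their whole subtrees. The
only extra state is a handful of local variables per recursion level, so
nothing is allocated while the report is produced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified stand-in for a memory context tree node. */
typedef struct DemoContext
{
    const char *name;
    size_t      total_bytes;        /* space held by this context alone */
    size_t      nblocks;            /* blocks held by this context alone */
    struct DemoContext *firstchild; /* head of this context's child list */
    struct DemoContext *nextchild;  /* next sibling under the same parent */
} DemoContext;

/* Running totals for the children that are summarized, not printed. */
typedef struct DemoTotals
{
    size_t      total_bytes;
    size_t      nblocks;
} DemoTotals;

/* Add ctx and all of its descendants to *totals without printing. */
void
demo_sum_subtree(const DemoContext *ctx, DemoTotals *totals)
{
    const DemoContext *child;

    totals->total_bytes += ctx->total_bytes;
    totals->nblocks += ctx->nblocks;
    for (child = ctx->firstchild; child != NULL; child = child->nextchild)
        demo_sum_subtree(child, totals);
}

/*
 * Print ctx, then at most max_children of its children (recursively);
 * fold any remaining children into one summary line.  Returns the number
 * of lines written.  Uses only stack variables: no malloc().
 */
int
demo_context_stats(const DemoContext *ctx, int level, int max_children,
                   FILE *out)
{
    const DemoContext *child;
    DemoTotals  rest = {0, 0};
    size_t      nskipped = 0;
    int         nprinted = 0;
    int         nlines = 1;

    assert(ctx != NULL && out != NULL);

    fprintf(out, "%*s%s: %zu total in %zu blocks\n",
            level * 2, "", ctx->name, ctx->total_bytes, ctx->nblocks);

    for (child = ctx->firstchild; child != NULL; child = child->nextchild)
    {
        if (nprinted < max_children)
        {
            /* Within the limit: print this child and its subtree. */
            nlines += demo_context_stats(child, level + 1, max_children, out);
            nprinted++;
        }
        else
        {
            /* Past the limit: only accumulate totals for the summary. */
            demo_sum_subtree(child, &rest);
            nskipped++;
        }
    }

    if (nskipped > 0)
    {
        fprintf(out,
                "%*s%zu more child contexts containing %zu total in %zu blocks\n",
                (level + 1) * 2, "", nskipped, rest.total_bytes, rest.nblocks);
        nlines++;
    }
    return nlines;
}
```

Calling demo_context_stats(top, 0, 100, stderr) would correspond to the
hard-wired limit discussed below, and a manual debugging call could simply
pass a larger max_children.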

The case where you would lose important data is where the serious bloat
is in some specific child context that is after the first N children of
its direct parent.  I don't believe I've ever seen a case where that was
critical information as long as N isn't too tiny.

I think we could hard-wire N at 100 or something like that and pretty
much fix Stefan's complaint, while losing little if any detail in typical
cases.  Manual invocation of MemoryContextStats could pass larger values
if you really needed it during debugging.

Thoughts?
        regards, tom lane


