Re: Memory Accounting v11 - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Memory Accounting v11
Date
Msg-id 557E0016.8050300@2ndquadrant.com
Whole thread Raw
In response to Memory Accounting v11  (Jeff Davis <pgsql@j-davis.com>)
List pgsql-hackers
Hi,

On 06/14/15 21:43, Jeff Davis wrote:
> This patch tracks memory usage (at the block level) for all memory
> contexts. Individual palloc()s aren't tracked; only the new blocks
> allocated to the memory context with malloc().

I see it's adding the new field as int64 - wouldn't a Size be more 
appropriate, considering that's what we use in mctx.h and aset.c?

> It also adds a new function, MemoryContextMemAllocated() which can
> either retrieve the total allocated for the context, or it can
> recurse and sum up the allocations for all subcontexts as well.
>
> This is intended to be used by HashAgg in an upcoming patch that will
> bound the memory and spill to disk.
>
> Previous discussion here:
>
> http://www.postgresql.org/message-id/1407012053.15301.53.camel@jeff-desktop
>
> Previous concerns:
>
> * There was a slowdown reported of around 1-3% (depending on the exact
> version of the patch) on an IBM power machine when doing an index
> rebuild. The results were fairly noisy for me, but it seemed to hold up.
> See http://www.postgresql.org/message-id/CA
> +Tgmobnu7XEn1gRdXnFo37P79bF=qLt46=37ajP3Cro9dBRaA@mail.gmail.com
> * Adding a struct field to MemoryContextData may be bad for the CPU
> caching behavior, and may be the cause of the above slowdown.

> * Previous versions of the patch updated the parent contexts'
> allocations as well, which avoided the need to recurse when querying
> the total allocation. That seemed to penalize cases that didn't need
> to track the allocation. We discussed trying to "opt-in" to this
> behavior, but it seemed more awkward than helpful. Now, the patch
> only updates the allocation of the current context, and querying
> means recursing through the child contexts.

I don't think the opt-in idea itself was awkward, it was rather about 
the particular APIs that we came up with, especially when combined with 
the 'context inheritance' thing.

I still think the opt-in approach and updating accounting for the parent 
contexts was the best one, because it (a) minimizes impact in cases that 
don't use the accounting, and (b) makes finding the current amount of 
memory cheap ...

> * There was a concern that, if MemoryContextMemAllocated needs to
> recurse to the child contexts, it will be too slow for HashAgg of
> array_agg, which creates a child context per group. That was solved with
> http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=b419865a814abbca12bdd6eef6a3d5ed67f432e1

I wouldn't say this was "solved" - we have fixed one particular example 
of such aggregate implementation, because it was causing OOM issues with 
many groups, but there may be other custom aggregates using the same 
pattern.

Granted, built-in aggregates are probably more critical than aggregates 
provided by extensions, but I wouldn't dare to mark this solved ...

> My general answer to the performance concerns is that they aren't a
> reason to block this patch, unless someone has a suggestion about how
> to fix them. Adding one field to a struct and a few arithmetic
> operations for each malloc() or realloc() seems reasonable to me.

I'm not buying this, sorry. While I agree that we should not expect the 
memory accounting to be entirely free, we should be very careful about 
the overhead especially if we're dropping the opt-in and thus imposing 
the overhead on everyone.

But "performance concerns are not a reason to block this patch" approach 
seems wrong. With any other patch a 3% regression would be considered a 
serious issue IMNSHO.

> The current state, where HashAgg just blows up the memory, is just
> not reasonable, and we need to track the memory to fix that problem.
> Others have also mentioned that we might want to use this mechanism
> to track memory for other operators, like Sort or HashJoin, which
> might be simpler and more accurate.

Dropping the memory accounting implementations and keeping just this new 
solution would be nice, only if we agree the performance impact to be 
acceptable. We already have accounting solution for each of those 
places, so I don't think the unification alone outweighs the regression.

regards

--
Tomas Vondra                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



pgsql-hackers by date:

Previous
From: Jan de Visser
Date:
Subject: Re: Git humor
Next
From: Michael Nolan
Date:
Subject: Re: On columnar storage