Re: Memory Accounting - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Memory Accounting
Date
Msg-id 20190924115819.5u5i67add5evtea6@development
In response to Re: Memory Accounting  (Michael Paquier <michael@paquier.xyz>)
List pgsql-hackers
On Tue, Sep 24, 2019 at 02:21:40PM +0900, Michael Paquier wrote:
>On Wed, Jul 24, 2019 at 11:52:28PM +0200, Tomas Vondra wrote:
>> I think Heikki was asking about places with a lot of sub-contexts, which is
>> a completely different issue. It used to be the case that some aggregates
>> created a separate context for each group - like array_agg. That would
>> make Jeff's approach to accounting rather inefficient, because checking
>> how much memory is used would be very expensive (having to loop over a
>> large number of contexts).
>
>The patch has been marked as ready for committer for a week or so, but
>it seems to me that this comment has not been addressed, no?  Are we
>sure that we want this method if it proves to be inefficient when
>there are many sub-contexts and shouldn't we at least test such
>scenarios with a worst-case, customly-made, function?

I don't think so.

Aggregates creating many memory contexts (a context for each group) were
discussed extensively in the thread about v11 [1] in 2015. Back then the
conclusion was that it's a pretty awful pattern anyway, as it uses much
more memory (no cross-context freelists) and has various other issues. In
a way, those aggregates are wrong and should be fixed just like we fixed
array_agg/string_agg (even without the memory accounting).

The way I see it, we can do either eager or lazy accounting. Eager might
work better for aggregates with many contexts, but it does increase the
overhead for the "regular" aggregates with just one or two contexts.
Considering how rare those many-context aggregates are (I'm not aware of
any such aggregate at the moment), it seems reasonable to pick the lazy
accounting.

(Note: Another factor affecting the lazy vs. eager efficiency is the
number of palloc/pfree calls vs. calls to determine the amount of memory
used, but that's mostly orthogonal and we can ignore it here.)
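To make the trade-off concrete, here's a minimal sketch (hypothetical, not
PostgreSQL's actual MemoryContext API): with lazy accounting each allocation
only bumps its own context's counter, so checking usage means walking the
whole context subtree; with eager accounting each allocation bubbles the size
up to the root, so checking is O(1) but every palloc pays for the walk up.

```python
class Context:
    """Toy memory context node; 'mem_allocated' mimics a per-context counter."""

    def __init__(self, parent=None):
        self.mem_allocated = 0
        self.children = []
        self.parent = parent
        if parent is not None:
            parent.children.append(self)

    # Lazy: palloc touches only this context (cheap per allocation).
    def palloc_lazy(self, size):
        self.mem_allocated += size

    # Lazy: a usage check must visit every sub-context -- this is what gets
    # expensive with the many-context aggregates discussed above.
    def total_lazy(self):
        return self.mem_allocated + sum(c.total_lazy() for c in self.children)

    # Eager: palloc walks up to the root on every allocation (per-alloc cost).
    def palloc_eager(self, size):
        ctx = self
        while ctx is not None:
            ctx.mem_allocated += size
            ctx = ctx.parent

    # Eager: a usage check at the root is O(1).
    def total_eager(self):
        return self.mem_allocated


# Usage: with one or two contexts (the common case) lazy checks are cheap,
# which is why lazy accounting is the reasonable default here.
root = Context()
child = Context(root)
child.palloc_lazy(100)
root.palloc_lazy(50)
print(root.total_lazy())   # sums the subtree
```

Which variant wins depends on the ratio of allocations to usage checks, which
is the orthogonal factor noted above.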

So I think the approach Jeff ended up with is sensible - certainly as a
first step. We may improve it in the future, of course, once we have
more practical experience.

Barring objections, I do plan to get this committed by the end of this
CF (i.e. sometime later this week).

[1] https://www.postgresql.org/message-id/1434311039.4369.39.camel%40jeff-desktop

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



