Re: Combining Aggregates - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Combining Aggregates
Date
Msg-id 569DC800.5010805@2ndquadrant.com
Whole thread Raw
In response to Re: Combining Aggregates  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Combining Aggregates  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On 01/19/2016 05:14 AM, Robert Haas wrote:
> On Mon, Jan 18, 2016 at 12:34 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Mon, Jan 18, 2016 at 12:09 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Robert Haas <robertmhaas@gmail.com> writes:
>>>> Yeah.  The API contract for an expanded object took me quite a while
>>>> to puzzle out, but it seems to be this: if somebody hands you an R/W
>>>> pointer to an expanded object, you're entitled to assume that you can
>>>> "take over" that object and mutate it however you like.  But the
>>>> object might be in some other memory context, so you have to move it
>>>> into your own memory context.
>>>
>>> Only if you intend to keep it --- for example, a function that is
>>> mutating and returning an object isn't required to move it
>>> somewhere else, if the input is R/W, and I think it generally
>>> shouldn't.
>>>
>>> In the context here, I'd think it is the responsibility of
>>> nodeAgg.c not individual datatype functions to make sure that
>>> expanded objects live where it wants them to.
>>
>> That's how I did it in my prototype, but the problem with that is that
>> spinning up a memory context for every group sucks when there are many
>> groups with only a small number of elements each - hence the 50%
>> regression that David Rowley observed.  If we're going to use expanded
>> objects here, which seems like a good idea in principle, that's going
>> to have to be fixed somehow.  We're flogging the heck out of malloc by
>> repeatedly creating a context, doing one or two allocations in it, and
>> then destroying the context.
>>
>> I think that, in general, the memory context machinery wasn't really
>> designed to manage lots of small contexts.  The overhead of spinning
>> up a new context for just a few allocations is substantial.  That
>> matters in some other situations too, I think - I've commonly seen
>> AllocSetContextCreate taking several percent  of runtime in profiles.
>> But here it's much exacerbated.  I'm not sure whether it's better to
>> attack that problem at the root and try to make AllocSetContextCreate
>> cheaper, or whether we should try to figure out some change to the
>> expanded-object machinery to address the issue.
>
> Here is a patch that helps a good deal.  I changed things so that when
> we create a context, we always allocate at least 1kB.  If that's more
> than we need for the node itself and the name, then we use the rest of
> the space as a sort of keeper block.  I think there's more that can be
> done to improve this, but I'm wimping out for now because it's late
> here.
>
> I suspect something like this is a good idea even if we don't end up
> using it for aggregate transition functions.

I dare to claim that if expanded objects require a separate memory 
context per aggregate state (I don't see why would be the case), it's a 
dead end. Not so long ago we've fixed exactly this issue in array_agg(), 
which addressed a bunch of reported OOM issues and improved array_agg() 
performance quite a bit. It'd be rather crazy introducing the same 
problem to all aggregate functions.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



pgsql-hackers by date:

Previous
From: David Rowley
Date:
Subject: Re: Combining Aggregates
Next
From: David Rowley
Date:
Subject: Re: Combining Aggregates