Home > mailing lists

Re: Combining Aggregates - Mailing list pgsql-hackers

From	Robert Haas
Subject	Re: Combining Aggregates
Date	January 19, 2016 04:14:45
Msg-id	CA+TgmoZeEfY5zOgxERgD7dsaKR_yo2GFFWqW6z+riZSC+_E=ng@mail.gmail.com Whole thread
In response to	Re: Combining Aggregates (Robert Haas <robertmhaas@gmail.com>)
Responses	Re: Combining Aggregates Re: Combining Aggregates Re: Combining Aggregates
List	pgsql-hackers

Tree view

On Mon, Jan 18, 2016 at 12:34 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Jan 18, 2016 at 12:09 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Robert Haas <robertmhaas@gmail.com> writes:
>>> Yeah.  The API contract for an expanded object took me quite a while
>>> to puzzle out, but it seems to be this: if somebody hands you an R/W
>>> pointer to an expanded object, you're entitled to assume that you can
>>> "take over" that object and mutate it however you like.  But the
>>> object might be in some other memory context, so you have to move it
>>> into your own memory context.
>>
>> Only if you intend to keep it --- for example, a function that is mutating
>> and returning an object isn't required to move it somewhere else, if the
>> input is R/W, and I think it generally shouldn't.
>>
>> In the context here, I'd think it is the responsibility of nodeAgg.c
>> not individual datatype functions to make sure that expanded objects
>> live where it wants them to.
>
> That's how I did it in my prototype, but the problem with that is that
> spinning up a memory context for every group sucks when there are many
> groups with only a small number of elements each - hence the 50%
> regression that David Rowley observed.  If we're going to use expanded
> objects here, which seems like a good idea in principle, that's going
> to have to be fixed somehow.  We're flogging the heck out of malloc by
> repeatedly creating a context, doing one or two allocations in it, and
> then destroying the context.
>
> I think that, in general, the memory context machinery wasn't really
> designed to manage lots of small contexts.  The overhead of spinning
> up a new context for just a few allocations is substantial.  That
> matters in some other situations too, I think - I've commonly seen
> AllocSetContextCreate taking several percent  of runtime in profiles.
> But here it's much exacerbated.  I'm not sure whether it's better to
> attack that problem at the root and try to make AllocSetContextCreate
> cheaper, or whether we should try to figure out some change to the
> expanded-object machinery to address the issue.

Here is a patch that helps a good deal.  I changed things so that when
we create a context, we always allocate at least 1kB.  If that's more
than we need for the node itself and the name, then we use the rest of
the space as a sort of keeper block.  I think there's more that can be
done to improve this, but I'm wimping out for now because it's late
here.

I suspect something like this is a good idea even if we don't end up
using it for aggregate transition functions.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment

faster-memory-contexts-v1.patch

pgsql-hackers by date:

From: David Rowley
Date: 19 January 2016, 04:00:27
Subject: Re: Combining Aggregates

From: Tomas Vondra
Date: 19 January 2016, 04:24:23
Subject: Re: multivariate statistics v9

Re: Combining Aggregates - Mailing list pgsql-hackers

Attachment

Previous

Next