Re: [PATCH] unified frontend support for pg_malloc et al and palloc/pfree mulation (was xlogreader-v4) - Mailing list pgsql-hackers

From Andres Freund
Subject Re: [PATCH] unified frontend support for pg_malloc et al and palloc/pfree mulation (was xlogreader-v4)
Date
Msg-id 20130109221452.GA28653@awork2.anarazel.de
In response to Re: Re: [PATCH] unified frontend support for pg_malloc et al and palloc/pfree mulation (was xlogreader-v4)  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [PATCH] unified frontend support for pg_malloc et al and palloc/pfree mulation (was xlogreader-v4)
Re: [PATCH] unified frontend support for pg_malloc et al and palloc/pfree mulation (was xlogreader-v4)
List pgsql-hackers
On 2013-01-09 15:43:19 -0500, Tom Lane wrote:
> I wrote:
> > I then applied the palloc.h and mcxt.c hunks of your patch and rebuilt.
> > Now I get an average runtime of 16666 ms, a full 2% faster, which is a
> > bit astonishing, particularly because the oprofile results haven't moved
> > much:
>
> I studied the assembly code being generated for palloc(), and I believe
> I see the reason why it's a bit faster: when there's only a single local
> variable that has to survive over the elog call, gcc generates a shorter
> function entry/exit sequence.

Makes sense.

>  I had thought of proposing that we code
> palloc() like this:
>
> void *
> palloc(Size size)
> {
>     MemoryContext context = CurrentMemoryContext;
>
>     AssertArg(MemoryContextIsValid(context));
>
>     if (!AllocSizeIsValid(size))
>         elog(ERROR, "invalid memory alloc request size %lu",
>              (unsigned long) size);
>
>     context->isReset = false;
>
>     return (*context->methods->alloc) (context, size);
> }
>
> but at least on this specific hardware and compiler that would evidently
> be a net loss compared to direct use of CurrentMemoryContext.  I would
> not put a lot of faith in that result holding up on other machines
> though.

That's not optimized to the same code? ISTM the compiler should produce
exactly the same code for both.
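
For reference, the two variants being compared can be reduced to a standalone sketch like the one below. Everything in it is a stand-in so that it compiles outside the backend: the struct layouts, report_invalid_size() (in place of elog(ERROR, ...)), assert() (in place of AssertArg), the MaxAllocSize value, and the malloc-backed demo context are all assumptions for illustration, not the real definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the backend types; not the real definitions. */
typedef struct MemoryContextData *MemoryContext;

typedef struct MemoryContextMethods
{
    void       *(*alloc) (MemoryContext context, size_t size);
} MemoryContextMethods;

typedef struct MemoryContextData
{
    const MemoryContextMethods *methods;
    int         isReset;
} MemoryContextData;

/* Assumed limit, for illustration only. */
#define MaxAllocSize ((size_t) 0x3fffffff)
#define AllocSizeIsValid(size) ((size_t) (size) <= MaxAllocSize)

MemoryContext CurrentMemoryContext = NULL;

/* Stand-in for elog(ERROR, ...): report the problem and do not return. */
static void
report_invalid_size(size_t size)
{
    fprintf(stderr, "invalid memory alloc request size %lu\n",
            (unsigned long) size);
    abort();
}

/* Variant A: copy the global into a local first (the quoted proposal). */
void *
palloc_local(size_t size)
{
    MemoryContext context = CurrentMemoryContext;

    assert(context != NULL);
    if (!AllocSizeIsValid(size))
        report_invalid_size(size);
    context->isReset = 0;
    return (*context->methods->alloc) (context, size);
}

/* Variant B: refer to the global directly at every use. */
void *
palloc_direct(size_t size)
{
    assert(CurrentMemoryContext != NULL);
    if (!AllocSizeIsValid(size))
        report_invalid_size(size);
    CurrentMemoryContext->isReset = 0;
    return (*CurrentMemoryContext->methods->alloc) (CurrentMemoryContext, size);
}

/* Trivial malloc-backed context so the sketch can be exercised. */
static void *
malloc_alloc(MemoryContext context, size_t size)
{
    (void) context;
    return malloc(size);
}

const MemoryContextMethods malloc_methods = {malloc_alloc};
MemoryContextData demo_context = {&malloc_methods, 1};
```

Compiling this with something like gcc -O2 -S and diffing the output for palloc_local and palloc_direct would show whether a given compiler treats them identically. To exercise either function, point CurrentMemoryContext at &demo_context first.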

> In any case this doesn't explain the whole 2% speedup, but it probably
> accounts for palloc() showing as slightly cheaper than
> MemoryContextAlloc had been in the oprofile listing.

I'd guess that a good part of the cost is just smeared across all
callers and not individually attributable to any function visible in the
profile. Additionally, with functions as short as MemoryContextAllocZero,
profilers like oprofile (and perf) also often leak quite a bit of the
actual cost to the call sites, in my experience.

I wonder whether it makes sense to "inline" the contents of pstrdup()
as well? My gut feeling is not, but...
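
To spell out what "inlining" would mean here, a rough standalone sketch follows. It assumes pstrdup() currently forwards to a context-taking strdup helper; demo_alloc(), demo_context_strdup(), demo_current_context and both pstrdup_* names are hypothetical stand-ins backed by malloc, not the backend's code.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the current memory context. */
void       *demo_current_context = NULL;

/* Hypothetical stand-in for a context allocator; never returns NULL. */
static void *
demo_alloc(void *context, size_t size)
{
    void       *p;

    (void) context;
    p = malloc(size);
    if (p == NULL)
        abort();
    return p;
}

/* Stand-in for a context-taking strdup helper. */
char *
demo_context_strdup(void *context, const char *string)
{
    size_t      len = strlen(string) + 1;
    char       *copy = demo_alloc(context, len);

    memcpy(copy, string, len);
    return copy;
}

/* Forwarding variant: one extra call through the helper. */
char *
pstrdup_forwarding(const char *in)
{
    return demo_context_strdup(demo_current_context, in);
}

/* "Inlined" variant: the helper's body is written out in place. */
char *
pstrdup_inlined(const char *in)
{
    size_t      len = strlen(in) + 1;
    char       *copy = demo_alloc(demo_current_context, len);

    memcpy(copy, in, len);
    return copy;
}
```

Both variants return the same copy; the only difference is the extra function call in the forwarding version, which is small next to the strlen() and memcpy() work.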

I would like to move CurrentMemoryContext to memutils.h, but that seems
to require too many changes.

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


