Re: MemSet inline for newNode - Mailing list pgsql-patches

From Bruce Momjian
Subject Re: MemSet inline for newNode
Date
Msg-id 200211110336.gAB3aDw10925@candle.pha.pa.us
Whole thread Raw
In response to Re: MemSet inline for newNode  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: MemSet inline for newNode
List pgsql-patches
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Tom Lane wrote:
> >> But with any reasonably smart compiler, those *will* be the same because
> >> the compiler can do constant-folding (ie, it knows i is 256 when control
> >> reaches the MemSet).  Pushing the MemSet into MemoryContextAlloc
> >> eliminates the possibility of knowing the size argument's value.
>
> > Yes, however, I am not seeing that getting optimized with gcc 2.95 -O2.
>
> You aren't?  I just checked gcc 2.95.3 on HPPA; it definitely propagates
> the constant at -O2.

I am definately seeing different timings using the constant and the
variable in the test.

>
> > Well, the palloc0 use by newNode was a performance boost, as tested by
> > Neil Conway.  Without palloc0, there was no way to inline newNode.
>
> But your patch as committed does NOT inline newNode, in fact it does the
> exact opposite: the MemSet macro is removed from the callsite.

Yes, there were actually two patches;  the first inlined newNode by
calling palloc0.  Without palloc, there was no way to inline newNode.

The second was a more general one to merge palloc/MemSet into a single
palloc0 call.  That has been backed out while I research it.

>
> > I made a new version that did the alignment test of start and len in one
> > pass:
>
> I thought the point of this exercise was to eliminate the alignment test
> altogether, by making it doable at compile time.  IIRC, the upshot of
> the prior discussion was to keep the MemSet calls at the call site and
> to eliminate the alignment test on the pointer by special-casing palloc0
> (essentially, wiring into the macro the assumption that palloc's
> returned pointer will be suitably aligned).

I didn't assume we could special case palloc0 by knowing it was going to
be properly aligned and be of the proper length (multiple if int).

I have a new idea.  Right now we have:

    #define palloc0(sz) MemoryContextAllocZero(CurrentMemoryContext, (sz))

My new idea is to add a third boolean argument to
MemoryContextAllocZero() which will control whether the MemSet
assignment loop is used, or memset().  My idea is to use "? :" to test
for the proper value in the macro.  Basically, this will decouple the
MemSet test and the MemSet loop, allowing the optimizer to process the
length at the call site, but allow palloc0 to perform the assignment
loop.  I need palloc0 because MemSet can't return a pointer.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

pgsql-patches by date:

Previous
From: Tom Lane
Date:
Subject: Re: MemSet inline for newNode
Next
From: Tom Lane
Date:
Subject: Re: MemSet inline for newNode