Re: MemSet inline for newNode - Mailing list pgsql-patches
From | Bruce Momjian |
---|---|
Subject | Re: MemSet inline for newNode |
Date | |
Msg-id | 200211110336.gAB3aDw10925@candle.pha.pa.us Whole thread Raw |
In response to | Re: MemSet inline for newNode (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: MemSet inline for newNode
|
List | pgsql-patches |
Tom Lane wrote: > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > Tom Lane wrote: > >> But with any reasonably smart compiler, those *will* be the same because > >> the compiler can do constant-folding (ie, it knows i is 256 when control > >> reaches the MemSet). Pushing the MemSet into MemoryContextAlloc > >> eliminates the possibility of knowing the size argument's value. > > > Yes, however, I am not seeing that getting optimized with gcc 2.95 -O2. > > You aren't? I just checked gcc 2.95.3 on HPPA; it definitely propagates > the constant at -O2. I am definately seeing different timings using the constant and the variable in the test. > > > Well, the palloc0 use by newNode was a performance boost, as tested by > > Neil Conway. Without palloc0, there was no way to inline newNode. > > But your patch as committed does NOT inline newNode, in fact it does the > exact opposite: the MemSet macro is removed from the callsite. Yes, there were actually two patches; the first inlined newNode by calling palloc0. Without palloc, there was no way to inline newNode. The second was a more general one to merge palloc/MemSet into a single palloc0 call. That has been backed out while I research it. > > > I made a new version that did the alignment test of start and len in one > > pass: > > I thought the point of this exercise was to eliminate the alignment test > altogether, by making it doable at compile time. IIRC, the upshot of > the prior discussion was to keep the MemSet calls at the call site and > to eliminate the alignment test on the pointer by special-casing palloc0 > (essentially, wiring into the macro the assumption that palloc's > returned pointer will be suitably aligned). I didn't assume we could special case palloc0 by knowing it was going to be properly aligned and be of the proper length (multiple if int). I have a new idea. Right now we have: #define palloc0(sz) MemoryContextAllocZero(CurrentMemoryContext, (sz)) My new idea is to add a third boolean argument to MemoryContextAllocZero() which will control whether the MemSet assignment loop is used, or memset(). My idea is to use "? :" to test for the proper value in the macro. Basically, this will decouple the MemSet test and the MemSet loop, allowing the optimizer to process the length at the call site, but allow palloc0 to perform the assignment loop. I need palloc0 because MemSet can't return a pointer. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
pgsql-patches by date: