Re: [PATCHES] MemSet inline for newNode - Mailing list pgsql-hackers
From | Bruce Momjian |
---|---|
Subject | Re: [PATCHES] MemSet inline for newNode |
Date | |
Msg-id | 200211102106.gAAL6ir09610@candle.pha.pa.us Whole thread Raw |
Responses |
Re: [PATCHES] MemSet inline for newNode
|
List | pgsql-hackers |
OK, my compiler, gcc 2.95 does the optimization at -O2, so I ran the tests here with the attached program, testing a constant of 256 and a variable 'j' equal to 256. I found a 1.7% slowdown with the variable compared to the constant. The 256 value is rather large. For a length of 32, the difference was 14%! So, seems I should back out the palloc0 usage and let the MemSet inline at each location. Does that seems like the direction to head? --------------------------------------------------------------------------- Bruce Momjian wrote: > Tom Lane wrote: > > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > > I have applied this patch to inline MemSet for newNode. > > > I will make another commit to make more general use of palloc0. > > > > Refresh my memory on why either of these was a good idea? > > You are talking about the use of palloc0 in place of palloc/MemSet(0), > not the use of palloc0 in newNode, right? > > > > This was already discussed on hackers. > > > > And this was not the approach agreed to, IIRC. What you've done has > > eliminated the possibility of optimizing away the controlling tests > > in MemSet. > > I see now: > > http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&selm=200210120006.g9C06i502964%40candle.pha.pa.us > > I thought someone had tested that there was little/no performance > difference between these two statements: > > MemSet(ptr, 0, 256) > > vs. > > i = 256; > MemSet(ptr, 0, i) > > However, seeing no email reply, I now assume no test was done. > > Neil, can you run such a test and let us know. It has to be with a > compiler that optimizes out the MemSet length test when the len is a > constant. I can't remember what compiler version/settings that was, but > it may have been -O3 gcc 3.20. > > I can back out my changes, but it would be easier to see if there is a > benefit before doing that. Personally, I am going to be surprised if a > single conditional tests in MemSet (which is not in the assignment loop) > will make any measurable difference. > > -- > Bruce Momjian | http://candle.pha.pa.us > pgman@candle.pha.pa.us | (610) 359-1001 > + If your life is a hard drive, | 13 Roberts Road > + Christ can be your backup. | Newtown Square, Pennsylvania 19073 > > ---------------------------(end of broadcast)--------------------------- > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073 #define INT_ALIGN_MASK (sizeof(int) - 1) #include <stdlib.h> /* * MemSet * Exactly the same as standard library function memset(), but considerably * faster for zeroing small word-aligned structures (such as parsetree nodes). * This has to be a macro because the main point is to avoid function-call * overhead. However, we have also found that the loop is faster than * native libc memset() on some platforms, even those with assembler * memset() functions. More research needs to be done, perhaps with * platform-specific MEMSET_LOOP_LIMIT values or tests in configure. * * bjm 2002-10-08 */ #define MemSet(start, val, len) \ do \ { \ int * _start = (int *) (start); \ int _val = (val); \ size_t _len = (len); \ \ if ((((long) _start) & INT_ALIGN_MASK) == 0 && \ (_len & INT_ALIGN_MASK) == 0 && \ _val == 0 && \ _len <= MEMSET_LOOP_LIMIT) \ { \ int * _stop = (int *) ((char *) _start + _len); \ while (_start < _stop) \ *_start++ = 0; \ } \ else \ memset((char *) _start, _val, _len); \ } while (0) #define MEMSET_LOOP_LIMIT 1024 int main(int argc, char **argv) { int i,j; char *buf; buf = malloc(256); j = 32; for (i = 0; i < 1000000; i++) MemSet(buf, 0, j /* choose 256 or j here */); return 0; }
pgsql-hackers by date: