Re: inline newNode() - Mailing list pgsql-patches

From Neil Conway
Subject Re: inline newNode()
Date
Msg-id 87ptuin5wb.fsf@mailbox.samurai.com
In response to Re: inline newNode()  (Bruce Momjian <pgman@candle.pha.pa.us>)
Responses Re: inline newNode()
List pgsql-patches
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Remember, MemSet was invented only to prevent function call overhead,
> and on my BSD/OS system, len >= 256 is faster with the libc
> memset().

Yes, I remember finding that when testing MemSet() versus memset() for
various values of MEMSET_LOOP_LIMIT earlier.
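
For readers who haven't looked at the macro, the idea under discussion
can be sketched roughly as follows. This is not the actual MemSet()
from the PostgreSQL headers: sketch_memset() and SKETCH_LOOP_LIMIT are
hypothetical names, the cutoff of 64 is arbitrary, and it is written as
an inline function (returning which path it took) only so it is easy to
check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical cutoff; MEMSET_LOOP_LIMIT plays this role in the thread. */
#define SKETCH_LOOP_LIMIT 64

/*
 * Set len bytes at dst to val.  Small, word-aligned, zero-fill requests
 * are handled by an inline word-at-a-time loop; everything else is
 * handed to libc memset().  Returns 1 if the inline loop ran, 0 if
 * memset() was called.
 *
 * Assumption: the caller's buffer may legitimately be written through a
 * long pointer (e.g. it is a long array or malloc'd memory).
 */
static inline int
sketch_memset(void *dst, int val, size_t len)
{
    assert(dst != NULL);

    if (val == 0 &&
        len <= SKETCH_LOOP_LIMIT &&
        ((uintptr_t) dst % sizeof(long)) == 0 &&
        (len % sizeof(long)) == 0)
    {
        long *p = (long *) dst;
        long *stop = (long *) ((char *) dst + len);

        /* No function call here: just a short store loop. */
        while (p < stop)
            *p++ = 0;
        return 1;
    }

    /* Large, unaligned, or non-zero fills go to the C library. */
    memset(dst, val, len);
    return 0;
}
```

Raising or lowering SKETCH_LOOP_LIMIT moves the point at which the
libc routine takes over, which is the knob being tuned per platform
above.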

> What really surprised me is that MemSet won on Sparc, where they have an
> assembler language version that looks very similar to the MemSet
> loop.

Well, I'd assume any C library / compiler of half-decent quality on
any platform would provide assembly optimized versions of common
stdlib functions like memset().

While playing around with memset() on my machine (P4 running Linux,
glibc 2.2.5, GCC 3.2.1pre3), I found the following interesting
result. I used this simple benchmark (the same one I posted for the
earlier MemSet() thread on -hackers):

#include <string.h>
#include "postgres.h"

/* Raise the cutoff so MemSet() never falls back to memset() on size alone. */
#undef MEMSET_LOOP_LIMIT
#define MEMSET_LOOP_LIMIT BUFFER_SIZE

int
main(void)
{
    char buffer[BUFFER_SIZE];
    long long i;

    for (i = 0; i < 99000000; i++)
    {
        /* swap in MemSet() or __builtin_memset() here for the other runs */
        memset(buffer, 0, sizeof(buffer));
    }

    return 0;
}

Compiled with '-DBUFFER_SIZE=256 -O2', I get the following results in
seconds:

MemSet(): ~9.6
memset(): ~19.5
__builtin_memset(): ~10.00

So it seems there is a reasonably optimized version of memset()
provided by glibc/GCC (not sure which :-) ), it's just a matter of
persuading the compiler to let us use it. It's still depressing that
it doesn't beat MemSet(), but perhaps __builtin_memset() has better
average-case performance over a wider spectrum of memory sizes?[1]
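
For anyone who wants to try other sizes, a small timing helper along
these lines might be a starting point. It is only a sketch: fill_fn,
fill_libc(), fill_loop() and time_fill() are made-up names, and because
each variant is called through a function pointer, every variant pays
call overhead. That means it compares the fill routines themselves, not
the inlining benefit that MemSet() was invented for.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Hypothetical signature shared by the fill variants being compared. */
typedef void (*fill_fn) (void *buf, size_t len);

/* Variant 1: plain libc memset(). */
static void
fill_libc(void *buf, size_t len)
{
    memset(buf, 0, len);
}

/* Variant 2: word-at-a-time loop; needs an aligned, word-multiple buffer. */
static void
fill_loop(void *buf, size_t len)
{
    long *p = (long *) buf;
    long *stop = (long *) ((char *) buf + len);

    assert(((uintptr_t) buf % sizeof(long)) == 0 && len % sizeof(long) == 0);
    while (p < stop)
        *p++ = 0;
}

/* Run fn(buf, len) iters times and return the CPU time used, in seconds. */
static double
time_fill(fill_fn fn, void *buf, size_t len, long iters)
{
    volatile unsigned char sink = 0;
    clock_t start, end;
    long i;

    assert(fn != NULL && len > 0 && iters > 0);

    start = clock();
    for (i = 0; i < iters; i++)
    {
        fn(buf, len);
        /* Volatile read-back to discourage the optimizer from dropping fills. */
        sink = ((volatile unsigned char *) buf)[0];
    }
    end = clock();
    (void) sink;

    return (double) (end - start) / CLOCKS_PER_SEC;
}
```

Calling time_fill(fill_libc, ...) and time_fill(fill_loop, ...) on the
same buffer for a range of lengths would show where the crossover sits
on a given machine.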

BTW, regarding the newNode() stuff: so is it agreed that Bruce's patch
is a performance win without too high of a code bloat / uglification
penalty? If so, is it 7.3 or 7.4 material?

Cheers,

Neil

[1] Not that I really buy that -- for one thing, if the length is
constant, as it is in this case, the compiler can substitute an
optimized version of the function for the appropriate memory size. I'm
having a little difficulty explaining GCC/glibc's poor performance...

--
Neil Conway <neilc@samurai.com> || PGP Key ID: DB3C29FC
