Re: FunctionCallN improvement. - Mailing list pgsql-hackers

From Darcy Buskermolen
Subject Re: FunctionCallN improvement.
Date
Msg-id 200502011410.35048.darcy@wavefire.com
In response to Re: FunctionCallN improvement.  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On February 1, 2005 01:23 pm, Tom Lane wrote:
> a_ogawa <a_ogawa@hi-ho.ne.jp> writes:
> > I made the test program to measure the effect of this macro.
>
> Well, if we're going to be tense about this, let's actually be tense
> about it.  Your test program isn't a great model for what's going to
> happen in fmgr.c, because you've designed it so that Nargs cannot be
> known at compile time.  In the fmgr routines, Nargs is certainly a
> compile-time constant, and so implementations that can exploit that
> will have an advantage.
>
> Also, we can take advantage of some improvements in the MemSet macro
> family that occurred since fmgr.c was last rewritten.  I see no reason
> not to use MemSetLoop directly, since the fcinfo struct will have the
> correct size and correct alignment.
>
> In addition to your original macro, I tried two other variants: one
> that uses MemSetLoop with a loop length rounded to the next higher
> multiple of 4, and one that expects the argisnull settings to be written
> out directly, in the same style as is currently done in FunctionCall1
> and FunctionCall2.  (This amounts to unrolling the loop in the original
> macro; something that could be done by the compiler given a constant
> Nargs, but it seems not to be done by the compilers I tested.)
>
> I tested two cases: NARGS = 2, which is certainly the single most
> critical case, and NARGS = 5, which is probably the largest number
> of arguments that we really care too much about.  (You have to hand-edit
> the test program and recompile to adjust NARGS, since the point is to
> treat it as a compile-time constant.)
>
> Here are wall-clock timings on the architectures and compilers I have at
> hand:
>
> NARGS = 2
>                   MemSetLoop   OrigMacro   SetMacro   Unrolled
> i386, gcc -O2     37.655s      6.411s      7.060s     6.362s
> i386, gcc -O6     35.420s      1.129s      1.814s     0.567s
> PPC, gcc -O2      54.033s      6.754s      11.138s    6.438s
> HPPA, gcc -O2     58.82s       10.38s      9.79s      7.85s
> HPPA, cc +O2      60.39s       13.43s      8.40s      7.31s
>
> NARGS = 5
>                   MemSetLoop   OrigMacro   SetMacro   Unrolled
> i386, gcc -O2     37.566s      11.329s     7.688s     8.874s
> i386, gcc -O6     32.992s      5.928s      2.881s     0.566s
> PPC, gcc -O2      86.300s      19.048s     14.626s    8.751s
> HPPA, gcc -O2     58.28s       15.09s      13.42s     14.37s
> HPPA, cc +O2      58.23s       8.96s       12.88s     7.28s

I see similar comparative times on an UltraSparc running Solaris.


>
> (I used different loop counts on the different machines to get similar
> overall times for the memset case; so it's OK to compare numbers across
> a row but not down a column.)
>
> Based on this I think we ought to go with the "unrolled" approach, ie,
> we'll create a macro to initialize the fixed fields of fcinfo but fill
> in the arg and argisnull arrays with code like what's already in
> FunctionCall2:
>
>     fcinfo.arg[0] = arg1;
>     fcinfo.arg[1] = arg2;
>     fcinfo.argnull[0] = false;
>     fcinfo.argnull[1] = false;
>
> If anyone would like to try the results on other platforms, my test
> program is attached.
>
>             regards, tom lane
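
For anyone following along without fmgr.c open, the difference between the loop-based and the "unrolled" initialization discussed above can be sketched roughly as follows. This is a toy under stated assumptions, not the real fmgr code: `ToyFcinfo`, `ToyDatum`, `TOY_MAX_ARGS`, `TOY_INIT_FIXED`, `toy_init_loop` and `toy_init_unrolled2` are all hypothetical names invented for illustration, and the struct carries only a few stand-in fields.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_ARGS 32         /* hypothetical limit, for illustration only */

typedef uintptr_t ToyDatum;     /* hypothetical stand-in for Datum */

/* Hypothetical, cut-down stand-in for the fcinfo struct. */
typedef struct ToyFcinfo
{
    void       *flinfo;         /* fixed fields */
    bool        isnull;
    short       nargs;
    ToyDatum    arg[TOY_MAX_ARGS];
    bool        argnull[TOY_MAX_ARGS];
} ToyFcinfo;

/* Initialize only the fixed fields; arg[] and argnull[] are left untouched. */
#define TOY_INIT_FIXED(fc, fl, n) \
    do { \
        (fc).flinfo = (fl); \
        (fc).isnull = false; \
        (fc).nargs = (n); \
    } while (0)

/* Loop variant: clears argnull[0..nargs-1] with a run-time loop. */
static void
toy_init_loop(ToyFcinfo *fc, void *fl, int nargs)
{
    int         i;

    assert(nargs >= 0 && nargs <= TOY_MAX_ARGS);
    TOY_INIT_FIXED(*fc, fl, (short) nargs);
    for (i = 0; i < nargs; i++)
        fc->argnull[i] = false;
}

/* Unrolled variant for exactly two arguments: every store is written out. */
static void
toy_init_unrolled2(ToyFcinfo *fc, void *fl, ToyDatum arg1, ToyDatum arg2)
{
    TOY_INIT_FIXED(*fc, fl, 2);
    fc->arg[0] = arg1;
    fc->arg[1] = arg2;
    fc->argnull[0] = false;
    fc->argnull[1] = false;
}
```

In the unrolled form the argument count is fixed in the source, so the compiler sees a handful of plain stores rather than a loop it may or may not unroll. The cost is one hand-written initializer per argument count.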

-- 
Darcy Buskermolen
Wavefire Technologies Corp.
ph: 250.717.0200
fx:  250.763.1759
http://www.wavefire.com

