Re: FunctionCallN improvement. - Mailing list pgsql-hackers

From Tom Lane
Subject Re: FunctionCallN improvement.
Date
Msg-id 19054.1107293036@sss.pgh.pa.us
In response to Re: FunctionCallN improvement.  (a_ogawa <a_ogawa@hi-ho.ne.jp>)
List pgsql-hackers
a_ogawa <a_ogawa@hi-ho.ne.jp> writes:
> I made the test program to measure the effect of this macro.

Well, if we're going to be tense about this, let's actually be tense
about it.  Your test program isn't a great model for what's going to
happen in fmgr.c, because you've designed it so that Nargs cannot be
known at compile time.  In the fmgr routines, Nargs is certainly a
compile-time constant, and so implementations that can exploit that
will have an advantage.
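
To make the compile-time-constant point concrete, here is a minimal
sketch; it is not the fmgr.c code.  DemoCallInfo, DEMO_MAX_ARGS and
DEMO_INIT_ARGNULL are hypothetical stand-ins I made up for the example,
and whether a given compiler actually unrolls the constant case is
something to check in its assembly output, not something this guarantees.

```c
#include <stdbool.h>

#define DEMO_MAX_ARGS 32        /* hypothetical stand-in for the real limit */

/* Hypothetical stand-in for the relevant part of FunctionCallInfoData. */
typedef struct DemoCallInfo
{
    short   nargs;
    bool    argnull[DEMO_MAX_ARGS];
} DemoCallInfo;

/* Loop-based initializer in the style of the original macro. */
#define DEMO_INIT_ARGNULL(info, n)              \
    do {                                        \
        int     i_;                             \
        (info)->nargs = (n);                    \
        for (i_ = 0; i_ < (n); i_++)            \
            (info)->argnull[i_] = false;        \
    } while (0)

/* Nargs is a literal here: the trip count is visible to the compiler. */
void demo_init_constant(DemoCallInfo *info)
{
    DEMO_INIT_ARGNULL(info, 2);
}

/* Nargs arrives at run time: the trip count is unknown when compiling. */
void demo_init_runtime(DemoCallInfo *info, int nargs)
{
    DEMO_INIT_ARGNULL(info, nargs);
}
```

The fmgr routines look like demo_init_constant; a benchmark that passes
the count through a variable measures something closer to
demo_init_runtime.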

Also, we can take advantage of some improvements in the MemSet macro
family that occurred since fmgr.c was last rewritten.  I see no reason
not to use MemSetLoop directly, since the fcinfo struct will have the
correct size and correct alignment.

In addition to your original macro, I tried two other variants: one
that uses MemSetLoop with a loop length rounded up to the next
multiple of 4 bytes, and one that expects the argnull settings to be
written out directly, in the same style as is currently done in
FunctionCall1 and FunctionCall2.  (This amounts to unrolling the loop in
the original macro; something that could be done by the compiler given a
constant Nargs, but it seems not to be done by the compilers I tested.)
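
The rounding used for the MemSetLoop variant can be sketched as a plain
function.  This assumes each null flag occupies one byte, as the length
expression in the test program implies; rounded_argnull_bytes is a
hypothetical name used only for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Number of bytes to clear for nargs one-byte null flags, rounded up to
 * a multiple of sizeof(int32_t) so the clearing loop can store whole
 * 32-bit words.
 */
size_t rounded_argnull_bytes(size_t nargs)
{
    return sizeof(int32_t) *
        ((nargs + sizeof(int32_t) - 1) / sizeof(int32_t));
}
```

With NARGS = 2 this clears one word (4 bytes), and with NARGS = 5 it
clears two words (8 bytes), so a few flags beyond nargs get zeroed too.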

I tested two cases: NARGS = 2, which is certainly the single most
critical case, and NARGS = 5, which is probably the largest number
of arguments that we care much about.  (You have to hand-edit
the test program and recompile to adjust NARGS, since the point is to
treat it as a compile-time constant.)

Here are wall-clock timings on the architectures and compilers I have at
hand:

NARGS = 2
                 MemSetLoop   OrigMacro   SetMacro   Unrolled
i386, gcc -O2    37.655s      6.411s      7.060s     6.362s
i386, gcc -O6    35.420s      1.129s      1.814s     0.567s
PPC, gcc -O2     54.033s      6.754s      11.138s    6.438s
HPPA, gcc -O2    58.82s       10.38s      9.79s      7.85s
HPPA, cc +O2     60.39s       13.43s      8.40s      7.31s

NARGS = 5
                 MemSetLoop   OrigMacro   SetMacro   Unrolled
i386, gcc -O2    37.566s      11.329s     7.688s     8.874s
i386, gcc -O6    32.992s      5.928s      2.881s     0.566s
PPC, gcc -O2     86.300s      19.048s     14.626s    8.751s
HPPA, gcc -O2    58.28s       15.09s      13.42s     14.37s
HPPA, cc +O2     58.23s       8.96s       12.88s     7.28s

(I used different loop counts on the different machines to get similar
overall times for the MemSetLoop case; so it's OK to compare numbers
across a row but not down a column.)
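
Since only within-row comparisons are meaningful, one way to read the
table is to divide each row's MemSetLoop time by the time of another
variant in the same row.  The sketch below does that for the NARGS = 2
figures; TimingRow, nargs2_rows and speedup_vs_memsetloop are
hypothetical names for this illustration, and the numbers are copied
from the table above.

```c
#include <stddef.h>

/* One row of the NARGS = 2 table above; all times in seconds. */
typedef struct TimingRow
{
    const char *platform;
    double      memsetloop;
    double      origmacro;
    double      setmacro;
    double      unrolled;
} TimingRow;

const TimingRow nargs2_rows[] = {
    {"i386, gcc -O2", 37.655, 6.411, 7.060, 6.362},
    {"i386, gcc -O6", 35.420, 1.129, 1.814, 0.567},
    {"PPC, gcc -O2", 54.033, 6.754, 11.138, 6.438},
    {"HPPA, gcc -O2", 58.82, 10.38, 9.79, 7.85},
    {"HPPA, cc +O2", 60.39, 13.43, 8.40, 7.31},
};

const size_t nargs2_nrows = sizeof(nargs2_rows) / sizeof(nargs2_rows[0]);

/*
 * Ratio of the MemSetLoop time to another variant's time from the SAME
 * row.  Mixing values from different rows would be meaningless, because
 * each machine ran a different loop count.
 */
double speedup_vs_memsetloop(double memsetloop_s, double variant_s)
{
    return memsetloop_s / variant_s;
}
```

For example, speedup_vs_memsetloop(nargs2_rows[0].memsetloop,
nargs2_rows[0].unrolled) gives the factor by which the unrolled code
beat full-struct MemSetLoop on i386 with gcc -O2.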

Based on this I think we ought to go with the "unrolled" approach, i.e.,
we'll create a macro to initialize the fixed fields of fcinfo but fill
in the arg and argnull arrays with code like what's already in
FunctionCall2:

    fcinfo.arg[0] = arg1;
    fcinfo.arg[1] = arg2;
    fcinfo.argnull[0] = false;
    fcinfo.argnull[1] = false;
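
Put together, a two-argument call in this style might look roughly like
the sketch below.  It is only an illustration under stated assumptions:
DemoDatum, DemoFcinfo, DEMO_INIT_FIXED, demo_call2 and demo_add are
hypothetical stand-ins so the example compiles without the PostgreSQL
headers, and the real macro and FunctionCall2 may well differ.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_ARGS 32            /* hypothetical argument limit */

typedef uintptr_t DemoDatum;        /* hypothetical stand-in for Datum */

/* Hypothetical stand-in for FunctionCallInfoData. */
typedef struct DemoFcinfo
{
    void       *flinfo;
    void       *context;
    void       *resultinfo;
    bool        isnull;
    short       nargs;
    DemoDatum   arg[DEMO_MAX_ARGS];
    bool        argnull[DEMO_MAX_ARGS];
} DemoFcinfo;

typedef DemoDatum (*DemoFn) (DemoFcinfo *fcinfo);

/* Initialize only the fixed fields; arg[] and argnull[] are left alone. */
#define DEMO_INIT_FIXED(Fcinfo, Flinfo, Nargs)  \
    do {                                        \
        (Fcinfo)->flinfo = (Flinfo);            \
        (Fcinfo)->context = NULL;               \
        (Fcinfo)->resultinfo = NULL;            \
        (Fcinfo)->isnull = false;               \
        (Fcinfo)->nargs = (Nargs);              \
    } while (0)

/* Two-argument call: the per-argument stores are written out by hand. */
DemoDatum demo_call2(DemoFn fn, DemoDatum arg1, DemoDatum arg2)
{
    DemoFcinfo  fcinfo;

    DEMO_INIT_FIXED(&fcinfo, NULL, 2);
    fcinfo.arg[0] = arg1;
    fcinfo.arg[1] = arg2;
    fcinfo.argnull[0] = false;
    fcinfo.argnull[1] = false;

    return fn(&fcinfo);
}

/* Trivial callee used to exercise demo_call2. */
DemoDatum demo_add(DemoFcinfo *fcinfo)
{
    return fcinfo->arg[0] + fcinfo->arg[1];
}
```

Calling demo_call2(demo_add, 3, 4) touches exactly two arg slots and two
argnull slots, with no loop for the compiler to unroll or fail to unroll.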

If anyone would like to try the results on other platforms, my test
program is attached.

            regards, tom lane

#include "postgres.h"
#include "fmgr.h"

#define NARGS 2                    /* Unrolled code can handle up to 10 */

/*
 * Initialize only those fields of FunctionCallInfoData that must be
 * set before a call.
 */
#define InitFunctionCallInfoData(Fcinfo, Flinfo, Nargs)              \
    do {                                                             \
        int     i_;                                                  \
        (Fcinfo)->flinfo = Flinfo;                                   \
        (Fcinfo)->context = NULL;                                    \
        (Fcinfo)->resultinfo = NULL;                                 \
        (Fcinfo)->isnull = false;                                    \
        (Fcinfo)->nargs = Nargs;                                     \
        for(i_ = 0; i_ < Nargs; i_++) (Fcinfo)->argnull[i_] = false; \
    } while(0)

/*
 * dummyFunc exists to defeat excessive optimization.
 * Without a call to it inside the loop, gcc might hoist the
 * initialization of FunctionCallInfoData out of the loop.
 */
void dummyFunc(FunctionCallInfoData *fcinfo, int cnt)
{
    fcinfo->arg[0] = Int32GetDatum(cnt);
}

void TestMemSet(int cnt)
{
    FunctionCallInfoData fcinfo;

    printf("test MemSetLoop(%d): %d\n", NARGS, cnt);

    for(; cnt; cnt--) {
        MemSetLoop(&fcinfo, 0, sizeof(fcinfo));
        dummyFunc(&fcinfo, cnt);
    }
}

void TestOrigMacro(int cnt)
{
    FunctionCallInfoData fcinfo;

    printf("test OrigMacro(%d): %d\n", NARGS, cnt);

    for(; cnt; cnt--) {
        InitFunctionCallInfoData(&fcinfo, NULL, NARGS);
        dummyFunc(&fcinfo, cnt);
    }
}

#undef InitFunctionCallInfoData

#define InitFunctionCallInfoData(Fcinfo, Flinfo, Nargs)              \
    do {                                                             \
        (Fcinfo)->flinfo = Flinfo;                                   \
        (Fcinfo)->context = NULL;                                    \
        (Fcinfo)->resultinfo = NULL;                                 \
        (Fcinfo)->isnull = false;                                    \
        (Fcinfo)->nargs = Nargs;                                     \
        MemSetLoop((Fcinfo)->argnull, 0,                             \
                   sizeof(int32) *                                   \
                   (((Nargs) + sizeof(int32) - 1) / sizeof(int32))); \
    } while(0)

void TestSetMacro(int cnt)
{
    FunctionCallInfoData fcinfo;

    printf("test SetMacro(%d): %d\n", NARGS, cnt);

    for(; cnt; cnt--) {
        InitFunctionCallInfoData(&fcinfo, NULL, NARGS);
        dummyFunc(&fcinfo, cnt);
    }
}

#undef InitFunctionCallInfoData

#define InitFunctionCallInfoData(Fcinfo, Flinfo, Nargs)              \
    do {                                                             \
        (Fcinfo)->flinfo = Flinfo;                                   \
        (Fcinfo)->context = NULL;                                    \
        (Fcinfo)->resultinfo = NULL;                                 \
        (Fcinfo)->isnull = false;                                    \
        (Fcinfo)->nargs = Nargs;                                     \
    } while(0)

void TestUnrolled(int cnt)
{
    FunctionCallInfoData fcinfo;

    printf("test Unrolled(%d): %d\n", NARGS, cnt);

    for(; cnt; cnt--) {
        InitFunctionCallInfoData(&fcinfo, NULL, NARGS);
#if NARGS > 0
        fcinfo.argnull[0] = false;
#endif
#if NARGS > 1
        fcinfo.argnull[1] = false;
#endif
#if NARGS > 2
        fcinfo.argnull[2] = false;
#endif
#if NARGS > 3
        fcinfo.argnull[3] = false;
#endif
#if NARGS > 4
        fcinfo.argnull[4] = false;
#endif
#if NARGS > 5
        fcinfo.argnull[5] = false;
#endif
#if NARGS > 6
        fcinfo.argnull[6] = false;
#endif
#if NARGS > 7
        fcinfo.argnull[7] = false;
#endif
#if NARGS > 8
        fcinfo.argnull[8] = false;
#endif
#if NARGS > 9
        fcinfo.argnull[9] = false;
#endif
        dummyFunc(&fcinfo, cnt);
    }
}

int main(int argc, char **argv)
{
    int     test_cnt;

    if(argc != 3) {
        printf("usage: fmgrtest -memset|-origmacro|-setmacro|-unrolled test_cnt\n");
        return 1;
    }
    test_cnt = atoi(argv[2]);

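    if(strcmp(argv[1], "-memset") == 0) TestMemSet(test_cnt);
    else if(strcmp(argv[1], "-origmacro") == 0) TestOrigMacro(test_cnt);
    else if(strcmp(argv[1], "-setmacro") == 0) TestSetMacro(test_cnt);
    else if(strcmp(argv[1], "-unrolled") == 0) TestUnrolled(test_cnt);
    else {
        /* Unknown test name: report it instead of silently doing nothing */
        printf("unknown test: %s\n", argv[1]);
        return 1;
    }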

    return 0;
}
