Re: spinlocks on powerpc - Mailing list pgsql-hackers

From Tom Lane
Subject Re: spinlocks on powerpc
Date
Msg-id 29367.1325480634@sss.pgh.pa.us
Whole thread Raw
In response to Re: spinlocks on powerpc  (Manabu Ori <manabu.ori@gmail.com>)
Responses Re: spinlocks on powerpc  (Manabu Ori <manabu.ori@gmail.com>)
Re: spinlocks on powerpc  (Manabu Ori <manabu.ori@gmail.com>)
Re: spinlocks on powerpc  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Manabu Ori <manabu.ori@gmail.com> writes:
> I recreated the patch as you advised.

Hmm, guess I wasn't clear --- we still need a configure test, since even
if we are on PPC64 there's no guarantee that the assembler will accept
the hint bit.

I revised the patch to include a configure test and committed it.
However, I omitted the part that added an unlocked test in TAS_SPIN,
because (1) that's logically a separate change, and (2) in my testing
the unlocked test produces a small but undeniable performance loss
(see numbers below).  We need to investigate a bit more to understand
why I'm getting results different from yours.  If the bottom line is
that the unlocked test loses for smaller numbers of processors and only
helps with lots of them, I have to question whether it's a good idea to
apply it.

>> Shouldn't we just make slock_t be "int" for both PPC and PPC64?

> I'd like it to be untouched for this TAS_SPIN for powerpc
> discussion, since it seems it remainds like this for several
> years and maybe it needs some more careful consideration

I ran a test and could not see any consistent performance difference
between 4-byte and 8-byte slock_t, so I've committed that change too.
Obviously that can be revisited if anyone comes up with evidence in
the other direction.

While I was looking at this, I noticed that PPC ISA 2.03 and later
recommend use of "lwsync" rather than "isync" and "sync" in lock
acquisition and release, and sure enough I can measure improvement
from making that change too.  So again the problem is to know whether
it's safe to use that instruction.  Googling shows that there's at
least one current 32-bit PPC chip that gives SIGILL (Freescale's
E500 ... thanks for nothing, Freescale ...); but at least some projects
are using 64-bitness as a proxy test for whether it's safe to use
lwsync.  So for the moment I've also committed a patch that switches
to using lwsync on PPC64.  We can perhaps improve on that too, but
it's got basically the same issues as the hint bit with respect to
how to know at compile time whether the instruction is safe at run time.

I would be interested to see results from your 750 Express machine
as to the performance impact of each of these successive patches,
and then perhaps the TAS_SPIN change on top of that.

While working on this, I repeated the tests I did in
http://archives.postgresql.org/message-id/8292.1314641721@sss.pgh.pa.us

With current git head, I get:

pgbench -c 1 -j 1 -S -T 300 bench    tps = 8703.264346 (including connections establishing)
pgbench -c 2 -j 1 -S -T 300 bench    tps = 12207.827348 (including connections establishing)
pgbench -c 8 -j 4 -S -T 300 bench    tps = 48593.999965 (including connections establishing)
pgbench -c 16 -j 8 -S -T 300 bench    tps = 91155.555180 (including connections establishing)
pgbench -c 32 -j 16 -S -T 300 bench    tps = 124648.093971 (including connections establishing)
pgbench -c 64 -j 32 -S -T 300 bench    tps = 129488.449355 (including connections establishing)
pgbench -c 96 -j 48 -S -T 300 bench    tps = 124958.553086 (including connections establishing)
pgbench -c 128 -j 64 -S -T 300 bench    tps = 134195.370726 (including connections establishing)

(It's depressing that these numbers have hardly moved since August ---
at least on this test, the work that Robert's done has not made any
difference.)  These numbers are repeatable in the first couple of
digits, but there's some noise in the third digit.

With your patch (hint bit and TAS_SPIN change) I get:

pgbench -c 1 -j 1 -S -T 300 bench    tps = 8751.930270 (including connections establishing)
pgbench -c 2 -j 1 -S -T 300 bench    tps = 12211.160964 (including connections establishing)
pgbench -c 8 -j 4 -S -T 300 bench    tps = 48608.877131 (including connections establishing)
pgbench -c 16 -j 8 -S -T 300 bench    tps = 90827.234014 (including connections establishing)
pgbench -c 32 -j 16 -S -T 300 bench    tps = 123267.062954 (including connections establishing)
pgbench -c 64 -j 32 -S -T 300 bench    tps = 128951.585059 (including connections establishing)
pgbench -c 96 -j 48 -S -T 300 bench    tps = 126551.870909 (including connections establishing)
pgbench -c 128 -j 64 -S -T 300 bench    tps = 133311.793735 (including connections establishing)

With the TAS_SPIN change only, no hint bit:

pgbench -c 1 -j 1 -S -T 300 bench    tps = 8764.703599 (including connections establishing)
pgbench -c 2 -j 1 -S -T 300 bench    tps = 12163.321040 (including connections establishing)
pgbench -c 8 -j 4 -S -T 300 bench    tps = 48580.673497 (including connections establishing)
pgbench -c 16 -j 8 -S -T 300 bench    tps = 90672.227488 (including connections establishing)
pgbench -c 32 -j 16 -S -T 300 bench    tps = 121604.634146 (including connections establishing)
pgbench -c 64 -j 32 -S -T 300 bench    tps = 129088.387379 (including connections establishing)
pgbench -c 96 -j 48 -S -T 300 bench    tps = 126291.854733 (including connections establishing)
pgbench -c 128 -j 64 -S -T 300 bench    tps = 133386.394587 (including connections establishing)

With the hint bit only:

pgbench -c 1 -j 1 -S -T 300 bench    tps = 8737.175661 (including connections establishing)
pgbench -c 2 -j 1 -S -T 300 bench    tps = 12208.715035 (including connections establishing)
pgbench -c 8 -j 4 -S -T 300 bench    tps = 48655.181597 (including connections establishing)
pgbench -c 16 -j 8 -S -T 300 bench    tps = 91365.152223 (including connections establishing)
pgbench -c 32 -j 16 -S -T 300 bench    tps = 124867.442389 (including connections establishing)
pgbench -c 64 -j 32 -S -T 300 bench    tps = 129795.644368 (including connections establishing)
pgbench -c 96 -j 48 -S -T 300 bench    tps = 126515.035998 (including connections establishing)
pgbench -c 128 -j 64 -S -T 300 bench    tps = 134551.825202 (including connections establishing)

So it looks to me like the TAS_SPIN change is a loser either by itself
or with the hint bit.

I then tried substituting LWSYNC for SYNC in S_UNLOCK(lock):

pgbench -c 1 -j 1 -S -T 300 bench    tps = 8807.059517 (including connections establishing)
pgbench -c 2 -j 1 -S -T 300 bench    tps = 12204.028897 (including connections establishing)
pgbench -c 8 -j 4 -S -T 300 bench    tps = 49051.003729 (including connections establishing)
pgbench -c 16 -j 8 -S -T 300 bench    tps = 91904.111604 (including connections establishing)
pgbench -c 32 -j 16 -S -T 300 bench    tps = 125049.367820 (including connections establishing)
pgbench -c 64 -j 32 -S -T 300 bench    tps = 130259.490608 (including connections establishing)
pgbench -c 96 -j 48 -S -T 300 bench    tps = 125581.037607 (including connections establishing)
pgbench -c 128 -j 64 -S -T 300 bench    tps = 135143.761032 (including connections establishing)

and finally LWSYNC for both SYNC and ISYNC:

pgbench -c 1 -j 1 -S -T 300 bench    tps = 8865.769698 (including connections establishing)
pgbench -c 2 -j 1 -S -T 300 bench    tps = 12278.078258 (including connections establishing)
pgbench -c 8 -j 4 -S -T 300 bench    tps = 49172.415634 (including connections establishing)
pgbench -c 16 -j 8 -S -T 300 bench    tps = 92229.289211 (including connections establishing)
pgbench -c 32 -j 16 -S -T 300 bench    tps = 125466.790383 (including connections establishing)
pgbench -c 64 -j 32 -S -T 300 bench    tps = 129422.631959 (including connections establishing)
pgbench -c 96 -j 48 -S -T 300 bench    tps = 124240.447533 (including connections establishing)
pgbench -c 128 -j 64 -S -T 300 bench    tps = 133892.054917 (including connections establishing)

That last is clearly a winner for reasonable numbers of processes,
so I committed it that way, but I'm a little worried by the fact that it
looks like it might be a marginal loss when the system is overloaded.
I would like to see results from your machine.
        regards, tom lane


pgsql-hackers by date:

Previous
From: Pavel Stehule
Date:
Subject: Re: PL/pgSQL return value in after triggers
Next
From: "chris r."
Date:
Subject: Re: SEGFAULT on SELECT * FROM view