Re: spinlocks on HP-UX - Mailing list pgsql-hackers

From: Tatsuo Ishii
Subject: Re: spinlocks on HP-UX
Date: 2011-09-06 17:33:11
Msg-id: 20110906.173311.1317184787263658707.t-ishii@sraoss.co.jp
In response to: spinlocks on HP-UX (Robert Haas <robertmhaas@gmail.com>)
Responses: Re: spinlocks on HP-UX
List: pgsql-hackers

Hi,

I am interested in this thread because I may be able to borrow a big
IBM machine and run some tests on it, if that would help with
enhancing PostgreSQL. Is there anything I can do here?
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

> I was able to obtain access to a 32-core HP-UX server.  I repeated the
> pgbench -S testing that I have previously done on Linux, and found
> that the results were not too good.  Here are the results at scale
> factor 100, on 9.2devel, with various numbers of clients.  Five minute
> runs, shared_buffers=8GB.
> 
> 1:tps = 5590.070816 (including connections establishing)
> 8:tps = 37660.233932 (including connections establishing)
> 16:tps = 67366.099286 (including connections establishing)
> 32:tps = 82781.624665 (including connections establishing)
> 48:tps = 18589.995074 (including connections establishing)
> 64:tps = 16424.661371 (including connections establishing)
> 
> And just for comparison, here are the numbers at scale factor 1000:
> 
> 1:tps = 4751.768608 (including connections establishing)
> 8:tps = 33621.474490 (including connections establishing)
> 16:tps = 58959.043171 (including connections establishing)
> 32:tps = 78801.265189 (including connections establishing)
> 48:tps = 21635.234969 (including connections establishing)
> 64:tps = 18611.863567 (including connections establishing)
> 
> After mulling over the vmstat output for a bit, I began to suspect
> spinlock contention.  I took a look at a document called "Implementing
> Spinlocks on the Intel Itanium Architecture and PA-RISC", by Tor
> Ekqvist and David Graves, available via the HP web site, which
> states that when spinning on a spinlock on these machines, you should
> use a regular, unlocked test first and use the atomic test only when
> the unlocked test looks OK.  I tried implementing this in two ways,
> and both produced results which are FAR superior to our current
> implementation.  First, I did this:
> 
> --- a/src/include/storage/s_lock.h
> +++ b/src/include/storage/s_lock.h
> @@ -726,7 +726,7 @@ tas(volatile slock_t *lock)
>  typedef unsigned int slock_t;
> 
>  #include <ia64/sys/inline.h>
> -#define TAS(lock) _Asm_xchg(_SZ_W, lock, 1, _LDHINT_NONE)
> +#define TAS(lock) (*(lock) ? 1 : _Asm_xchg(_SZ_W, lock, 1, _LDHINT_NONE))
> 
>  #endif /* HPUX on IA64, non gcc */
> 
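
What Robert describes is the classic test-and-test-and-set loop: spin
on a plain load, which stays in the local cache, and only attempt the
expensive atomic exchange once the lock looks free. A minimal portable
sketch using GCC's __sync builtins (my illustration; the HP-UX build
uses the _Asm_xchg intrinsic above instead):

    /* Test-and-test-and-set sketch; illustration only. */
    typedef volatile unsigned int slock_t;

    static void
    spin_lock(slock_t *lock)
    {
        for (;;)
        {
            /*
             * Plain read first: waiters spin on a shared cache line
             * instead of ping-ponging it with atomic operations.
             */
            while (*lock != 0)
                ;               /* real code would SPIN_DELAY() here */

            /* Lock looked free; now try the atomic exchange. */
            if (__sync_lock_test_and_set(lock, 1) == 0)
                return;         /* acquired */
        }
    }

    static void
    spin_unlock(slock_t *lock)
    {
        __sync_lock_release(lock);
    }

The win comes from cache behavior: a failed atomic exchange still takes
the cache line in exclusive mode, so with many waiters the line
ping-pongs between CPUs, while a plain read lets everyone spin on a
shared copy and only contend when the holder releases the lock.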
> That resulted in these numbers.  Scale factor 100:
> 
> 1:tps = 5569.911714 (including connections establishing)
> 8:tps = 37365.364468 (including connections establishing)
> 16:tps = 63596.261875 (including connections establishing)
> 32:tps = 95948.157678 (including connections establishing)
> 48:tps = 90708.253920 (including connections establishing)
> 64:tps = 100109.065744 (including connections establishing)
> 
> Scale factor 1000:
> 
> 1:tps = 4878.332996 (including connections establishing)
> 8:tps = 33245.469907 (including connections establishing)
> 16:tps = 56708.424880 (including connections establishing)
> 48:tps = 69652.232635 (including connections establishing)
> 64:tps = 70593.208637 (including connections establishing)
> 
> Then, I did this:
> 
> --- a/src/backend/storage/lmgr/s_lock.c
> +++ b/src/backend/storage/lmgr/s_lock.c
> @@ -96,7 +96,7 @@ s_lock(volatile slock_t *lock, const char *file, int line)
>         int                     delays = 0;
>         int                     cur_delay = 0;
> 
> -       while (TAS(lock))
> +       while (*lock ? 1 : TAS(lock))
>         {
>                 /* CPU-specific delay each time through the loop */
>                 SPIN_DELAY();
> 
> That resulted in these numbers, at scale factor 100:
> 
> 1:tps = 5564.059494 (including connections establishing)
> 8:tps = 37487.090798 (including connections establishing)
> 16:tps = 66061.524760 (including connections establishing)
> 32:tps = 96535.523905 (including connections establishing)
> 48:tps = 92031.618360 (including connections establishing)
> 64:tps = 106813.631701 (including connections establishing)
> 
> And at scale factor 1000:
> 
> 1:tps = 4980.338246 (including connections establishing)
> 8:tps = 33576.680072 (including connections establishing)
> 16:tps = 55618.677975 (including connections establishing)
> 32:tps = 73589.442746 (including connections establishing)
> 48:tps = 70987.026228 (including connections establishing)
> 
> Not sure why I am missing the 64-client results for that last set of
> tests, but no matter.
> 
> Of course, we can't apply the second patch as it stands, because I
> tested it on x86 and it loses.  But it seems pretty clear we need to
> do it at least for this architecture...
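
One way to make it per-architecture, for what it's worth, would be to
route the contended wait loop through a macro that individual ports can
override, defaulting to plain TAS(). A sketch (the TAS_SPIN name is my
invention, not something in the tree):

    /* s_lock.h, HP-UX ia64 section: unlocked test before the exchange */
    #define TAS_SPIN(lock)  (*(lock) ? 1 : TAS(lock))

    /* s_lock.h, near the end: ports that don't override get plain TAS */
    #ifndef TAS_SPIN
    #define TAS_SPIN(lock)  TAS(lock)
    #endif

    /* s_lock.c: only the retry loop in s_lock() changes */
    while (TAS_SPIN(lock))
    {
        /* CPU-specific delay each time through the loop */
        SPIN_DELAY();
    }

That would keep the initial, uncontended TAS() attempt unchanged
everywhere and add the plain read only on the spin path, and only on
platforms that ask for it.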
> 
> -- 
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
> 

