Re: update i386 spinlock for hyperthreading - Mailing list pgsql-patches

From Tom Lane
Subject Re: update i386 spinlock for hyperthreading
Date
Msg-id 13783.1072559515@sss.pgh.pa.us
In response to Re: update i386 spinlock for hyperthreading  (Manfred Spraul <manfred@colorfullife.com>)
Responses Re: update i386 spinlock for hyperthreading
Re: update i386 spinlock for hyperthreading
List pgsql-patches

Manfred Spraul <manfred@colorfullife.com> writes:
> My guess: a Pentium 4 CPU supports something like 250 uops in flight - it
> will have a dozen of the spinlock loops in its pipeline. When the
> spinlock is released, it must figure out which of the loops should get
> it, and gets lost. My guess is that rep;nop delays the CPU by at least
> 100 CPU ticks, and thus the pipeline will be empty before it proceeds.

After digging some more in Intel's documentation, it seems that indeed
PAUSE is defined to delay just long enough to empty the pipeline.  So it
doesn't really matter where you put it in the wait loop, and there is no
point in inserting it in the success path; that answers my concerns from
before.
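
To illustrate that placement, here is a minimal sketch using C11 atomics
(not the committed s_lock.h code). PAUSE is issued only after a failed
attempt, never on the success path. The names cpu_relax, demo_lock,
demo_lock_acquire and demo_lock_release are made up for this example, and
the "rep; nop" spelling assumes a GCC-compatible compiler on x86.

```c
#include <stdatomic.h>

/* Hypothetical helper: PAUSE is encoded as "rep; nop" on x86.
 * On other architectures this sketch simply does nothing. */
static inline void cpu_relax(void)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ __volatile__("rep; nop" ::: "memory");
#endif
}

/* Hypothetical lock type for illustration only. */
typedef atomic_flag demo_lock;

static inline void demo_lock_acquire(demo_lock *lock)
{
    /* test_and_set returns the previous value: true means "already held". */
    while (atomic_flag_test_and_set_explicit(lock, memory_order_acquire))
        cpu_relax();    /* failure path only; success falls straight through */
}

static inline void demo_lock_release(demo_lock *lock)
{
    atomic_flag_clear_explicit(lock, memory_order_release);
}
```

An uncontended acquire never executes the PAUSE, so the fast path pays
nothing for it.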

> There was a w_spinlock.pdf document with reference code. Google still
> finds it, but the links are dead :-(

I was able to find it as a link from another application note at Intel's
documentation site.  Try going to
http://appzone.intel.com/literature/index.asp and searching for AP-949.


Anyway, I've committed your patch with some changes.

> The 2nd thing I would change is to add a nonatomic test in the slow
> path: locked instructions generate lots of bus traffic, and that's a
> waste of resources.

Agreed, but I did not like the way you did it; this concern does not
necessarily apply to all processors, and since we are not using
S_LOCK_FREE at all, it's dubious that it's correctly implemented
everywhere.  I modified the IA32 TAS() macro instead.
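
The idea can be sketched with C11 atomics rather than the inline assembly
actually used: do a plain read first, and only issue the locked exchange
when the lock looks free. This is an illustration, not the committed
macro; tas_demo and slock_demo are hypothetical names, and returning 0 on
success is assumed here to mirror the usual TAS() convention.

```c
#include <stdatomic.h>

/* Hypothetical one-byte lock type: 0 = free, 1 = held. */
typedef atomic_uchar slock_demo;

/* Returns 0 if the lock was acquired, nonzero if it was already held. */
static inline int tas_demo(slock_demo *lock)
{
    /* Non-atomic-style test: an ordinary load that generates no locked
     * bus cycle.  If the lock is visibly held, give up immediately. */
    if (atomic_load_explicit(lock, memory_order_relaxed) != 0)
        return 1;

    /* The lock looked free, so now pay for the locked exchange.  Another
     * CPU may have won the race in between, hence the return value. */
    return atomic_exchange_explicit(lock, 1, memory_order_acquire);
}
```

A waiter that spins on tas_demo() mostly reads its own cached copy of the
lock byte and issues a locked instruction only when the byte reads as free.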

BTW, I noticed a lot of concern in the Intel app notes about reserving
64 or even 128 bytes for each spinlock to avoid cache line conflicts.
That seems excessive to me (we use a lot of spinlocks for buffers), but
perhaps it is worth looking into.
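
To put a number on the cost, here is a hedged sketch of what padding each
lock to its own cache line would look like in C11. The 64-byte line size
is an assumption taken from the smaller figure above, and
padded_lock_demo, DEMO_CACHE_LINE and demo_lock_array_bytes are
hypothetical names, not anything in the PostgreSQL tree.

```c
#include <stdalign.h>
#include <stddef.h>

#define DEMO_CACHE_LINE 64   /* assumed cache line size in bytes */

/* Aligning the one-byte lock to the line size also rounds the struct
 * size up to a full line, so adjacent array elements never share one. */
typedef struct
{
    alignas(DEMO_CACHE_LINE) volatile unsigned char lock;
} padded_lock_demo;

/* Memory consumed by n padded locks. */
static inline size_t demo_lock_array_bytes(size_t n)
{
    return n * sizeof(padded_lock_demo);
}
```

With these assumptions, 1000 per-buffer spinlocks would occupy 64000 bytes
instead of 1000, which is the trade-off in question.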


> Is there an easy way to find out which LWLock is contended?

Not from oprofile output, as far as I can think.  I've suspected for
some time that the BufMgrLock is a major bottleneck, but have no proof.

            regards, tom lane
