Re: Wierd context-switching issue on Xeon - Mailing list pgsql-performance
From | Dave Cramer |
---|---|
Subject | Re: Wierd context-switching issue on Xeon |
Date | |
Msg-id | 1082574808.1558.243.camel@localhost.localdomain |
In response to | Re: Wierd context-switching issue on Xeon (Paul Tuckfield <paul@tuckfield.com>) |
Responses | Re: Wierd context-switching issue on Xeon patch for 7.4.1 |
List | pgsql-performance |
FYI, I am doing my testing on non-hyperthreading dual Athlons.

Also, the test-and-set is attempting to set the same resource, and not
simply a bit. It's really a lock;xchg in assembler.

Also, we are using the PAUSE mnemonic, so we should not be seeing any
cache coherency issues, as the cache is being taken out of the picture,
AFAICS?

Dave

On Wed, 2004-04-21 at 14:19, Paul Tuckfield wrote:
> Dave:
>
> Why would test and set increase context switches:
> Note that it *does not increase* context switches when the two threads
> are on the two cores of a single Xeon processor. (use taskset to force
> affinity on linux)
>
> Scenario:
> If the two test and set processes are testing and setting the same bit
> as each other, then they'll see worst case cache coherency misses.
> They'll ping a cache line back and forth between CPUs. Another case
> might be that they're testing and setting different bits or words, but
> those bits or words are always in the same cache line, again causing
> worst case cache coherency misses. The fact that this doesn't
> happen when the threads are bound to the 2 cores of a single Xeon
> suggests it's because they're now sharing L1 cache. No pings/bounces.
>
> I wonder, do the threads stall so badly when pinging cache lines back
> and forth that the kernel sees it as an opportunity to put the
> process to sleep? Or do these worst case misses cause an interrupt?
>
> My question is: What is it that the two threads are waiting for when
> they spin? Is it exactly the same resource, or two resources that
> happen to have test-and-set flags in the same cache line?
>
> On Apr 20, 2004, at 7:41 PM, Dave Cramer wrote:
>
> > I modified the code in s_lock.c to remove the spins
> >
> > #define SPINS_PER_DELAY 1
> >
> > and it doesn't exhibit the behaviour
> >
> > This effectively changes the code to
> >
> > while(TAS(lock))
> >     select(10000); // 10ms
> >
> > Can anyone explain why executing TAS 100 times would increase context
> > switches ?
> >
> > Dave
> >
> > On Tue, 2004-04-20 at 12:59, Josh Berkus wrote:
> >> Anjan,
> >>
> >>> Quad 2.0GHz XEON with highest load we have seen on the applications,
> >>> DB performing great -
> >>
> >> Can you run Tom's test? It takes a particular pattern of data
> >> access to reproduce the issue.
> > --
> > Dave Cramer
> > 519 939 0336
> > ICQ # 14675561
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 8: explain analyze is your friend
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: the planner will ignore your desire to choose an index scan if your
> joining column's datatypes do not match

--
Dave Cramer
519 939 0336
ICQ # 14675561
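For readers following the thread, the spin-then-sleep loop being discussed can be sketched roughly as below. This is an illustrative reconstruction, not the actual code in s_lock.c: the names sketch_lock, sketch_tas, sketch_delay, sketch_lock_acquire and sketch_lock_release are hypothetical, and the test-and-set uses C11 atomic_flag rather than the hand-written lock;xchg assembler mentioned above. With spins_per_delay set to 1, the loop reduces to the "while(TAS(lock)) select(...)" form quoted in the message; with 100, each waiter retries the test-and-set 100 times between sleeps.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical lock type for illustration; not PostgreSQL's slock_t. */
typedef struct {
    atomic_flag held;
} sketch_lock;

#define SKETCH_LOCK_INIT { ATOMIC_FLAG_INIT }

/* Test-and-set: returns nonzero if the lock was already held. */
static int sketch_tas(sketch_lock *lock)
{
    return atomic_flag_test_and_set_explicit(&lock->held,
                                             memory_order_acquire);
}

/* Sleep for roughly 10 ms by calling select() with no descriptors. */
static void sketch_delay(void)
{
    struct timeval tv = { 0, 10000 };   /* 0 s, 10000 us */
    select(0, NULL, NULL, NULL, &tv);
}

/*
 * Acquire the lock, retrying the test-and-set up to spins_per_delay
 * times before each sleep. Returns how many times the caller slept.
 */
static long sketch_lock_acquire(sketch_lock *lock, int spins_per_delay)
{
    long sleeps = 0;
    int spins = 0;

    while (sketch_tas(lock)) {
        if (++spins >= spins_per_delay) {
            sketch_delay();
            sleeps++;
            spins = 0;
        }
    }
    return sleeps;
}

/* Release the lock so another waiter's test-and-set can succeed. */
static void sketch_lock_release(sketch_lock *lock)
{
    atomic_flag_clear_explicit(&lock->held, memory_order_release);
}
```

Calling sketch_lock_acquire(&lock, 1) and sketch_lock_acquire(&lock, 100) from two contending processes or threads is one way to compare the two settings discussed in the thread. The sketch makes no claim about which setting produces more context switches.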