I wrote:
> I'll do tomorrow morning (CEST, i.e. in about 11 hours).
These are the tests with the change:
> if ((--spins % MAX_SPINS_PER_DELAY) == 0)
>
> to
>
> if (--spins == 0)
I have called the resulting patch (spin-delay + this change) spin-delay-2.
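
To make the difference between the two conditions concrete, here is a minimal
standalone sketch. It is not the patch itself: the helper names are made up
for illustration, and the value of MAX_SPINS_PER_DELAY is an assumption. It
only counts how often each condition fires while a counter is decremented.

```c
#include <stdio.h>

/* Assumed value, for illustration only; the patch may use another one. */
#define MAX_SPINS_PER_DELAY 1000

/*
 * Hypothetical helper: count how often the spin-delay condition fires
 * while "spins" is decremented "iterations" times, starting at "start".
 */
static int
count_modulo_hits(int start, int iterations)
{
    int spins = start;
    int hits = 0;
    int i;

    for (i = 0; i < iterations; i++)
    {
        /* fires on every multiple of MAX_SPINS_PER_DELAY, including 0 */
        if ((--spins % MAX_SPINS_PER_DELAY) == 0)
            hits++;
    }
    return hits;
}

/* Hypothetical helper: the same loop with the spin-delay-2 condition. */
static int
count_zero_hits(int start, int iterations)
{
    int spins = start;
    int hits = 0;
    int i;

    for (i = 0; i < iterations; i++)
    {
        /* fires only when the counter reaches exactly 0 */
        if (--spins == 0)
            hits++;
    }
    return hits;
}
```

Counting down from a start value at or below MAX_SPINS_PER_DELAY, both
conditions fire exactly once, when the counter hits 0. They only differ when
the counter starts above that value, where the modulo version also fires at
each intermediate multiple.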
again with only slock-no-cmpb applied
1: 55s 4: 119s (cpu ~97%)
with only spin-delay-2 applied
1: 56s 4: 125s (cpu ~99.5%)
with slock-no-cmpb and spin-delay-2 applied
1: 56s 4: 125s (cpu ~100%)
(Note that the cpu averages are not calculated but estimated by looking at
vmstat.)
Yesterday's results:
> CVS tip from 2005-09-12 ~16:00
> 1: 57s 2: 82s 4: 124s 8: 237s
>
> with only slock-no-cmpb.patch applied
> 1: 55s 2: 79s 4: 119s 8: 229s
>
> with only spin-delay.patch applied
> 1: 56s 2: 79s 4: 124s 8: 235s
>
> with both patches applied
> 1: 55s 2: 78s 4: 124s 8: 235s
To get more data, I have retested the patches on a single-cpu Intel P4
3GHz w/ HT (i.e. 2 virtual cpus), no EM64T. Comparing to the 2.4 GHz dual-Xeon
results, it's clear that this is in reality only one cpu. While the runtime
for N=1 is better than on the other system, for N=4 it's already worse. The
situation with the patches is quite different, though. Unfortunately.
CVS tip from 2005-09-12:
1: 36s 2: 77s (cpu ~85%) 4: 159s (cpu ~98%)
only slock-no-cmpb:
1: 36s 2: 81s (cpu ~79%) 4: 177s (cpu ~94%)
(doesn't help this time)
only spin-delay:
1: 36s 2: 86s (cpu =100%) 4: 157s (cpu =100%)
(bad runtime for N=2 (repeatable), cpu not doing real work here?)
slock-no-cmpb and spin-delay:
1: 36s 2: 106s (cpu =100%) 4: 192s (cpu =100%)
(it gets worse)
only spin-delay-2:
1: 36s 2: 85s (cpu =100%) 4: 160s (cpu =100%)
(quite the same as spin-delay)
slock-no-cmpb and spin-delay-2:
1: 36s 2: 109s (cpu =100%) 4: 198s (cpu =100%)
(worse again)
The context-switch (CS) rate was low (20-50) for all tests, increasing for
N>2, which is to be expected. For this system I see no case for applying
these patches.
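
To put numbers on that, the runtimes can be expressed as a relative change
against CVS tip. This is just a small arithmetic sketch; pct_change is a
made-up helper name, and the inputs are the runtimes in seconds listed above.

```c
#include <stdio.h>

/*
 * Hypothetical helper: relative runtime change in percent.
 * Positive means the patched build is slower than the baseline.
 */
static double
pct_change(double baseline_secs, double patched_secs)
{
    return (patched_secs - baseline_secs) / baseline_secs * 100.0;
}
```

With slock-no-cmpb and spin-delay applied, pct_change(77.0, 106.0) gives
about 38% for N=2 and pct_change(159.0, 192.0) gives about 21% for N=4, i.e.
both cases are clearly slower than CVS tip on this machine.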
Is there a portable way to detect the CPU we are running on? Do you have any
other ideas for how to implement the delays?
Best Regards,
Michael Paesold