* Sergey Koposov (koposov@ast.cam.ac.uk) wrote:
> I did a specific test with just 6 threads (== number of cores per cpu)
> and ran it on a single phys cpu, it took ~ 12 seconds for each thread,
> and when I tried to spread it across 4 cpus it took 7-9 seconds per
> thread. But all these numbers are anyway significantly better than
> when I didn't use taskset. Which probably means without it the
> processes were jumping from core to core ? ...
Oh, and as for why 'cat' isn't really affected by this issue: that's
because 'cat' isn't *doing* anything, CPU-wise. The PG backends are
doing real work, so they have both CPU state and memory accesses, and
both are impacted when the process is moved from one core to another.
If this system is NUMA (where memory is associated with a set of
cores), moving threads between cores gets even more painful, because
the memory you were accessing (which was 'fast') suddenly takes longer
to reach: you now have to go through another CPU to get to it.
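
If you'd rather pin from inside a script than wrap everything in
taskset, here is a minimal sketch. It assumes Linux and Python 3.3+,
where os.sched_getaffinity and os.sched_setaffinity are available; the
helper name pin_to_one_cpu is made up for illustration and is not
anything PG provides. It only restricts which core the scheduler may
use, it says nothing about where the memory lives.

```python
import os


def pin_to_one_cpu(pid=0):
    """Pin a process to the lowest-numbered CPU it may currently run on.

    pid=0 means the calling process. Hypothetical helper, Linux only.
    """
    # Set of CPU ids the scheduler is currently allowed to use for pid.
    allowed = os.sched_getaffinity(pid)
    target = min(allowed)
    # Restrict the process to that single CPU so it can no longer be
    # migrated from core to core.
    os.sched_setaffinity(pid, {target})
    return target
```

Calling pin_to_one_cpu() early in a worker and then checking
os.sched_getaffinity(0) should show a single CPU id. Child processes
started afterwards inherit the same mask.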
Thanks,
Stephen