Re: Better LWLocks with compare-and-swap (9.4) - Mailing list pgsql-hackers

From: Daniel Farina
Subject: Re: Better LWLocks with compare-and-swap (9.4)
Date:
Msg-id: CAAZKuFZcepFPVievRkcEDYwehZXL6EfnCbTpc_fM+xdqF17s9g@mail.gmail.com
In response to: Better LWLocks with compare-and-swap (9.4)  (Heikki Linnakangas <hlinnakangas@vmware.com>)
Responses: Re: Better LWLocks with compare-and-swap (9.4)  (Daniel Farina <daniel@heroku.com>)
           Re: Better LWLocks with compare-and-swap (9.4)  (Heikki Linnakangas <hlinnakangas@vmware.com>)
List: pgsql-hackers
On Mon, May 13, 2013 at 5:50 AM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:
> pgbench -S is such a workload. With 9.3beta1, I'm seeing this profile, when
> I run "pgbench -S -c64 -j64 -T60 -M prepared" on a 32-core Linux machine:
>
> -  64.09%  postgres  postgres           [.] tas
>    - tas
>       - 99.83% s_lock
>          - 53.22% LWLockAcquire
>             + 99.87% GetSnapshotData
>          - 46.78% LWLockRelease
>               GetSnapshotData
>             + GetTransactionSnapshot
> +   2.97%  postgres  postgres           [.] tas
> +   1.53%  postgres  libc-2.13.so       [.] 0x119873
> +   1.44%  postgres  postgres           [.] GetSnapshotData
> +   1.29%  postgres  [kernel.kallsyms]  [k] arch_local_irq_enable
> +   1.18%  postgres  postgres           [.] AllocSetAlloc
> ...
>
> So, on this test, a lot of time is wasted spinning on the mutex of
> ProcArrayLock. If you plot a graph of TPS vs. # of clients, there is a
> surprisingly steep drop in performance once you go beyond 29 clients
> (attached, pgbench-lwlock-cas-local-clients-sets.png, red line). My theory
> is that after that point all the cores are busy, and processes start to be
> sometimes context switched while holding the spinlock, which kills
> performance.

I have; I also used Linux perf to come to this conclusion, and my
determination was similar: a system was undergoing increasingly heavy
load, in this case with processes >> number of processors.  It was
also a phase-change type of event: at one moment everything would be
going great, but once a critical threshold was hit, s_lock would
consume an enormous amount of CPU time.  At the time I figured
preemption while holding the spinlock was to blame, given the extreme
nature of the change.
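
To make the failure mode concrete, here is a minimal C11 sketch
(invented names, not PostgreSQL source and not Heikki's actual patch)
of a test-and-set-guarded release next to a compare-and-swap release
in the spirit of the thread subject:

    #include <stdatomic.h>

    /* Spinlock-guarded state, roughly the pre-patch LWLock shape. */
    typedef struct
    {
        atomic_flag mutex;   /* the tas/s_lock spinlock in the profile;
                              * initialize with ATOMIC_FLAG_INIT        */
        int         shared;  /* e.g. count of shared lock holders       */
    } spin_lwlock;

    void
    spin_lwlock_release(spin_lwlock *lock)
    {
        /*
         * Every waiter spins here.  If the process currently holding
         * the mutex is context-switched out, all of them burn CPU in
         * tas() until it is rescheduled -- the phase change described
         * above.
         */
        while (atomic_flag_test_and_set_explicit(&lock->mutex,
                                                 memory_order_acquire))
            ;                 /* real s_lock backs off and then sleeps */

        lock->shared--;       /* the protected work itself is tiny */

        atomic_flag_clear_explicit(&lock->mutex, memory_order_release);
    }

    /*
     * CAS-based alternative: the state word is updated with a single
     * atomic compare-and-swap, so there is no window in which a
     * descheduled holder strands every other backend.
     */
    void
    cas_lwlock_release(_Atomic int *shared)
    {
        int old = atomic_load(shared);

        while (!atomic_compare_exchange_weak(shared, &old, old - 1))
            ;   /* on failure 'old' is refreshed; just retry */
    }

With the CAS form, a backend that gets preempted can at worst delay
its own retry; it cannot leave everyone else spinning on a mutex it
still holds.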


