From: Robert Haas
Subject: Re: mosbench revisited
Msg-id: CA+TgmoYeS+RgQvnQEYNpA7JCjjd_0SkjSwF29Lrsy+vkGxcvrQ@mail.gmail.com
In response to: Re: mosbench revisited (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers

On Wed, Aug 3, 2011 at 5:35 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> That still seems utterly astonishing to me.  We're touching each of
> those files once per query cycle; a cycle that contains two message
> sends, who knows how many internal spinlock/lwlock/heavyweightlock
> acquisitions inside Postgres (some of which *do* contend with each
> other), and a not insignificant amount of plain old computing.
> Meanwhile, this particular spinlock inside the kernel is protecting
> what, a single doubleword fetch?  How is that the bottleneck?

Spinlocks seem to have a very ugly "tipping point".  When I tested
pgbench -S on a 64-core system with the lazy vxid patch applied and a
patch to use random_r() in lieu of random(), the amount of system time
used per SELECT-only transaction at 48 clients was 3.59 times what it
was at 4 clients.  The amount used per transaction at 52 clients was
3.63 times the amount at 48 clients, and the amount at 56 clients was
3.25 times the amount at 52 clients.  You can see the throughput graph
starting to flatten out in the 32-44 client range, but it's not
particularly alarming.  Once you pass that point, though, things get
out of control in a real hurry.  A few more clients and the machine is
basically doing nothing but spin.
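
To make the failure mode concrete, here's a minimal sketch of a
test-and-set spinlock (illustrative only - neither the kernel's
implementation nor our s_lock; the thread and iteration counts are
made up).  Every waiter burns a full CPU while it spins, so once the
lock saturates, each additional client converts directly into wasted
system time rather than throughput:

/* Compile with: gcc -std=c11 -pthread spin.c */
#include <stdatomic.h>
#include <pthread.h>
#include <stdio.h>

static atomic_flag lock = ATOMIC_FLAG_INIT;
static long counter;

static void *worker(void *arg)
{
    (void) arg;
    for (int i = 0; i < 100000; i++)
    {
        /* Spin until we grab the flag; each extra waiter adds pure
         * CPU burn here, which is the "tipping point" effect. */
        while (atomic_flag_test_and_set_explicit(&lock,
                                                 memory_order_acquire))
            ;
        counter++;              /* tiny critical section, akin to a
                                 * single doubleword fetch */
        atomic_flag_clear_explicit(&lock, memory_order_release);
    }
    return NULL;
}

int main(void)
{
    enum { NTHREADS = 8 };      /* raise this to watch system time grow */
    pthread_t threads[NTHREADS];

    for (int t = 0; t < NTHREADS; t++)
        pthread_create(&threads[t], NULL, worker, NULL);
    for (int t = 0; t < NTHREADS; t++)
        pthread_join(threads[t], NULL);
    printf("counter = %ld\n", counter);
    return 0;
}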

> I am wondering whether kernel spinlocks are broken.

I don't think so.  Stefan Kaltenbrunner had one profile where he
showed something like sixty or eighty percent of the usermode CPU time
in s_lock.  I didn't have access to that particular hardware, but the
testing I've done strongly suggests that most of that was the
SInvalReadLock spinlock.  And before I patched pgbench to avoid
calling random(), that was doing the same thing - literally flattening
a 64-core box fighting over a single futex that normally costs almost
nothing.  (That one wasn't quite as bad, because the futex actually
deschedules the waiters, but it was still bad.)  I'm not really sure
why it shakes out this way (birthday paradox?), but having seen the
effect several times now, I'm disinclined to believe it's an
artifact.
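
For reference, the shape of the pgbench fix was just to give each
client its own PRNG state.  Here's a sketch of the pattern (not the
actual patch; thread count, state size, and loop count are
illustrative): glibc's random() serializes all callers on one hidden
lock around shared state, whereas random_r() keeps the state private
to each thread, so there's nothing left to contend on:

/* Compile with: gcc -std=c11 -pthread rand.c */
#define _GNU_SOURCE             /* for random_r()/initstate_r() */
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
#include <stdio.h>
#include <pthread.h>

static void *worker(void *arg)
{
    /* Per-thread PRNG state: no shared lock or futex to fight over. */
    struct random_data rd;
    char statebuf[64];
    int32_t value;

    memset(&rd, 0, sizeof(rd));     /* glibc requires zeroing before
                                     * initstate_r() */
    initstate_r((unsigned) (uintptr_t) arg,
                statebuf, sizeof(statebuf), &rd);

    for (long i = 0; i < 1000000; i++)
        random_r(&rd, &value);      /* touches only this thread's state */
    return NULL;
}

int main(void)
{
    pthread_t threads[8];

    for (long t = 0; t < 8; t++)
        pthread_create(&threads[t], NULL, worker, (void *) t);
    for (long t = 0; t < 8; t++)
        pthread_join(threads[t], NULL);
    puts("done");
    return 0;
}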

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

