Re: CLOG contention - Mailing list pgsql-hackers

From Robert Haas
Subject Re: CLOG contention
Date
Msg-id CA+TgmoaH5Zn2k0=Ug33+k4Yxv5a-3ATjzwHtjhtV2hiSN0XyXg@mail.gmail.com
In response to Re: CLOG contention  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Wed, Dec 21, 2011 at 12:48 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On the other hand, if we just want to avoid having more requests
> simultaneously in flight than we have buffers, so that backends don't
> need to wait for an available buffer before beginning their I/O, then
> something on the order of the number of CPUs in the machine is likely
> sufficient.  I'll do a little more testing and see if I can figure out
> where the tipping point is on this 32-core box.

I recompiled with NUM_CLOG_BUFFERS = 8, 16, 24, 32, 40, 48 and ran
5-minute tests, using unlogged tables to avoid getting killed by
WALInsertLock contention.  With 32 clients on this 32-core box, the
tipping point is somewhere in the neighborhood of 32 buffers.  40
buffers might still be winning over 32, or maybe not, but 48 is
definitely losing.  Below 32, more is better, all the way up.  Here
are the full results:

resultswu.clog16.32.100.300:tps = 19549.454462 (including connections establishing)
resultswu.clog16.32.100.300:tps = 19883.583245 (including connections establishing)
resultswu.clog16.32.100.300:tps = 19984.857186 (including connections establishing)
resultswu.clog24.32.100.300:tps = 20124.147651 (including connections establishing)
resultswu.clog24.32.100.300:tps = 20108.504407 (including connections establishing)
resultswu.clog24.32.100.300:tps = 20303.964120 (including connections establishing)
resultswu.clog32.32.100.300:tps = 20573.873097 (including connections establishing)
resultswu.clog32.32.100.300:tps = 20444.289259 (including connections establishing)
resultswu.clog32.32.100.300:tps = 20234.209965 (including connections establishing)
resultswu.clog40.32.100.300:tps = 21762.222195 (including connections establishing)
resultswu.clog40.32.100.300:tps = 20621.749677 (including connections establishing)
resultswu.clog40.32.100.300:tps = 20290.990673 (including connections establishing)
resultswu.clog48.32.100.300:tps = 19253.424997 (including connections establishing)
resultswu.clog48.32.100.300:tps = 19542.095191 (including connections establishing)
resultswu.clog48.32.100.300:tps = 19284.962036 (including connections establishing)
resultswu.master.32.100.300:tps = 18694.886622 (including connections establishing)
resultswu.master.32.100.300:tps = 18417.647703 (including connections establishing)
resultswu.master.32.100.300:tps = 18331.718955 (including connections establishing)
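
To be clear about what "recompiled with NUM_CLOG_BUFFERS = N" involves:
it's just a matter of editing the compile-time constant and rebuilding.
Roughly like this - the header location and the default of 8 are from
memory, so treat it as a sketch rather than a quote of the tree:

/* src/include/access/clog.h (location and default assumed from memory) */
#define NUM_CLOG_BUFFERS    32      /* default is 8; set to 8, 16, 24, 32,
                                     * 40, or 48 for the builds tested above,
                                     * then rebuild and reinstall */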


Parameters in use: shared_buffers = 8GB, maintenance_work_mem = 1GB,
synchronous_commit = off, checkpoint_segments = 300,
checkpoint_timeout = 15min, checkpoint_completion_target = 0.9,
wal_writer_delay = 20ms

It isn't clear to me whether we can extrapolate anything more general
from this.  It'd be awfully interesting to repeat this experiment on,
say, an 8-core server, but I don't have one of those I can use at the
moment.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

