Re: Reduce ProcArrayLock contention - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Reduce ProcArrayLock contention
Date
Msg-id CAA4eK1LH2OgmX4ZaC6Zn8yhm4_evUD8vkSsZxrH0igsAOdWd1Q@mail.gmail.com
In response to Re: Reduce ProcArrayLock contention  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Tue, Aug 25, 2015 at 5:21 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Aug 20, 2015 at 3:49 PM, Andres Freund <andres@anarazel.de> wrote:
> > How hard did you try checking whether this causes regressions? This
> > increases the number of atomics in the commit path a fair bit. I doubt
> > it's really bad, but it seems like a good idea to benchmark something
> > like a single full-throttle writer and a large number of readers.
>
> One way to test this is run pgbench read load (with 100 client count) and
> write load (tpc-b - with one client) simultaneously and check the results.
> I have tried this and there is lot of variation(more than 50%) in tps in
> different runs  of write load, so not sure if this is the right way to
> benchmark it.
>
> Another possible way is to hack pgbench code and make one thread run
> write transaction and others run read transactions.

I have hacked pgbench to run a single-writer, multi-reader test; the
results are below:

M/c Configuration
-----------------------------
IBM POWER-8 24 cores, 192 hardware threads
RAM = 492GB

Non-default parameters
------------------------------------
max_connections = 150
shared_buffers = 8GB
min_wal_size = 10GB
max_wal_size = 15GB
checkpoint_timeout = 30min
maintenance_work_mem = 1GB
checkpoint_completion_target = 0.9
wal_buffers = 256MB

Data is for three 15-minute pgbench test runs (1 writer, 127 readers)

Without ProcArrayLock optimization
Commitid - 253de7e1

Client Count / No. of Runs (tps)    128
Run-1                               208011
Run-2                               471598
Run-3                               218295


With ProcArrayLock optimization
Commitid - 0e141c0f

Client Count / No. of Runs (tps)    128
Run-1                               222839
Run-2                               469483
Run-3                               215791


It seems the test runs are dominated by the I/O generated by the writer
client, which leads to variation in the performance numbers. In general, I
don't see any noticeable difference in performance with or without the
ProcArrayLock optimization.  I have even tried turning off
synchronous_commit and fsync, but the results are quite similar.
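
A minimal sketch of that comparison, using only the six tps figures posted above. The helper names `pct_change` and `mean` are introduced here for illustration; this is plain arithmetic on the posted numbers, not an additional measurement.

```python
# tps figures copied from the two result tables above (128 clients).
without_opt = [208011, 471598, 218295]  # commit 253de7e1
with_opt = [222839, 469483, 215791]     # commit 0e141c0f


def pct_change(before, after):
    """Percentage change from before to after (hypothetical helper)."""
    return (after - before) / before * 100.0


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


if __name__ == "__main__":
    # Difference between the two builds, run by run.
    for run, (b, a) in enumerate(zip(without_opt, with_opt), start=1):
        print(f"Run-{run}: {pct_change(b, a):+.2f}%")
    print(f"Mean of 3 runs: {pct_change(mean(without_opt), mean(with_opt)):+.2f}%")
    # Run-to-run spread within one build, as a max/min ratio.
    print(f"Spread without: {max(without_opt) / min(without_opt):.2f}x")
    print(f"Spread with:    {max(with_opt) / min(with_opt):.2f}x")
```

The spread between runs of the same build (more than 2x) is far larger than the difference between the two builds in any single run, which is consistent with not seeing a noticeable difference.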

pgbench modifications
-----------------------------------
Introduced a new type of test run with a -W option, which means single
writer and multi-reader. For example, if the user has given 128 clients and
128 threads, it will use 1 thread for the Write (Update) transaction and 127
for the Select Only transaction.  This works specifically for this use case,
as I had no intention of making a generic test.  Please note, it will work
properly only if the number of clients and threads input by the user are the
same.  Please find attached the pgbench patch I have used for this test.
Note that I have used the -W option in the pgbench run, as mentioned in the
steps below.
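
The split described above can be sketched as follows. This is a hypothetical illustration in Python, not the attached patch (which modifies pgbench's C code); the function name `assign_roles`, the role labels, and the choice of thread 0 as the writer are all assumptions.

```python
def assign_roles(n_clients, n_threads):
    """Return one role per thread: a single writer, the rest readers.

    Hypothetical sketch; it mirrors the stated restriction that the
    number of clients and threads must be the same.
    """
    if n_clients != n_threads:
        raise ValueError("this mode assumes clients == threads")
    if n_threads < 1:
        raise ValueError("need at least one thread")
    # Which thread the real patch picks as the writer is not stated in
    # this mail, so using index 0 is an assumption.
    return ["write" if i == 0 else "select-only" for i in range(n_threads)]


if __name__ == "__main__":
    roles = assign_roles(128, 128)
    print(roles.count("write"), roles.count("select-only"))
```

With 128 clients and 128 threads this yields 1 writer and 127 readers, matching the example in the text.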

Test steps for each Run
--------------------------------------------------------------------------------------------------------
1. Start Server
2. dropdb postgres
3. createdb postgres
4. pgbench -i -s 300 postgres
5. pgbench -c $threads -j $threads -T 1800 -M prepared -W postgres
6. checkpoint
7. Stop Server
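
The steps above could be assembled into command lines roughly as below. This is a sketch under assumptions: the mail does not say how the server was started or stopped or how the checkpoint was issued, so `pg_ctl` (with PGDATA set) and `psql -c CHECKPOINT` are guesses, and `build_steps` is a hypothetical helper. The -W option exists only in the patched pgbench. Nothing is executed; the commands are only printed.

```python
def build_steps(threads, dbname="postgres", scale=300, duration=1800):
    """Return the per-run commands as argv lists (nothing is executed)."""
    n = str(threads)
    return [
        # 1. Assumption: pg_ctl is used and PGDATA points at the cluster.
        ["pg_ctl", "start", "-w"],
        ["dropdb", dbname],                           # 2.
        ["createdb", dbname],                         # 3.
        ["pgbench", "-i", "-s", str(scale), dbname],  # 4.
        # 5. -W is provided only by the patched pgbench, not stock pgbench.
        ["pgbench", "-c", n, "-j", n, "-T", str(duration),
         "-M", "prepared", "-W", dbname],
        # 6. Assumption: the checkpoint is issued through psql.
        ["psql", "-d", dbname, "-c", "CHECKPOINT"],
        ["pg_ctl", "stop"],                           # 7.
    ]


if __name__ == "__main__":
    for step, argv in enumerate(build_steps(128), start=1):
        print(f"{step}. {' '.join(argv)}")
```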



With Regards,
Amit Kapila.
