Re: LWLock contention: I think I understand the problem - Mailing list pgsql-hackers

From: Tom Lane
Subject: Re: LWLock contention: I think I understand the problem
Msg-id: 7758.1010186039@sss.pgh.pa.us
In response to: Re: LWLock contention: I think I understand the problem  (Tatsuo Ishii <t-ishii@sra.co.jp>)
Responses: Re: LWLock contention: I think I understand the problem  (Bruce Momjian <pgman@candle.pha.pa.us>)
           Re: LWLock contention: I think I understand the problem  (Tatsuo Ishii <t-ishii@sra.co.jp>)
List: pgsql-hackers

I have gotten my hands on a Linux 4-way SMP box (courtesy of my new
employer Red Hat), and have obtained pgbench results that look much
more promising than Tatsuo's.  It seems the question is not so much
"why is 7.2 bad?" as "why is it bad on AIX?"

The test machine has four 550MHz Pentium III CPUs, 5GB RAM, and a passel
of SCSI disks hanging off ultra-wide controllers.  It's presently
running Red Hat 7.1 enterprise release, kernel version 2.4.2-2enterprise
#1 SMP.  (Not the latest thing, but perhaps representative of what
people are running in production situations.  I can get it rebooted with
other kernel versions if anyone thinks the results will be interesting.)

For the tests, the postmasters were started with parameters
    postmaster -F -N 100 -B 3800
(the -B setting chosen to fit within 32MB, which is the shmmax setting
on stock Linux).  -F is not very representative of production use,
but I thought it was appropriate since we are trying to measure CPU
effects, not disk I/O.  pgbench scale factor is 50; xacts/client varied
so that each run executes about 10000 transactions, per this script:

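#!/bin/sh

# Run pgbench at several client counts, holding the total work per run
# at about 10000 transactions.
DB=bench
totxacts=10000

for c in 1 2 3 4 5 6 10 25 50 100
do
        # transactions per client (integer division, so 3 and 6 clients
        # come out slightly under 10000 in total)
        t=`expr $totxacts / $c`
        # start every run from a vacuumed, checkpointed database
        psql -c 'vacuum' "$DB"
        psql -c 'checkpoint' "$DB"
        echo "===== sync ======" 1>&2
        # flush dirty kernel buffers and let the disks settle
        sync;sync;sync;sleep 10
        echo "$c concurrent users..." 1>&2
        # -n: skip pgbench's own vacuum, already done above
        pgbench -n -t "$t" -c "$c" "$DB"
done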
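
For anyone wanting to check the parameters above, here is a rough sketch
of the arithmetic behind them.  It is not part of the benchmark.  The
8192-byte block size is an assumption (the default PostgreSQL BLCKSZ),
and the function names buffer_bytes and xacts_per_client are made up for
illustration.  It shows how much of the 32MB segment the -B 3800 buffer
pool would occupy, and how many transactions each client count really
executes given the integer division in the script.

```sh
#!/bin/sh
# Sketch only: reproduces the arithmetic behind the benchmark parameters.
# BLCKSZ=8192 is an assumption (the default PostgreSQL block size).

BLCKSZ=8192
NBUFFERS=3800                   # the -B value from the postmaster command line
SHMMAX=$((32 * 1024 * 1024))    # 32MB, the limit quoted in the message

buffer_bytes() {
    # $1 = number of buffers, $2 = block size in bytes
    echo $(( $1 * $2 ))
}

xacts_per_client() {
    # $1 = target total transactions, $2 = number of clients
    # Integer division, same result as `expr $totxacts / $c`.
    echo $(( $1 / $2 ))
}

echo "buffer pool: $(buffer_bytes $NBUFFERS $BLCKSZ) bytes of $SHMMAX"

for c in 1 2 3 4 5 6 10 25 50 100
do
    t=$(xacts_per_client 10000 $c)
    echo "$c clients x $t xacts = $(( c * t )) total"
done
```

Under that block-size assumption the buffers alone take 31129600 bytes,
leaving a bit over 2MB of the segment for the rest of shared memory.
The 3-client and 6-client runs execute 9999 and 9996 transactions
respectively; every other client count divides 10000 evenly.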

The results are shown in the attached plot.  Interesting, hmm?
The "sweet spot" at 3 processes might be explained by assuming that
pgbench itself chews up the fourth CPU.

This still leaves me undecided whether to apply the first or second
version of the LWLock patch.

            regards, tom lane


Attachment
