Re: disk performance benchmarks - Mailing list pgsql-general

From Jeffrey W. Baker
Subject Re: disk performance benchmarks
Date
Msg-id 1095185498.5792.12.camel@localhost.localdomain
In response to Re: disk performance benchmarks  (Vivek Khera <khera@kcilink.com>)
List pgsql-general
On Tue, 2004-09-14 at 10:28, Vivek Khera wrote:
> >>>>> "SW" == Shane Wright <Shane> writes:
>
> SW> But, we have now taken the plunge and I'm in a position to do some
> SW> benchmarking to actually get some data.  Basically I was wondering if
> SW> anyone else had any particular recommendations (or requests) about the
> SW> most useful kinds of benchmarks to do.
>
> I did a bunch of benchmarking on a 14 disk SCSI RAID array comparing
> RAID 5, 10, and 50.  My tests consisted of doing a full restore of a
> 30Gb database (including indexes) and comparing the times to do the
> restore, the time to make the indexes, and the time to vacuum.  Then I
> ran a bunch of queries.
>
> It was damn near impossible to pick a 'better' RAID config, so I just
> went with RAID5.
>
> You can find many of my posts on this topic on the list archives from
> about august - october of last year.
>
> Basically, you have to approach it holistically to tune the system: Pg
> config parameters, memory, and disk speed are the major factors.
>
> That and your schema needs to be not idiotic. :-)
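
As an aside, for anyone repeating that kind of test: the timing runs boil
down to something like the sketch below.  The database name and dump path
are made up, and the exact commands depend on which pg_dump format was used.

    # restore the dump (schema, data, and index builds) and time it
    time pg_restore -d bench /path/to/bench.dump

    # then time a standalone vacuum pass
    time psql -d bench -c 'VACUUM ANALYZE'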

I've recently been frustrated by this topic, because it seems like you
can design the hell out of a system, getting everything tuned with micro
and macro benchmarks, but when you put it in production the thing falls
apart.

Current issue:

A dual 64-bit Opteron 244 machine with 8GB main memory, two 4-disk RAID5
arrays (one for database, one for xlogs).  PG's config is extremely
generous, and in isolated benchmarks it's very fast.
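
For concreteness, "generous" here means roughly the kind of settings below.
These numbers are illustrative (7.4-era parameter names, sized for 8GB of
RAM), not a copy of the actual postgresql.conf:

    shared_buffers = 50000          # 8kB pages, ~400MB of shared buffer cache
    sort_mem = 16384                # kB per sort
    effective_cache_size = 750000   # 8kB pages, ~6GB of expected OS cache
    wal_buffers = 64
    checkpoint_segments = 32
    checkpoint_timeout = 300        # seconds
    fsync = true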

But, in reality, performance is abysmal.  There's something about what
PG does inside commits and checkpoints that sends Linux into a catatonic
state.  For instance, here's a snapshot of vmstat during a parallel heavy
select/insert load:

procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 3  0    216  13852  39656 7739724    0    0   820  2664 2868  2557 16  2 74  7
 0  0    216  17580  39656 7736460    0    0  3024  4700 3458  4313 42  6 52  0
 0  0    216  16428  39676 7737324    0    0   840  4248 3930  4516  0  4 89  8
 0  1    216  18620  39672 7736920    0    0  7576   516 2738  3347  1  4 55 39
 0  0    216  14972  39672 7738960    0    0  1992  2532 2509  2288  2  3 93  3
 0  0    216  13564  39672 7740592    0    0  1640  2656 2581  2066  1  3 97  0
 0  0    216  12028  39672 7742292    0    0  1688  3576 2072  1626  1  2 96  0
 0  0    216  18364  39680 7736164    0    0  1804  3372 1836  1379  1  4 96  0
 0  0    216  16828  39684 7737588    0    0  1432  2756 2256  1720  1  3 94  2
 0  0    216  15452  39684 7738812    0    0  1188  2184 2384  1830  1  2 97  0
 0  1    216  15388  39684 7740104    0    0  1336  2628 2490  1974  2  3 94  2
 6  0    216  15424  39684 7740240    0    0   104  3472 2757  1940  3  2 92  2
 0  0    216  14784  39700 7741856    0    0  1668  3320 2718  2332  0  3 97  0

You can see there's not much progress being made there.  In the
presence of a fairly pathetic writeout, there's a tiny trickle of disk
reads, userspace isn't making any progress, the kernel isn't busy, and
few processes are in iowait.  So what the heck is going on?
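
If anyone wants to poke at the same thing, strace on the backend doing the
checkpoint should show where the time goes; something like this, with <PID>
being whichever postgres process is running the checkpoint:

    strace -T -p <PID> -e trace=write,fsync,fdatasync

-T prints the time spent in each syscall, so a checkpoint stalled in fsync
ought to stand out.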

This state of non-progress persists as long as the checkpoint subprocess
is active.  I'm sure there's some magic way to improve this, but I
haven't found it yet.
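
The knobs I intend to experiment with (no promises they're the right ones)
are checkpoint spacing on the PG side and the dirty-page writeback
thresholds on the kernel side, along these lines:

    # postgresql.conf: spread checkpoints further apart
    checkpoint_segments = 64
    checkpoint_timeout = 900

    # Linux 2.6: start background writeback sooner so the checkpoint's
    # fsync has less of a backlog to flush (values are guesses)
    echo 2  > /proc/sys/vm/dirty_background_ratio
    echo 10 > /proc/sys/vm/dirty_ratio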

PS this is with Linux 2.6.7.

Regards,
jwb
