Re: SAN performance mystery - Mailing list pgsql-performance

From Tim Allen
Subject Re: SAN performance mystery
Date
Msg-id 449677EB.50609@proximity.com.au
Whole thread Raw
In response to Re: SAN performance mystery  (Scott Marlowe <smarlowe@g2switchworks.com>)
Responses Re: SAN performance mystery  (Michael Stone <mstone+postgres@mathom.us>)
Re: SAN performance mystery  (Stephen Frost <sfrost@snowman.net>)
List pgsql-performance
Scott Marlowe wrote:
> On Thu, 2006-06-15 at 16:50, Tim Allen wrote:
>
>>We have a customer who are having performance problems. They have a
>>large (36G+) postgres 8.1.3 database installed on an 8-way opteron with
>>8G RAM, attached to an EMC SAN via fibre-channel (I don't have details
>>of the EMC SAN model, or the type of fibre-channel card at the moment).
>>They're running RedHat ES3 (which means a 2.4.something Linux kernel).
>>
>>They are unhappy about their query performance. We've been doing various
>>things to try to work out what we can do. One thing that has been
>>apparent is that autovacuum has not been able to keep the database
>>sufficiently tamed. A pg_dump/pg_restore cycle reduced the total
>>database size from 81G to 36G. Performing the restore took about 23 hours.
>
> Do you have the ability to do any simple IO performance testing, like
> with bonnie++ (the old bonnie is not really capable of properly testing
> modern equipment, but bonnie++ will give you some idea of the throughput
> of the SAN)  Or even just timing a dd write to the SAN?

I've done some timed dd's. The timing results vary quite a bit, but it
seems you can write to the SAN at about 20MB/s and read from it at about
  12MB/s. Not an entirely scientific test, as I wasn't able to stop
other activity on the machine, though I don't think much else was
happening. Certainly not impressive figures, compared with our machine
with the SATA disk (referred to below), which can get 161MB/s copying
files on the same disk, and 48MB/s and 138Mb/s copying files from the
sata disk respectively to and from a RAID5 array.

The customer is a large organisation, with a large IT department who
guard their turf carefully, so there is no way I could get away with
installing any heavier duty testing tools like bonnie++ on their machine.

>>We tried restoring the pg_dump output to one of our machines, a
>>dual-core pentium D with a single SATA disk, no raid, I forget how much
>>RAM but definitely much less than 8G. The restore took five hours. So it
>>would seem that our machine, which on paper should be far less
>>impressive than the customer's box, does more than four times the I/O
>>performance.
>>
>>To simplify greatly - single local SATA disk beats EMC SAN by factor of
>>four.
>>
>>Is that expected performance, anyone? It doesn't sound right to me. Does
>>anyone have any clues about what might be going on? Buggy kernel
>>drivers? Buggy kernel, come to think of it? Does a SAN just not provide
>>adequate performance for a large database?

> Yes, this is not uncommon.  It is very likely that your SATA disk is
> lying about fsync.

I guess a sustained write will flood the disk's cache and negate the
effect of the write-completion dishonesty. But I have no idea how large
a copy would have to be to do that - can anyone suggest a figure?
Certainly, the read performance of the SATA disk still beats the SAN,
and there is no way to lie about read performance.

> What kind of backup are you using?  insert statements or copy
> statements?  If insert statements, then the difference is quite
> believable.  If copy statements, less so.

A binary pg_dump, which amounts to copy statements, if I'm not mistaken.

> Next time, on their big server, see if you can try a restore with fsync
> turned off and see if that makes the restore faster.  Note you should
> turn fsync back on after the restore, as running without it is quite
> dangerous should you suffer a power outage.
>
> How are you mounting to the EMC SAN?  NFS, iSCSI? Other?

iSCSI, I believe. Some variant of SCSI, anyway, of that I'm certain.

The conclusion I'm drawing here is that this SAN does not perform at all
well, and is not a good database platform. It's sounding from replies
from other people that this might be a general property of SAN's, or at
least the ones that are not stratospherically priced.

Tim

--
-----------------------------------------------
Tim Allen          tim@proximity.com.au
Proximity Pty Ltd  http://www.proximity.com.au/

pgsql-performance by date:

Previous
From: Tim Allen
Date:
Subject: Re: SAN performance mystery
Next
From: Michael Stone
Date:
Subject: Re: SAN performance mystery