Re: ATA disks and RAID controllers for database servers - Mailing list pgsql-general

From Mark Kirkwood
Subject Re: ATA disks and RAID controllers for database servers
Date
Msg-id 3FB57C5C.4070203@paradise.net.nz
In response to ATA disks and RAID controllers for database servers  (Mark Kirkwood <markir@paradise.net.nz>)
Responses Re: ATA disks and RAID controllers for database servers
List pgsql-general
Dear all,

Here is the second installment concerning ATA disks and RAID controller
use in a database server.

In this post a 2 disk RAID0 configuration is tested, and the results
compared to the JBOD configuration in the previous message.

So again, what I was attempting to examine here was: is it feasible to
build a reasonably well performing database server using ATA disks? (In
particular, would disabling the ATA write cache spoil performance
completely?)

The System
----------

Dell 410
2x700MHz PIII 512MB
Promise FastTrak TX2000 Controller
2x40G 7200RPM ATA-133 Maxtor DiamondMax Plus 8 configured as JBOD or
2x40G 7200RPM ATA-133 Maxtor DiamondMax Plus 8 configured as RAID0
FreeBSD 4.8 (options SMP APIC_IO i686)
PostgreSQL 7.4beta2 (-O2 -funroll-loops -fexpensive-optimizations
-march=i686)
ATA Write caching controlled via the loader.conf variable hw.ata.wc (1 = on)


The Tests
---------

1. Sequential and random writes and reads of a file twice the size of memory

Files were read and written using the read(2) and write(2) functions,
with an 8K buffer. For the random case, 10% of the file was sampled
using lseek(2) and then read or written. (see
http://techdocs.postgresql.org/markir/download/iowork/iowork-1.0.tar.gz)
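As a rough illustration of the random read case, here is a minimal
Python sketch. It is not the iowork code itself; the function name, the
fixed seed and the choice to sample on 8K boundaries are my own
assumptions.

```python
import os
import random

BLOCK = 8192  # 8K buffer, as in the test description


def random_read(path, fraction=0.10, seed=0):
    """Read a random sample of 8K blocks from `path`; return bytes read.

    Assumption: samples are aligned on 8K boundaries and may repeat.
    """
    rng = random.Random(seed)
    nblocks = os.path.getsize(path) // BLOCK
    nsamples = int(nblocks * fraction)
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        for _ in range(nsamples):
            block = rng.randrange(nblocks)
            os.lseek(fd, block * BLOCK, os.SEEK_SET)  # seek to the sampled block
            total += len(os.read(fd, BLOCK))          # read one 8K buffer
    finally:
        os.close(fd)
    return total
```

Dividing the returned byte count by the elapsed time gives a throughput
figure comparable in spirit to the ones reported below. A file much
larger than memory is needed to keep the OS cache from hiding the disk.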

The filesystem was built with newfs options :
 -U -b 32768 -f 4096 [ softupdates, 32K blocks, 4K fragments ]

The RAID0 stripe size was 128K. This gave the best performance of the
sizes tried (32K and 64K were also tested - I got tired of rebuilding
the system at this point, so 256K and above may be better).
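To make the stripe size concrete, here is a small sketch of how a
logical offset maps onto the two disks under textbook RAID0 striping.
I am assuming the Promise controller uses this conventional layout; the
function name is mine.

```python
STRIPE = 128 * 1024  # 128K stripe size from the test setup
NDISKS = 2           # two-disk RAID0


def locate(offset, stripe=STRIPE, ndisks=NDISKS):
    """Map a logical byte offset to (disk index, byte offset on that disk).

    Assumes conventional round-robin striping across the disks.
    """
    stripe_no = offset // stripe
    disk = stripe_no % ndisks
    disk_offset = (stripe_no // ndisks) * stripe + offset % stripe
    return disk, disk_offset
```

With a 128K stripe, each 32K filesystem block sits entirely on one disk,
and a long sequential transfer alternates between the disks every 128K.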


2. PostgreSQL pgbench benchmark program

This was run using the options :
 -t 1000       [ 1000 transactions ]
 -s 10         [ scale factor 10 ]
 -c 1,2,4,8,16 [ 1-16 clients ]

Non-default postgresql.conf settings were:
 shared_buffers = 5000
 wal_buffers = 100
 checkpoint_segments = 10
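
For a sense of scale, the sketch below converts these settings to bytes.
It assumes a default build, i.e. 8K pages for shared_buffers and
wal_buffers, and 16MB WAL segments for checkpoint_segments.

```python
BLOCK_SIZE = 8192                # assumed default page size (8K)
WAL_SEGMENT = 16 * 1024 * 1024   # assumed default WAL segment size (16MB)

SETTINGS = {"shared_buffers": 5000, "wal_buffers": 100, "checkpoint_segments": 10}


def footprint(settings):
    """Convert the page/segment counts into byte sizes."""
    return {
        "shared_buffers_bytes": settings["shared_buffers"] * BLOCK_SIZE,
        "wal_buffers_bytes": settings["wal_buffers"] * BLOCK_SIZE,
        "checkpoint_wal_bytes": settings["checkpoint_segments"] * WAL_SEGMENT,
    }


if __name__ == "__main__":
    for name, size in footprint(SETTINGS).items():
        print(f"{name}: {size / (1024 * 1024):.1f} MB")
```

Under those assumptions this is roughly 39MB of shared buffers on the
512MB machine, 800K of WAL buffers, and about 160MB of WAL between
checkpoints.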


A checkpoint was forced after each run to prevent cross-run
interference. Three passes were performed for each configuration, and
the results averaged. A new database was created for each 1-16 client
"set" of runs.
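
The procedure can be sketched as a command plan like the one below. This
is one reading of the description, not the script actually used: the
database name "bench" is hypothetical, and I am assuming the scale
factor is applied at initialisation with pgbench -i. The sketch only
builds the command lists; it does not run anything.

```python
CLIENTS = [1, 2, 4, 8, 16]
PASSES = 3
DB = "bench"  # hypothetical database name


def plan():
    """Build the list of commands for three passes over the client set."""
    cmds = []
    for _ in range(PASSES):
        cmds.append(["createdb", DB])                    # fresh database per set
        cmds.append(["pgbench", "-i", "-s", "10", DB])   # initialise at scale 10
        for c in CLIENTS:
            cmds.append(["pgbench", "-c", str(c), "-t", "1000", DB])
            cmds.append(["psql", "-c", "CHECKPOINT", DB])  # force a checkpoint
        cmds.append(["dropdb", DB])
    return cmds


def average(tps_values):
    """Average the tps figures collected for one client count."""
    return sum(tps_values) / len(tps_values)
```

Each command list could be handed to subprocess.run, with the tps figure
parsed from the pgbench output and passed to average() per client count.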


Results
-------

Test 1

System IO Operation Throughput(MB/s) Options
------------------------------------------------
Dell
JBOD  seq write     11               hw.ata.wc=0
      seq read      50               hw.ata.wc=0
      random write  1.3              hw.ata.wc=0
      random read   4.2              hw.ata.wc=0

      seq write     20               hw.ata.wc=1
      seq read      53               hw.ata.wc=1
      random write  1.7              hw.ata.wc=1
      random read   4.1              hw.ata.wc=1

RAID0 seq write     13               hw.ata.wc=0
      seq read      100              hw.ata.wc=0
      random write  1.7              hw.ata.wc=0
      random read   4.2              hw.ata.wc=0

      seq write     38               hw.ata.wc=1
      seq read      100              hw.ata.wc=1
      random write  2.5              hw.ata.wc=1
      random read   4.3              hw.ata.wc=1

Test 2

System Clients      Throughput(tps)  Options
------------------------------------------------
Dell
JBOD  1             27               hw.ata.wc=0
      2             38               hw.ata.wc=0
      4             55               hw.ata.wc=0
      8             58               hw.ata.wc=0
      16            66               hw.ata.wc=0

      1             82               hw.ata.wc=1
      2             137              hw.ata.wc=1
      4             166              hw.ata.wc=1
      8             128              hw.ata.wc=1
      16            117              hw.ata.wc=1

RAID0 1             33               hw.ata.wc=0
      2             39               hw.ata.wc=0
      4             61               hw.ata.wc=0
      8             73               hw.ata.wc=0
      16            80               hw.ata.wc=0

      1             95               hw.ata.wc=1
      2             156              hw.ata.wc=1
      4             194              hw.ata.wc=1
      8             179              hw.ata.wc=1
      16            144              hw.ata.wc=1


Conclusions
-----------

Test 1

It is clear that, with write caching on, the RAID0 configuration greatly
improves sequential read and write performance - almost twice as fast as
the JBOD case. The random write performance is improved by a reasonable
factor too.

With write caching disabled, the RAID0 write rates are similar to the
JBOD case. This *may* indicate some design issue in the Promise
controller.
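
The ratios behind these statements can be checked with a short sketch
like this one, using the throughput figures from the Test 1 table (the
dictionary layout and function name are just for illustration):

```python
# Throughput figures copied from the Test 1 table, as (JBOD, RAID0) pairs,
# keyed by the hw.ata.wc setting.
TEST1 = {
    0: {"seq write": (11, 13), "seq read": (50, 100),
        "random write": (1.3, 1.7), "random read": (4.2, 4.2)},
    1: {"seq write": (20, 38), "seq read": (53, 100),
        "random write": (1.7, 2.5), "random read": (4.1, 4.3)},
}


def speedups(wc):
    """Return the RAID0/JBOD throughput ratio per operation for one cache setting."""
    return {op: raid0 / jbod for op, (jbod, raid0) in TEST1[wc].items()}


if __name__ == "__main__":
    for wc in (0, 1):
        for op, ratio in speedups(wc).items():
            print(f"hw.ata.wc={wc} {op:13s} {ratio:.2f}x")
```

Note that sequential reads double with caching off as well; it is only
the writes that stay close to the JBOD figures.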


Test 2

With write caching on or off, the RAID0 configuration is faster - by
about 18 percent on average.
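
One way to arrive at a figure like this is to average the per-client
RAID0/JBOD ratios from the Test 2 table, as in the sketch below. The
averaging method is my assumption; averaging totals instead gives a
slightly different number.

```python
# tps figures copied from the Test 2 table, for 1, 2, 4, 8 and 16 clients,
# keyed by the hw.ata.wc setting.
TPS = {
    0: {"JBOD": [27, 38, 55, 58, 66], "RAID0": [33, 39, 61, 73, 80]},
    1: {"JBOD": [82, 137, 166, 128, 117], "RAID0": [95, 156, 194, 179, 144]},
}


def mean_speedup(wc_values=(0, 1)):
    """Mean of the per-client RAID0/JBOD ratios over the given cache settings."""
    ratios = [
        raid0 / jbod
        for wc in wc_values
        for jbod, raid0 in zip(TPS[wc]["JBOD"], TPS[wc]["RAID0"])
    ]
    return sum(ratios) / len(ratios)


if __name__ == "__main__":
    print(f"caching off: {mean_speedup((0,)):.2f}x")
    print(f"caching on:  {mean_speedup((1,)):.2f}x")
    print(f"overall:     {mean_speedup():.2f}x")
```

The overall mean comes out a little under 1.2, which is in line with the
rough figure quoted above.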


General

Clearly it is possible to obtain very good performance with write
caching on using RAID0, and if you have a UPS together with good backup
practice then this could be the way to go.

With caching off there is a considerable decrease in performance;
however, this performance may be "good enough" if viewed in a
cost-benefit-safety manner.
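
To put a rough number on the decrease, the sketch below compares the
best tps figure with caching off against the best with caching on, for
each disk configuration. Using the peak across client counts is my own
choice of summary.

```python
# Best tps over all client counts, from the Test 2 table,
# keyed by configuration and then by the hw.ata.wc setting.
PEAK_TPS = {
    "JBOD": {0: 66, 1: 166},
    "RAID0": {0: 80, 1: 194},
}


def retained(config):
    """Fraction of peak throughput kept when write caching is turned off."""
    return PEAK_TPS[config][0] / PEAK_TPS[config][1]


if __name__ == "__main__":
    for config in PEAK_TPS:
        print(f"{config}: {retained(config):.0%} of cached peak tps")
```

By this measure, turning the write cache off keeps around 40 percent of
the peak pgbench throughput on either configuration.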


Criticisms
----------

It would have been good to have two SCSI disks to test in the Dell
machine (as opposed to using a Sun 280R), but unfortunately I can't
justify the cost of them for this test :-(. However, there are some
examples of similar comparisons in the pgsql-general thread
"Recomended FS" (without an ATA RAID controller).


Mark



