Re: linux distro for better pg performance - Mailing list pgsql-performance

From Aaron Werman
Subject Re: linux distro for better pg performance
Date
Msg-id BAY18-DAV5o8C7wyfwj0000292d@hotmail.com
In response to linux distro for better pg performance  (pginfo <pginfo@t1.unisoftbg.com>)
List pgsql-performance
The comparison is actually dead on. If you have lots of write-back / read-ahead
cache, RAID 5 can run very quickly, until the write rate overwhelms
the cache, at which point the cost of 4 I/Os per write / 2 per read stalls it.
This means that RAID 5 works, except when stressed, which is a bad paradigm.

If you do streaming sequential writes on a 4-drive RAID 5, 4 writes
become:

- read drive 1 for data
- read drive 3 for parity
- write changes to drive 1
- write changes to drive 3

- read drive 2 for data
- read drive 4 for parity
- write changes to drive 2
- write changes to drive 4

- read drive 3 for data
- read drive 1 for parity
- write changes to drive 3
- write changes to drive 1

- read drive 4 for data
- read drive 2 for parity
- write changes to drive 4
- write changes to drive 2

or

drive 1: 2 reads, 2 writes
drive 2: 2 reads, 2 writes
drive 3: 2 reads, 2 writes
drive 4: 2 reads, 2 writes

in other words, 16 I/Os, evenly distributed. These have to be ordered to be
recoverable (otherwise the parity scheme is broken and you can't recover),
and thus are quasi-synchronous.
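
The tally above can be reproduced with a short sketch. The (data drive, parity drive) pairs are copied from the listing in this mail; they are illustrative and not any particular controller's parity rotation, and the function name is made up for this example.

```python
from collections import Counter

# (data drive, parity drive) for each of the 4 logical writes,
# copied from the listing in this mail; not a real controller's layout.
WRITES = [(1, 3), (2, 4), (3, 1), (4, 2)]

def raid5_small_write_ios(writes):
    """Tally physical reads and writes per drive for read-modify-write updates."""
    reads, writes_out = Counter(), Counter()
    for data_drive, parity_drive in writes:
        # Each logical write reads old data and old parity,
        # then writes new data and new parity.
        for drive in (data_drive, parity_drive):
            reads[drive] += 1
            writes_out[drive] += 1
    return reads, writes_out

reads, writes_out = raid5_small_write_ios(WRITES)
for drive in sorted(reads):
    print(f"drive {drive}: {reads[drive]} reads, {writes_out[drive]} writes")

total_ios = sum(reads.values()) + sum(writes_out.values())
print("total I/Os:", total_ios)
```

This only counts I/Os; it says nothing about ordering or latency, which is where the quasi-synchronous behaviour hurts.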

The same on RAID 10 is

- write changes to drive 1
- write copy of changes to drive 2
- write changes to drive 1
- write copy of changes to drive 2
- write changes to drive 1
- write copy of changes to drive 2
- write changes to drive 1
- write copy of changes to drive 2

or

drive 1: 4 I/Os
drive 2: 4 I/Os

in other words, 4 I/Os per drive, in parallel. There is no wait on streaming
I/O on RAID 10, and this is the other main reason RAID 10 gives an order of
magnitude better performance.
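
For comparison, here is the same kind of count for the mirrored case, as a rough sketch. It assumes the single two-drive mirror used in the listing above; the function name and the per-write ratio are illustrative additions, and the figure of 16 is the RAID 5 total stated earlier in this mail.

```python
from collections import Counter

def mirrored_write_ios(n_writes, mirror=(1, 2)):
    """Tally physical writes per drive when every logical write goes to both mirrors."""
    ios = Counter()
    for _ in range(n_writes):
        for drive in mirror:
            ios[drive] += 1  # no read is needed before writing
    return ios

mirror_ios = mirrored_write_ios(4)
for drive in sorted(mirror_ios):
    print(f"drive {drive}: {mirror_ios[drive]} I/Os")

# Physical I/Os per logical write: the mirror versus the RAID 5 total of 16.
raid10_per_write = sum(mirror_ios.values()) / 4
raid5_per_write = 16 / 4
print("RAID 10:", raid10_per_write, "RAID 5:", raid5_per_write)
```

The raw I/O ratio here is only 2 to 1; the larger gap described in this mail comes from the RAID 5 reads having to complete before the dependent writes can be issued.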

If you are writing full blocks in a streaming mode, RAID 3 will be the
fastest - it is RAID 0 with a parity drive. In every situation I've seen it,
RAID 5 was either generally slow or got applications into trouble during
stress: bulk loads, etc. Most DBAs end up on RAID 10 for its predictability
and performance.
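
To illustrate why full-block streaming writes are cheap for a parity scheme while small updates are not, here is a minimal sketch of XOR parity. The block contents and function names are invented for the example; real arrays work on much larger blocks and do this in the controller or driver.

```python
from functools import reduce

def xor_blocks(*blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def full_stripe_parity(data_blocks):
    # Full-stripe write: parity comes from the new data alone, no reads needed.
    return xor_blocks(*data_blocks)

def small_write_parity(old_parity, old_data, new_data):
    # Partial write: old data and old parity must be read first.
    return xor_blocks(old_parity, old_data, new_data)

stripe = [b"\x01\x02", b"\x10\x20", b"\xaa\xbb"]  # made-up data blocks
parity = full_stripe_parity(stripe)

new_block = b"\xff\x00"
updated_parity = small_write_parity(parity, stripe[1], new_block)
stripe[1] = new_block

# Both routes agree on the parity; they differ in how many reads they cost.
print(updated_parity == full_stripe_parity(stripe))

# A lost block can be rebuilt from the parity and the surviving blocks.
rebuilt = xor_blocks(updated_parity, stripe[0], stripe[2])
print(rebuilt == stripe[1])
```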

/Aaron

----- Original Message -----
From: "Alan Stange" <stange@rentec.com>
To: "Joseph Shraibman" <jks@selectacast.net>
Cc: "J. Andrew Rogers" <jrogers@neopolitan.com>;
<pgsql-performance@postgresql.org>
Sent: Monday, May 03, 2004 11:03 PM
Subject: Re: [PERFORM] linux distro for better pg performance


> Joseph Shraibman wrote:
>
> > J. Andrew Rogers wrote:
> >
> >> Do these features make a difference?  Far more than you would
> >> imagine. On one postgres server I just upgraded, we went from a 3Ware
> >> 8x7200-RPM RAID-10 configuration to an LSI 320-2 SCSI 3x10k RAID-5,
> >> with 256M
> >
> > Is raid 5 much faster than raid 10?  On a 4 disk array with 3 data
> > disks and 1 parity disk, you have to write 4/3rds the original data,
> > while on raid 10 you have to write 2 times the original data, so
> > logically raid 5 should be faster.
>
> I think this comparison is a bit simplistic.   For example, most raid5
> setups have full stripes that are more than 8K  (the typical IO size in
> postgresql), so one might have to read in portions of the stripe in
> order to compute the parity.   The needed bits might be in some disk or
> controller cache;  if it's not then you lose.   If one is able to
> perform full stripe writes then the raid5 config should be faster for
> writes.
>
> Note also that the mirror has 2 copies of the data, so that the read IOs
> would be divided across 2 (or more) spindles using round robin or a more
> advanced algorithm to reduce seek times.
>
> Of course, I might be completely wrong...
>
> -- Alan
>
