Re: linux distro for better pg performance - Mailing list pgsql-performance
From | Aaron Werman |
---|---|
Subject | Re: linux distro for better pg performance |
Date | |
Msg-id | BAY18-DAV5o8C7wyfwj0000292d@hotmail.com |
In response to | linux distro for better pg performance (pginfo <pginfo@t1.unisoftbg.com>) |
List | pgsql-performance |
The comparison is actually dead on. If you have lots of write through / read behind cache, RAID 5 can run very quickly, until the write rate overwhelms the cache - at which point the 4 I/Os per write / 2 per read stop it. This means that RAID 5 works, except when stressed, which is a bad paradigm.

If you do streaming sequential writes on a 4-drive RAID 5, 4 writes become:

- read drive 1 for data
- read drive 3 for parity
- write changes to drive 1
- write changes to drive 3

- read drive 2 for data
- read drive 4 for parity
- write changes to drive 2
- write changes to drive 4

- read drive 3 for data
- read drive 1 for parity
- write changes to drive 3
- write changes to drive 1

- read drive 4 for data
- read drive 2 for parity
- write changes to drive 4
- write changes to drive 2

or

drive 1: 2 reads, 2 writes
drive 2: 2 reads, 2 writes
drive 3: 2 reads, 2 writes
drive 4: 2 reads, 2 writes

in other words, 16 I/Os, evenly distributed. These have to be ordered to be recoverable (otherwise the parity scheme is broken and you can't recover), and thus are quasi-synchronous.

The same on RAID 10 is:

- write changes to drive 1
- write copy of changes to drive 2

- write changes to drive 1
- write copy of changes to drive 2

- write changes to drive 1
- write copy of changes to drive 2

- write changes to drive 1
- write copy of changes to drive 2

or

drive 1: 4 I/Os
drive 2: 4 I/Os

in other words, 4 I/Os per drive, in parallel. There is no wait on streaming I/O on RAID 10, and this fact is the other main reason RAID 10 gives an order of magnitude better performance.

If you are writing full blocks in a streaming mode, RAID 3 will be the fastest - it is RAID 0 with a parity drive.

In every situation where I've seen it, RAID 5 was either generally slow or got applications into trouble during stress: bulk loads, etc. Most DBAs end up on RAID 10 for its predictability and performance.
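The I/O accounting above can be reproduced with a short Python sketch. It is only an illustration of the counting argument, not a model of any real controller: the function names are hypothetical, the parity placement (parity sits two drives over from the data drive) simply copies the example in this message, and caching and full-stripe writes are ignored.

```python
from collections import Counter


def raid5_small_writes(data_drives, n_drives=4):
    """Count per-drive I/Os for read-modify-write updates on RAID 5.

    Assumption: the parity for a block on drive d lives two drives over,
    which matches the 4-drive example in this message. Real arrays rotate
    parity according to their own layout.
    """
    reads, writes = Counter(), Counter()
    for d in data_drives:
        p = (d - 1 + 2) % n_drives + 1  # assumed parity drive for this block
        # Read old data and old parity, then write new data and new parity.
        reads[d] += 1
        reads[p] += 1
        writes[d] += 1
        writes[p] += 1
    return reads, writes


def raid10_writes(n_writes, mirror_pair=(1, 2)):
    """Count per-drive I/Os for the same writes on one mirrored pair."""
    writes = Counter()
    for _ in range(n_writes):
        for drive in mirror_pair:  # the data and its copy, no reads needed
            writes[drive] += 1
    return writes


r5_reads, r5_writes = raid5_small_writes([1, 2, 3, 4])
r10_writes = raid10_writes(4)

raid5_total = sum(r5_reads.values()) + sum(r5_writes.values())
raid10_total = sum(r10_writes.values())

for drive in range(1, 5):
    print(f"RAID 5  drive {drive}: {r5_reads[drive]} reads, {r5_writes[drive]} writes")
for drive in sorted(r10_writes):
    print(f"RAID 10 drive {drive}: {r10_writes[drive]} I/Os")
print(f"total I/Os: RAID 5 = {raid5_total}, RAID 10 = {raid10_total}")
```

Under these assumptions the RAID 5 case costs four I/Os per logical write, two of which are reads that must complete before the writes can be issued, while the mirrored case issues only the two writes.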
/Aaron

----- Original Message -----
From: "Alan Stange" <stange@rentec.com>
To: "Joseph Shraibman" <jks@selectacast.net>
Cc: "J. Andrew Rogers" <jrogers@neopolitan.com>; <pgsql-performance@postgresql.org>
Sent: Monday, May 03, 2004 11:03 PM
Subject: Re: [PERFORM] linux distro for better pg performance

> Joseph Shraibman wrote:
>
> > J. Andrew Rogers wrote:
> >
> >> Do these features make a difference? Far more than you would
> >> imagine. On one postgres server I just upgraded, we went from a 3Ware
> >> 8x7200-RPM RAID-10 configuration to an LSI 320-2 SCSI 3x10k RAID-5,
> >> with 256M
> >
> > Is raid 5 much faster than raid 10? On a 4 disk array with 3 data
> > disks and 1 parity disk, you have to write 4/3rds the original data,
> > while on raid 10 you have to write 2 times the original data, so
> > logically raid 5 should be faster.
>
> I think this comparison is a bit simplistic. For example, most raid5
> setups have full stripes that are more than 8K (the typical IO size in
> postgresql), so one might have to read in portions of the stripe in
> order to compute the parity. The needed bits might be in some disk or
> controller cache; if it's not then you lose. If one is able to
> perform full stripe writes then the raid5 config should be faster for
> writes.
>
> Note also that the mirror has 2 copies of the data, so that the read IOs
> would be divided across 2 (or more) spindles using round robin or a more
> advanced algorithm to reduce seek times.
>
> Of course, I might be completely wrong...
>
> -- Alan