On Wed, 11 Jul 2007, Jim Nasby wrote:
> I suppose an entirely in-memory database might be able to swamp a 2
> drive WAL as well.
You can really generate a whole lot of WAL volume on an EMC SAN if you're
doing UPDATEs fast enough on data that is mostly in memory. It takes a
fairly specific type of application to do that, though, and whether you'll
ever find it outside of a benchmark is hard to say.
The main thing I would add as a consideration here is that you can
configure PostgreSQL to write WAL data using the O_DIRECT path, bypassing
the OS buffer cache, which can greatly improve write performance to
SAN-grade hardware like this. That can be a big win if you're doing writes that
dirty lots of WAL, and the benefit is straightforward to measure if the
WAL is a dedicated section of disk (just change the wal_sync_method and do
benchmarks with each setting). If the WAL is just another section on an
array, how well those synchronous writes will mesh with the rest of the
activity on the system is not as straightforward to predict. Having the
WAL split out provides a logical separation that makes figuring all this
out easier.
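If it helps as a starting point, here's the sort of postgresql.conf sketch
I'd cycle through when benchmarking a dedicated WAL volume; which
wal_sync_method values exist and which ones actually get the O_DIRECT
treatment vary by platform and PostgreSQL version, so treat the comments
as assumptions to verify on your own system:

  # postgresql.conf -- candidate WAL sync settings, benchmark one at a time
  fsync = on                        # leave on; we're only changing how the sync happens
  #wal_sync_method = fdatasync      # typically the default on Linux
  #wal_sync_method = fsync
  wal_sync_method = open_sync       # open_sync/open_datasync can use O_DIRECT where supported
  #wal_sync_method = open_datasync

Flip to one value at a time, restart the server between runs (a reload may
be enough depending on the version), and rerun the same write-heavy
benchmark against each setting.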
Just to throw out a slightly different spin on the suggestions going by
here: consider keeping the WAL separate, starting as a RAID-1 volume, but
keep 2 disks in reserve so that you could easily upgrade to a 0+1 set if
you end up discovering you need to double the write bandwidth. Since
there's never much actual data on the WAL disks, that would be a fairly short
downtime operation. If you don't reach a wall, the extra drives might
serve as spares to help mitigate concerns about the WAL drives burning out
faster than average because of their high write volume.
--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD