Re: Postgresql and Software RAID/LVM - Mailing list pgsql-performance

From John A Meinel
Subject Re: Postgresql and Software RAID/LVM
Date
Msg-id 42A52465.2000302@arbash-meinel.com
In response to Re: Postgresql and Software RAID/LVM  (Marty Scholes <marty@outputservices.com>)
List pgsql-performance
Marty Scholes wrote:
>> Has anyone run Postgres with software RAID or LVM on a production box?
>> What has been your experience?
>
> Yes, we have run Pg for a couple of years with software LVM (mirroring)
> against two hardware RAID5 arrays.  We host a production Sun box that
> runs 24/7.
>
> My experience:
> * Software RAID (other than mirroring) is a disaster waiting to happen.
>  If the metadata for the RAID set gives out for any reason (CMOS
> scrambles, card dies, power spike, etc.) then you are hosed beyond
> belief.  In most cases it is almost impossible to recover.  With
> mirroring, however, you can always boot and operate on a single mirror,
> pretending that no LVM/RAID is underway.  In other words, each mirror is
> a fully functional copy of the data which will operate your server.

Isn't losing the metadata actually more of a problem with hardware
RAID? I mean, if the card you are using dies, you can't just get
another one.
With software RAID, because the metadata is on the drives, you can pull
the drives out of that machine and put them into any machine that has a
controller which can read the drives and a similar kernel, and you are
back up and running.
>
> * Hardware RAID5 is a terrific way to boost performance via write
> caching and spreading I/O across multiple spindles.  Each of our
> external arrays operates 14 drives (12 data, 1 parity and 1 hot spare).
>  While RAID5 protects against single spindle failure, it will not hedge
> against multiple failures in a short time period, SCSI controller
> failure, SCSI cable problems or even wholesale failure of the RAID
> controller.  All of these things happen in a 24/7 operation.  Using
> software RAID1 against the hardware RAID5 arrays hedges against any
> single failure.

No, it hedges against *more* than one failure. You can also do a
RAID1 over a RAID5 in software. But if you are honestly willing to
create a full RAID1, just create a RAID1 over RAID0; the performance is
much better. And since you have a full RAID1, you can lose half of your
drives, as long as both drives of a pairing don't give out.
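
To put rough numbers on the "lose half of your drives" point, here is a
minimal Python sketch. It is only an illustration under assumptions of
mine: the drive numbering and the function names are hypothetical, and
since the post does not spell out the exact layout, it models both a
stripe over mirrored pairs and a mirror of two stripes. It enumerates
every possible set of failed drives and reports how many the array
survives.

```python
from itertools import combinations


def survives_striped_mirrors(n_pairs, failed):
    """Stripe over mirrored pairs: drives 2k and 2k+1 form pair k.

    The array survives as long as no pair has lost both of its drives.
    """
    failed = set(failed)
    return all(not ({2 * k, 2 * k + 1} <= failed) for k in range(n_pairs))


def survives_mirrored_stripes(n_per_side, failed):
    """Mirror of two stripes: drives 0..n-1 are side A, n..2n-1 are side B.

    A stripe dies with its first failed drive, so the array survives
    only while at least one side has no failed drive at all.
    """
    failed = set(failed)
    side_a_hit = any(d < n_per_side for d in failed)
    side_b_hit = any(d >= n_per_side for d in failed)
    return not (side_a_hit and side_b_hit)


def survival_fraction(survives, n_drives, n_failed):
    """Fraction of all n_failed-drive failure sets the array survives."""
    combos = list(combinations(range(n_drives), n_failed))
    return sum(1 for c in combos if survives(c)) / len(combos)


if __name__ == "__main__":
    for n_failed in (1, 2, 4):
        pairs = survival_fraction(
            lambda c: survives_striped_mirrors(4, c), 8, n_failed)
        sides = survival_fraction(
            lambda c: survives_mirrored_stripes(4, c), 8, n_failed)
        print(n_failed, "failed:", round(pairs, 3), round(sides, 3))
```

Both layouts can survive losing half of the drives, but only for the
lucky combinations: one drive from each pair in the first layout, or
every failure landing on the same side in the second.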

If you want the space, but you feel that RAID5 isn't redundant enough,
go to RAID6, which uses 2 parity locations, each with a different method
of storing parity, so not only is it more redundant, you have a better
chance of finding problems.
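
As a rough illustration of why two differently computed parities help
with "finding problems", here is a small Python sketch of one common
P+Q construction: P is a plain XOR, Q is a weighted sum in GF(2^8). The
field polynomial (0x11d), the generator (2), the block contents and the
function names are my assumptions for the example, not something stated
above, and this is a toy rather than what any particular controller or
driver does. It assumes exactly one data block was silently corrupted
and that P and Q themselves are intact.

```python
# Log/antilog tables for GF(2^8), polynomial 0x11d, generator 2 (assumed).
EXP = [0] * 510
LOG = [0] * 256
_value = 1
for _power in range(255):
    EXP[_power] = _value
    EXP[_power + 255] = _value  # duplicate so LOG[a] + LOG[b] needs no modulo
    LOG[_value] = _power
    _value <<= 1
    if _value & 0x100:
        _value ^= 0x11D


def gf_mul(a, b):
    """Multiply two bytes in GF(2^8)."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def compute_pq(blocks):
    """Return (P, Q) for a list of equally sized data blocks."""
    size = len(blocks[0])
    p = bytearray(size)
    q = bytearray(size)
    for index, block in enumerate(blocks):
        coeff = EXP[index]  # generator raised to the block index
        for pos, byte in enumerate(block):
            p[pos] ^= byte                 # P: plain XOR parity
            q[pos] ^= gf_mul(coeff, byte)  # Q: weighted parity
    return bytes(p), bytes(q)


def locate_corrupt_block(blocks, p, q):
    """Index of the single corrupted data block, or None if all is consistent."""
    new_p, new_q = compute_pq(blocks)
    for pos in range(len(p)):
        p_syn = p[pos] ^ new_p[pos]  # equals the error byte e
        q_syn = q[pos] ^ new_q[pos]  # equals g**index * e
        if p_syn and q_syn:
            return (LOG[q_syn] - LOG[p_syn]) % 255
    return None


if __name__ == "__main__":
    data = [b"pgdata01", b"pgdata02", b"pgdata03", b"pgdata04"]
    p, q = compute_pq(data)
    damaged = list(data)
    damaged[2] = b"pgdXta03"  # flip one byte in block 2
    print(locate_corrupt_block(damaged, p, q))
```

With XOR parity alone you can tell that a stripe is inconsistent but
not which block is wrong; the second parity is what lets the sketch
point at the offending block.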

>
> * Software mirroring gives you tremendous ability to change the system
> while it is running, by taking offline the mirror you wish to change and
> then synchronizing it after the change.
>

That certainly is a nice ability. But remember that LVM also has the
idea of taking a "snapshot" of a running system. I don't know the exact
details, just that there is a way to have some processes see the
filesystem as it existed at an exact point in time, which is also a
great way to handle backups.
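
For what it's worth, the general idea behind a point-in-time view can
be sketched with a toy copy-on-write model in Python. This is purely
conceptual: the `Volume` and `Snapshot` classes are hypothetical names
of mine, and I am not claiming this is how LVM implements it
internally. The point is only that a snapshot can be cheap, because it
stores just the old contents of blocks that change after it was taken.

```python
class Snapshot:
    """Frozen view of a Volume as of the moment the snapshot was taken."""

    def __init__(self, origin):
        self.origin = origin
        self.saved = {}  # block index -> contents at snapshot time

    def read(self, index):
        # Blocks changed since the snapshot come from the saved copies;
        # untouched blocks are read straight from the live volume.
        if index in self.saved:
            return self.saved[index]
        return self.origin.blocks[index]


class Volume:
    """Toy block volume supporting copy-on-write snapshots."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.snapshots = []

    def snapshot(self):
        snap = Snapshot(self)
        self.snapshots.append(snap)
        return snap

    def write(self, index, data):
        # Before overwriting, keep the old contents for every snapshot
        # that has not already saved its own copy of this block.
        for snap in self.snapshots:
            snap.saved.setdefault(index, self.blocks[index])
        self.blocks[index] = data


if __name__ == "__main__":
    vol = Volume(["page-a", "page-b", "page-c"])
    snap = vol.snapshot()
    vol.write(1, "page-b-updated")   # the live system keeps writing
    print(vol.blocks[1])             # what the running database sees
    print(snap.read(1))              # what a backup reading the snapshot sees
```

A backup process reading through the snapshot gets a consistent image
while the live volume keeps taking writes.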

> On a fully operational production server, we have:
> * restriped the RAID5 array
> * replaced all RAID5 media with higher capacity drives
> * upgraded RAID5 controller
> * moved all data from an old RAID5 array to a newer one
> * replaced host SCSI controller
> * uncabled and physically moved storage to a different part of data center
>
> Again, all of this has taken place (over the years) while our machine
> was fully operational.
>
So you are saying that you were able to replace the RAID controller
without turning off the machine? I realize there do exist
hot-swappable PCI cards, but I think you are overstating what you mean
by "fully operational". For instance, it's not like you can access your
data while it is being physically moved.

I do think you had some nice hardware. But I know you can do all of this
in software as well. It is usually a price/performance tradeoff. You
spend quite a bit to get a hardware RAID card that can keep up with a
modern CPU. I know we have an FC RAID box at work which has a full 512MB
of cache on it, but it wasn't that much cheaper than buying a dedicated
server.

John
=:->
