Re: Performance while loading data and indexing - Mailing list pgsql-general

From scott.marlowe
Subject Re: Performance while loading data and indexing
Date
Msg-id Pine.LNX.4.33.0209270907500.9417-100000@css120.ihs.com
In response to Re: Performance while loading data and indexing  (Mats Lofkvist <mal@algonet.se>)
List pgsql-general
On 27 Sep 2002, Mats Lofkvist wrote:

> shridhar_daithankar@persistent.co.in ("Shridhar Daithankar") writes:
>
> [snip]
> >
> > Couple MB of data per sec. to disk is just not saturating it. It's a RAID 5
> > setup..
> >
>
> RAID5 is not the best for performance, especially write performance.
> If it is software RAID it is even worse :-).

I take exception to this.  RAID5 is a great choice for most folks.

1:  RAID5 only writes out the parity stripe and data stripe, not all
stripes when writing.  So, in an 8 disk RAID5 array, writing to a single
64 k stripe involves one 64k read (parity stripe) and two 64k writes.

On a mirror set, writing to one 64k stripe involves two 64k writes.  The
difference isn't that great, and in my testing, a large enough RAID5
provides so much faster read speeds by spreading the reads across so many
heads as to more than make up for the slightly slower writes.  My testing
has shown that a 4 disk RAID5 can generally run at about 85% or more of the
speed of a mirror set.
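
To make the parity bookkeeping in point 1 concrete, here is a minimal
sketch of a single-chunk RAID5 update, assuming plain XOR parity and a
read-modify-write update.  The names (small_write, parity_of, CHUNK) are
made up for illustration and are not taken from any real RAID driver.
Note that this scheme needs the old data chunk as well as the old parity
chunk, so it counts two reads unless the old data is already in cache.

```python
import os

CHUNK = 64 * 1024  # 64k chunk, the size used in the post


def xor(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length chunks."""
    n = int.from_bytes(a, "big") ^ int.from_bytes(b, "big")
    return n.to_bytes(len(a), "big")


def parity_of(chunks):
    """Parity chunk = XOR of all data chunks in the stripe."""
    parity = bytes(len(chunks[0]))
    for chunk in chunks:
        parity = xor(parity, chunk)
    return parity


def small_write(data_chunks, parity, index, new_data):
    """Hypothetical read-modify-write update of one data chunk.

    Returns the new parity chunk and the I/O count for this model.
    """
    old_data = data_chunks[index]   # read: old data (free if cached)
    # The old parity chunk is the second read in this model.
    new_parity = xor(xor(parity, old_data), new_data)
    data_chunks[index] = new_data   # write: new data
    # Writing new_parity back is the second write.
    io = {"reads": 2, "writes": 2}
    return new_parity, io


# 8-disk array: 7 data chunks plus 1 parity chunk per stripe.
data = [os.urandom(CHUNK) for _ in range(7)]
parity = parity_of(data)
parity, io = small_write(data, parity, 3, os.urandom(CHUNK))
print(parity == parity_of(data), io)
```

The other six disks in the stripe are never touched, which is the point
being made above: only the data chunk and the parity chunk are rewritten.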

2:  Why does EVERYONE have to jump on the bandwagon that software RAID 5
is bad?  My workstation running RH 7.2 uses about 1% of the CPU at most
during very heavy parallel access (i.e. 50 simultaneous pgbench runs).
I've seen many hardware RAID cards that are noticeably slower than my
workstation running software RAID.  You do know that hardware RAID is just
software RAID where the processing is done on a separate CPU on a card, but
it's still software doing the work.

3:  We just had a hardware RAID card mark both drives in a mirror set bad.
It wouldn't accept them back, and all the data was gone.  Poof.  That
would never happen with Linux's kernel software RAID; I can always make
Linux take back a "bad" drive.


The only difference between RAID5 with n+1 disks and RAID0 with n disks is
that we have to write a parity stripe in RAID5.  Its ability to handle
high parallel load is much better than a RAID1 set, and on average, you
actually write about the same amount with either RAID1 or RAID5.
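
As a rough check on the "about the same amount" claim, here is a small
sketch that counts physical chunks written per logical write.  It is a
simplified model under stated assumptions: a two-way mirror for RAID1, one
parity chunk per stripe for RAID5, and no caching or write coalescing.  The
function name is hypothetical.

```python
def physical_chunks_written(level: str, logical_chunks: int, data_disks: int) -> int:
    """Count chunks that hit disk for a write of `logical_chunks` chunks.

    Simplified model: RAID0 writes each chunk once, RAID1 (two-way
    mirror) writes each chunk twice, and RAID5 writes the data plus one
    parity chunk for every stripe it touches.
    """
    if level == "raid0":
        return logical_chunks
    if level == "raid1":
        return 2 * logical_chunks
    if level == "raid5":
        full_stripes, remainder = divmod(logical_chunks, data_disks)
        written = full_stripes * (data_disks + 1)
        if remainder:
            written += remainder + 1  # partial stripe: data + one parity
        return written
    raise ValueError(f"unknown RAID level: {level}")


# 8-disk RAID5 has 7 data disks per stripe.
for size in (1, 7):
    print(size, {lvl: physical_chunks_written(lvl, size, 7)
                 for lvl in ("raid0", "raid1", "raid5")})
```

In this model a single-chunk write costs two chunk writes on both RAID1
and RAID5, and a full-stripe write on RAID5 costs only one chunk more than
RAID0 with the same number of data disks.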

Don't dog software RAID5, it works and it works well in Linux.  Windows,
however, is another issue.  There, the software RAID5 is pretty pitiful,
both in terms of performance and maintenance.

