Re: Tables on multiple disk drives - Mailing list pgsql-performance

From scott.marlowe
Subject Re: Tables on multiple disk drives
Date
Msg-id Pine.LNX.4.33.0402171045480.30812-100000@css120.ihs.com
Whole thread Raw
In response to Re: Tables on multiple disk drives  (Craig Thomas <craiger@osdl.org>)
Responses Re: Tables on multiple disk drives  (Craig Thomas <craiger@osdl.org>)
Re: Tables on multiple disk drives  (Josh Berkus <josh@agliodbs.com>)
List pgsql-performance
On Tue, 17 Feb 2004, Craig Thomas wrote:

> > On Tue, 17 Feb 2004, Konstantin Tokar wrote:
> >
> >> Hi!
> >> Does PostgreSQL allow to create tables and indices of a single
> >> database on multiple disk drives with a purpose of increase
> >> performance as Oracle database does? If a symbolic reference is the
> >> only method then the next question is: how can it be determined what
> >> file is referred to what table and index?
> >
> > You're life will be simpler, and your setup will be faster without
> > having  to muck about with it, if you just buy a good RAID controller
> > with battery  backed cache.  LSI/Megaraid and Adaptec both make
> > serviceable controllers  for reasonable prices, and as you add drives,
> > the speed just goes up, no  muddling around with sym links.
>
> This works to a limited extent.  For very large databases, maximum
> throughput of I/O is the paramount factor for database performance.  With
> raid controllers, your LUN is still limited to a small number of disks.
> PostgreSQL can only write on a file system, but Oracle, SAP DB, DB2, etc
> can write directly to disk (raw I/O).  With large databases it is
> advantageous to spread a table across 100's of disks, if the table is
> quite large.  I don't know of any manufacturer that creates a 100 disk
> raid array yet.

You can run up to four LSI / Megaraids in one box, each with 3 UW SCSI
interfaces, and they act as one unit.  That's 3*4*15 = 180 disks max.

With FC AL connections and four cards, it would be possible to approach
1000 drives.

Of course, I'm not sure how fast any RAID card setup is gonna be with that
many drives, but ya never know.  My guess is that before you go there you
buy a big external RAID box built for speed.  We have a couple of 200+
drive external RAID5 storage boxes at work that are quite impressive.

> Some of the problem can be addressed by using a volume manager (such as
> LVM in Linux, or Veritas on Unix-like systems).  This allows one to
> create a volume using partitions from many disks.  One can then create
> a file system and mount it on the volume.

Pretty much RAID arrays in software, which means no battery backed cache,
which means it'll be fast at reading, but probably pretty slow at writes,
epsecially if there's a lot of parallel access waiting to write to the
database.

> However, to get the best performance, Raw I/O capability is the best
> way to go.

Unsupported statement made as fact.  I'm not saying it can't or isn't
true, but my experience has been that large RAID5 arrays are a great
compromise between maximum performance and reliability, giving a good
measure of each.  It doesn't take 100 drives to do well, even a dozen to
two dozen will get you in the same basic range as splitting out files by
hand with sym links without all the headache of chasing down the files,
shutting down the database, linking it over etc...



pgsql-performance by date:

Previous
From: Craig Thomas
Date:
Subject: Re: Tables on multiple disk drives
Next
From: Craig Thomas
Date:
Subject: Re: Tables on multiple disk drives