Thread: RAID or manual split?

RAID or manual split?

From
Mike Glover
Date:
It seems, that if I know the type and frequency of the queries a
database will be seeing, I could split the database by hand over
multiple disks and get better performance that I would with a RAID array
with similar hardware.  Most of the data is volatile
and easily replaceable (and the rest is backed up independently), so
redundancy isn't importand, and I'm willing to do some ongoing
maintenance if I can get a decent speed boost.  Am I misguided, or might
this work? details of my setup are below:

Six large (3-7 Mrow) 'summary' tables, each being updated continuously
by 5-20 processes with about 0.5 transactions/second/process.

Periodically (currently every two weeks), join queries are
performed between one of the 'summary' tables(same one each time) and
each of the other five.  Each join touches most rows of both tables,
indexes aren't used.  Results are written into a separate group of
'inventory' tables (about 500 Krow each), one for each join.

There are frequent (100-1000/day) queries of both the
inventory and summary tables using the primary key -- always using the
index and returning < 10 rows.

We're currently getting (barely) acceptable performance from a single
15k U160 SCSI disk, but db size and activity are growing quickly.
I've got more disks and a battery-backed LSI card on order.

-mike


--
Mike Glover
GPG Key ID BFD19F2C <mpg4@duluoz.net>

Attachment

Re: RAID or manual split?

From
matt@ymogen.net
Date:
> It seems, that if I know the type and frequency of the queries a
> database will be seeing, I could split the database by hand over
> multiple disks and get better performance that I would with a RAID array
> with similar hardware.

Unlikely, but possible if you had radically different hardware for
different tables.

> Six large (3-7 Mrow) 'summary' tables, each being updated continuously
> by 5-20 processes with about 0.5 transactions/second/process.

Well you should get close to an order of magnitude better performance from
a RAID controller with write-back cache on those queries.

> Periodically (currently every two weeks), join queries are
> performed between one of the 'summary' tables(same one each time) and
> each of the other five.  Each join touches most rows of both tables,
> indexes aren't used.  Results are written into a separate group of
> 'inventory' tables (about 500 Krow each), one for each join.

The more disks the data is spread over the better (the RAID controller
will help here with striping).

> There are frequent (100-1000/day) queries of both the
> inventory and summary tables using the primary key -- always using the
> index and returning < 10 rows.

RAM is what you need, to cache the data and indexes, and then as much CPU
power as you can get.

> We're currently getting (barely) acceptable performance from a single
> 15k U160 SCSI disk, but db size and activity are growing quickly.
> I've got more disks and a battery-backed LSI card on order.

3 or more disks in a stripe set, with write back caching, will almost
certainly give a huge performance boost.  Try that first, and only if you
have issues should you think about futzing with symlinks etc.

M

Re: RAID or manual split?

From
Rod Taylor
Date:
> There are frequent (100-1000/day) queries of both the
> inventory and summary tables using the primary key -- always using the
> index and returning < 10 rows.

For this query frequency I don't think splitting the drives will do much
-- you just need more IO. Look at optimizing the query themselves,
specifically ensuring the useful information is already in memory.

If the 10 rows are recent, you might try using partial indexes with the
last days worth of information instead of an index across the entire
table.

> We're currently getting (barely) acceptable performance from a single
> 15k U160 SCSI disk, but db size and activity are growing quickly.
> I've got more disks and a battery-backed LSI card on order.

Configure for Raid 10 and you're off.

--
Rod Taylor <rbt [at] rbt [dot] ca>

Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL
PGP Key: http://www.rbt.ca/signature.asc

Attachment