Thread: n00b RAID + wal hot standby question

n00b RAID + wal hot standby question

From
"Anton Melser"
Date:
Hi,
I am just starting at a company and we are inheriting a previously
built solution. It looks pretty good but my previous experience with
pg is seriously small-time compared with this...
I am very new at the job, and don't know what hd config we have but it
will be RAID-something I imagine (hey I was working with desktop
"servers" before this!). If that is very important I can find out. We
seem to be saving our WAL to the same partition as PGDATA, and I
notice that we are maxing out a reasonable looking server. The db is
not very big (~4gig, 400meg pgdump), and though I can't see any vacuum
strategy (autovacuum on a 8.1.4 is disabled), we didn't have as much
consistent CPU usage at my old job (with a 6 gig db and MUCH less CPU
and RAM, and probably as many connections), and my vacuum strategy was
also pitiful!  Sure, completely different environments, but I am
thinking that WAL replication could be a factor.
So my question... being in complete ignorance of how RAID works (the
performance details)... would it be better to try and separate the WAL
destination from PGDATA? How much of a difference could it make?
Should we wait till the customer starts complaining (no explosion in
traffic/db size realistic for the foreseeable future...)?
Any abuse welcome.
Cheers
Antoine

Re: n00b RAID + wal hot standby question

From
"Brandon Aiken"
Date:
Unless you can separate PGDATA and the WAL destination to be on wholly
independent physical disks and not just different partitions of the same
hardware array, the physical limitations will still be present.

I believe the recommended method is to use RAID 5 or RAID 10 data
partitions and then use RAID 1 for transaction logs.  Additionally,
you're supposed to match the stripes size of the arrays to the block
sizes of your database, but I can never remember the math involved to do
it.

Database guides like this are still a bit beyond what I can understand:
http://www.lc.leidenuniv.nl/awcourse/oracle/server.920/a96520/hardware.h
tm

This one is a bit easier:
http://www.dbazine.com/oracle/or-articles/ault1

This is the best RAID primer I've seen:
http://www.pcguide.com/ref/hdd/perf/raid/index.htm



I'm not convinced I/O is your problem, though.  High CPU and memory
usage is indicative of many different problems, and poor disk I/O is
usually not one of them.  In a modern system, I'd expect to see poor
disk I/O causing *low* CPU usage combined with poor SELECT and awful
INSERT/UPDATE/DELETE performance.  Maybe it's caching the database state
in memory while it's waiting for writing, though.

It seems more likely that the database is either pushing more
transactions per minute, pushing more complex transactions, dealing with
larger queries and result sets, maintaining more indexes, or running
more complex pl/SQL procedures, triggers, and constraints.

Additionally, if my understanding is right then running with autovacuum
disabled and no batch process vacuum strategy on a database with lots of
INSERTs and DELETEs is essentially like running without indexes.


--
Brandon Aiken
CS/IT Systems Engineer

-----Original Message-----
From: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] On Behalf Of Anton Melser
Sent: Monday, December 04, 2006 4:11 PM
To: pgsql-general@postgresql.org
Subject: [GENERAL] n00b RAID + wal hot standby question

Hi,
I am just starting at a company and we are inheriting a previously
built solution. It looks pretty good but my previous experience with
pg is seriously small-time compared with this...
I am very new at the job, and don't know what hd config we have but it
will be RAID-something I imagine (hey I was working with desktop
"servers" before this!). If that is very important I can find out. We
seem to be saving our WAL to the same partition as PGDATA, and I
notice that we are maxing out a reasonable looking server. The db is
not very big (~4gig, 400meg pgdump), and though I can't see any vacuum
strategy (autovacuum on a 8.1.4 is disabled), we didn't have as much
consistent CPU usage at my old job (with a 6 gig db and MUCH less CPU
and RAM, and probably as many connections), and my vacuum strategy was
also pitiful!  Sure, completely different environments, but I am
thinking that WAL replication could be a factor.
So my question... being in complete ignorance of how RAID works (the
performance details)... would it be better to try and separate the WAL
destination from PGDATA? How much of a difference could it make?
Should we wait till the customer starts complaining (no explosion in
traffic/db size realistic for the foreseeable future...)?
Any abuse welcome.
Cheers
Antoine

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

Re: n00b RAID + wal hot standby question

From
Scott Marlowe
Date:
On Mon, 2006-12-04 at 15:11, Anton Melser wrote:
> Hi,
> I am just starting at a company and we are inheriting a previously
> built solution. It looks pretty good but my previous experience with
> pg is seriously small-time compared with this...

OK, how you set up RAID depends largely on how you'll be using your
database.

First things first though...

Since your vacuum looks to have been turned off for a while, I'd
recommend doing a vacuum full and a reindexdb and see if that helps,
then either schedule regular backups or let autovacuum do it from now
on.

After that, you can worry about setting up your RAID.  Generally RAID 10
is better for transactional dbs (airline reservations, e commerce
etc...) while RAID5 or 50 is often better for large reporting
databases.  This depends on your controller of course.  Some can layer
RAID levels and some can't, and some that can probably shouldn't try
(i.e. their performance is really bad when layering RAID levels.)

Putting your xlog on physically separate disks helps IF you are writing
enough to justify it.  If you run bulk updates at 2 in the morning and
the rest of the day <0.1% of your db activity is insert / update then
you would be better spending your time elsewhere.

Re: n00b RAID + wal hot standby question

From
Casey Duncan
Date:
On Dec 4, 2006, at 1:11 PM, Anton Melser wrote:

> Hi,
> I am just starting at a company and we are inheriting a previously
> built solution. It looks pretty good but my previous experience with
> pg is seriously small-time compared with this...
> I am very new at the job, and don't know what hd config we have but it
> will be RAID-something I imagine (hey I was working with desktop
> "servers" before this!). If that is very important I can find out. We
> seem to be saving our WAL to the same partition as PGDATA, and I
> notice that we are maxing out a reasonable looking server. The db is
> not very big (~4gig, 400meg pgdump), and though I can't see any vacuum
> strategy (autovacuum on a 8.1.4 is disabled), we didn't have as much
> consistent CPU usage at my old job (with a 6 gig db and MUCH less CPU
> and RAM, and probably as many connections), and my vacuum strategy was
> also pitiful!  Sure, completely different environments, but I am
> thinking that WAL replication could be a factor.
> So my question... being in complete ignorance of how RAID works (the
> performance details)... would it be better to try and separate the WAL
> destination from PGDATA? How much of a difference could it make?
> Should we wait till the customer starts complaining (no explosion in
> traffic/db size realistic for the foreseeable future...)?
> Any abuse welcome.

When you say maxing out, what do you mean? Posting some vmstat output
under load would be very helpful (assuming *nix, otherwise can't help
you ;^).

My very general RAID recommendation would be:

Put the wal on a 2 disk RAID-1 array either by itself or together
with the system if you expect little disk activity from the rest of
the system.

Put the data on a RAID-10 array (at least 4 disks, but more would be
much better).

As for the hardware itself, we've had good luck with 10k rpm WD
raptor SATA drives + 3ware 95xx raid controllers (the 9550SX being
preferred due to their smarter caching and higher bandwidth).

THe most important thing though is too keep as much of the database
in RAM as possible. That means putting as much RAM in the database
box as you can afford and not sharing it with other processes (i.e.,
move other RAM-hungry things to another box). And configure postgres
to use the available RAM by tuning shared_buffers, work_mem, etc.

Tuning checkpoint_segments and wal_buffers can help with write
performance.

And running autovacuum (or regular full database vacuums) is very
important to the performance of the database over time. Otherwise you
will definitely notice significant performance degradation as the
garbage in the tables mounts (unless of course the db is read-only).
If it's been off for a long time (or never run), you'll probably need
to spend some quality time doing a "VACUUM FULL" and possibly a
"REINDEX DATABASE" first.

-Casey

Re: n00b RAID + wal hot standby question

From
"Anton Melser"
Date:
Thanks all for your very insightful and helpful answers. I will be
able to really spend some time thinking about how the db will evolve
(and so whether it is worth thinking about a change) in a week or so
and will be able to think more on your answers then.
Cheers
Antoine