Thread: Possible Redundancy/Performance Solution

Possible Redundancy/Performance Solution

From

Dennis Muhlestein

Date:

06 May 2008, 13:59:46

Right now, we have a few servers that host our databases.  None of them
are redundant.  Each hosts databases for one or more applications.
Things work reasonably well but I'm worried about the availability of
some of the sites.  Our hardware is 3-4 years old at this point and I'm
not naive to the possibility of drives, memory, motherboards or whatever
failing.

I'm toying with the idea of adding a little redundancy and maybe some
performance to our setup.  First, I'd replace our sata hard drives with
a scsi controller and two scsi hard drives that run raid 0 (probably
running the OS and logs on the original sata drive).  Then I'd run the
previous two databases on one cluster of two servers with pgpool in
front (using the redundancy feature of pgpool).

Our applications are mostly read intensive.  I don't think that having
two databases on one machine, where previously we had just one, would
add too much of an impact, especially if we use the load balance feature
of pgpool as well as the redundancy feature.

Can anyone comment on any gotchas or issues we might encounter?  Do you
think this strategy has possibility to accomplish what I'm originally
setting out to do?

TIA
-Dennis

Re: Possible Redundancy/Performance Solution

From

Greg Smith

Date:

06 May 2008, 14:39:33

On Tue, 6 May 2008, Dennis Muhlestein wrote:

> First, I'd replace our sata hard drives with a scsi controller and two
> scsi hard drives that run raid 0 (probably running the OS and logs on
> the original sata drive).

RAID0 on two disks makes a disk failure that will wipe out the database
twice as likely.  If your goal is better reliability, you want some sort of
RAID1, which you can do with two disks.  That should increase read
throughput a bit (not quite double though) while keeping write throughput
about the same.

If you added four disks, then you could do a RAID1+0 combination which
should substantially outperform your existing setup in every respect while
also being more resilient to drive failure.
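
As a rough illustration of the arithmetic behind this, here is a minimal
Python sketch. It assumes each disk fails independently with the same
probability p over some period, and it ignores rebuild windows and
correlated failures, so treat the numbers as a simplification. The
function names and the value of p are made up for the example.

```python
def raid0_loss(p: float, disks: int = 2) -> float:
    """Striped array: losing any one disk loses the data."""
    return 1 - (1 - p) ** disks


def raid1_loss(p: float, disks: int = 2) -> float:
    """Mirrored array: data is lost only if every disk fails."""
    return p ** disks


def raid10_loss(p: float, pairs: int = 2) -> float:
    """Stripe over mirrored pairs: lost if any pair loses both of its disks."""
    return 1 - (1 - p ** 2) ** pairs


p = 0.05  # assumed per-disk failure probability, purely illustrative
print(f"single disk:     {p:.4f}")
print(f"RAID0, 2 disks:  {raid0_loss(p):.4f}")
print(f"RAID1, 2 disks:  {raid1_loss(p):.4f}")
print(f"RAID10, 4 disks: {raid10_loss(p):.4f}")
```

For small p the RAID0 figure is close to 2p, which is where "twice as
likely" comes from.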

> Our applications are mostly read intensive.  I don't think that having two
> databases on one machine, where previously we had just one, would add too
> much of an impact, especially if we use the load balance feature of pgpool as
> well as the redundancy feature.

A lot depends on how much RAM you've got and whether it's enough to keep
the cache hit rate fairly high here.  A reasonable thing to consider here
is doing a round of standard performance tuning on the servers to make
sure they're operating efficiently before increasing their load.
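
One way to put a number on the cache hit rate is to take the blks_hit and
blks_read counters from the pg_stat_database view and compute the ratio.
The sketch below only does the arithmetic; the sample counter values are
hypothetical. Note that blks_read counts blocks that missed PostgreSQL's
own buffer cache, and some of those may still have been served from the
operating system's cache, so this understates the overall hit rate.

```python
def cache_hit_ratio(blks_hit: int, blks_read: int) -> float:
    """Fraction of block requests served from PostgreSQL's shared buffers."""
    total = blks_hit + blks_read
    if total == 0:
        return 0.0  # no activity recorded yet
    return blks_hit / total


# Hypothetical counters, as they might be copied from pg_stat_database.
sample = {"blks_hit": 9_850_000, "blks_read": 150_000}
ratio = cache_hit_ratio(sample["blks_hit"], sample["blks_read"])
print(f"buffer cache hit ratio: {ratio:.3f}")
```

Comparing this figure before and after consolidating two databases onto
one machine would show whether the available RAM is still enough.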

> Can anyone comment on any gotchas or issues we might encounter?

Getting writes to replicate to multiple instances of the database usefully
is where all the really nasty gotchas are in this area.  Starting with
that part and working your way back toward the front-end pooling from
there should crash you into the hard parts early in the process.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: Possible Redundancy/Performance Solution

From

Dennis Muhlestein

Date:

06 May 2008, 15:59:49

Greg Smith wrote:
> On Tue, 6 May 2008, Dennis Muhlestein wrote:
>
>
> RAID0 on two disks makes a disk failure that will wipe out the database
> twice as likely.  If your goal is better reliability, you want some sort
> of RAID1, which you can do with two disks.  That should increase read
> throughput a bit (not quite double though) while keeping write
> throughput about the same.

I was planning on pgpool being the cushion between the raid0 failure
probability and my need for redundancy.  This way, I get protection
against not only disks, but cpu, memory, network cards, motherboards etc.
Is this not a reasonable approach?

>
> If you added four disks, then you could do a RAID1+0 combination which
> should substantially outperform your existing setup in every respect
> while also being more resilient to drive failure.
>
>> Our applications are mostly read intensive.  I don't think that having
>> two databases on one machine, where previously we had just one, would
>> add too much of an impact, especially if we use the load balance
>> feature of pgpool as well as the redundancy feature.
>
> A lot depends on how much RAM you've got and whether it's enough to keep
> the cache hit rate fairly high here.  A reasonable thing to consider
> here is doing a round of standard performance tuning on the servers to
> make sure they're operating efficiently before increasing their load.
>
>> Can anyone comment on any gotchas or issues we might encounter?
>
> Getting writes to replicate to multiple instances of the database
> usefully is where all the really nasty gotchas are in this area.
> Starting with that part and working your way back toward the front-end
> pooling from there should crash you into the hard parts early in the
> process.


Thanks for the tips!
Dennis

Re: Possible Redundancy/Performance Solution

From

Greg Smith

Date:

06 May 2008, 17:35:11

On Tue, 6 May 2008, Dennis Muhlestein wrote:

> I was planning on pgpool being the cushion between the raid0 failure
> probability and my need for redundancy.  This way, I get protection against
> not only disks, but cpu, memory, network cards, motherboards etc.  Is this
> not a reasonable approach?

Since disks are by far the most likely thing to fail, I think it would be
bad planning to switch to a design that doubles the chance of a disk
failure taking out the server just because you're adding some server-level
redundancy.  Anybody who's been in this business for a while will tell you
that seemingly improbable double failures happen, and if I were you I'd want
a plan that survived a) a single disk failure on the primary and b) a
single disk failure on the secondary at the same time.

Let me strengthen that--I don't feel comfortable unless I'm able to
survive a single disk failure on the primary and complete loss of the
secondary (say by power supply failure), because a double failure that
starts that way is a lot more likely than you might think.  Especially
with how awful hard drives are nowadays.
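
To make the double-failure argument concrete, here is a small Python
sketch comparing a two-server setup built on RAID0 with one built on
RAID1. It assumes independent failures, a two-disk array per server, and
made-up probabilities for a disk (p_disk) and for everything else in the
box (p_other); real failures are often correlated, which makes the RAID0
case look worse still.

```python
def array_loss(p_disk: float, level: str) -> float:
    """Probability a two-disk array loses its data."""
    if level == "raid0":
        return 1 - (1 - p_disk) ** 2  # either disk failing is fatal
    if level == "raid1":
        return p_disk ** 2            # both disks must fail
    raise ValueError(f"unsupported level: {level}")


def server_loss(p_disk: float, p_other: float, level: str) -> float:
    """Server is out if its array is lost or another component fails."""
    return 1 - (1 - array_loss(p_disk, level)) * (1 - p_other)


def both_servers_lost(p_disk: float, p_other: float, level: str) -> float:
    """Both replicas out in the same period, assuming independence."""
    return server_loss(p_disk, p_other, level) ** 2


p_disk, p_other = 0.05, 0.02  # assumed values, purely illustrative
for level in ("raid0", "raid1"):
    print(level, f"{both_servers_lost(p_disk, p_other, level):.6f}")
```

With disks as the dominant failure source, the choice of RAID level on
each server drives most of the difference in the combined figure.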

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: Possible Redundancy/Performance Solution

From

Dennis Muhlestein

Date:

06 May 2008, 18:59:50

Greg Smith wrote:
> On Tue, 6 May 2008, Dennis Muhlestein wrote:
>
> Since disks are by far the most likely thing to fail, I think it would
> be bad planning to switch to a design that doubles the chance of a disk
> failure taking out the server just because you're adding some
> server-level redundancy.  Anybody who's been in this business for a
> while will tell you that seemingly improbable double failures happen,
> and if I were you I'd want a plan that survived a) a single disk failure
> on the primary and b) a single disk failure on the secondary at the same
> time.
>
> Let me strengthen that--I don't feel comfortable unless I'm able to
> survive a single disk failure on the primary and complete loss of the
> secondary (say by power supply failure), because a double failure that
> starts that way is a lot more likely than you might think.  Especially
> with how awful hard drives are nowadays.

Those are good points.  So you'd go ahead and add the pgpool in front
(or another redundancy approach), but then use raid1, 5 or perhaps 10 on
each server?

-Dennis

Re: Possible Redundancy/Performance Solution

From

"Scott Marlowe"

Date:

06 May 2008, 22:58:21

On Tue, May 6, 2008 at 3:39 PM, Dennis Muhlestein
<djmuhlestein@gmail.com> wrote:

>  Those are good points.  So you'd go ahead and add the pgpool in front (or
> another redundancy approach), but then use raid1, 5 or perhaps 10 on each
> server?

That's what I'd do.  Specifically RAID10 for small to medium drive sets
used for transactional stuff, and RAID6 for very large reporting
databases that are mostly read.

Re: Possible Redundancy/Performance Solution

From

Greg Smith

Date:

06 May 2008, 23:37:17

On Tue, 6 May 2008, Dennis Muhlestein wrote:

> Those are good points.  So you'd go ahead and add the pgpool in front (or
> another redundancy approach), but then use raid1, 5 or perhaps 10 on each
> server?

Right.  I don't advise using the fact that you've got some sort of
replication going as an excuse to reduce the reliability of individual
systems, particularly in the area of disks (unless you're really creating
a much larger number of replicas than 2).

RAID5 can be problematic compared to other RAID setups when you are doing
write-heavy scenarios of small blocks, and it should be avoided for
database use.  You can find stories on this subject in the archives here
and some of the papers at http://www.baarf.com/ go over why; "Is RAID 5
Really a Bargain?" is the one I like best.

If you were thinking about 4 or more disks, there are a number of ways to
distribute those:

1) RAID1+0 to make one big volume
2) RAID1 for OS/apps/etc, RAID1 for database
3) RAID1 for OS+xlog, RAID1 for database
4) RAID1 for OS+popular tables, RAID1 for rest of database

Exactly which of these splits is best depends on your application and the
tradeoffs important to you, but any of these should improve performance
and reliability over what you're doing now.  I personally tend to create
two separate distinct volumes rather than using any striping here, create
a tablespace or three right from the start, and then manage the underlying
mapping to disk with symbolic links so I can shift the allocation around.
That does require you have a steady hand and good nerves for when you
screw up, so I wouldn't recommend that to everyone.
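
One possible reading of the symbolic-link part, sketched in Python: the
tablespace location is a link, and moving it to another volume means
swapping where the link points. The paths and the repoint_symlink helper
are hypothetical, and the sketch runs in a temporary directory. On a real
system the database would need to be stopped and the data copied to the
new volume before the link is changed; none of that is shown here.

```python
import os
import tempfile


def repoint_symlink(link_path: str, new_target: str) -> None:
    """Replace link_path with a symlink to new_target in a single rename."""
    tmp_link = link_path + ".tmp"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)           # clear a leftover from an earlier attempt
    os.symlink(new_target, tmp_link)  # build the new link under a temporary name
    os.replace(tmp_link, link_path)   # rename it over the old link


with tempfile.TemporaryDirectory() as root:
    # Two stand-in "volumes", each holding a directory for the tablespace.
    vol_a = os.path.join(root, "vol_a", "ts_hot")
    vol_b = os.path.join(root, "vol_b", "ts_hot")
    os.makedirs(vol_a)
    os.makedirs(vol_b)

    link = os.path.join(root, "ts_hot")
    os.symlink(vol_a, link)           # tablespace initially lives on volume A
    repoint_symlink(link, vol_b)      # shift the allocation to volume B
    print(os.readlink(link) == vol_b)
```

Building the new link under a temporary name and renaming it over the old
one means there is never a moment when the path does not exist.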

As you get more disks it gets less practical to handle things this way,
and it becomes increasingly sensible to just make one big array out of
them and stop worrying about it.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: Possible Redundancy/Performance Solution

From

Dennis Muhlestein

Date:

07 May 2008, 13:00:07

>
> 1) RAID1+0 to make one big volume
> 2) RAID1 for OS/apps/etc, RAID1 for database
> 3) RAID1 for OS+xlog, RAID1 for database
> 4) RAID1 for OS+popular tables, RAID1 for rest of database

Lots of good info, thanks for all the replies.  It seems to me then,
that the speed increase you'd get from raid0 is not worth the downtime
risk, even when you have multiple servers.  I'll start pricing things
out and see what options we have.

Thanks again,
Dennis