Re: Reliability recommendations - Mailing list pgsql-performance

From Mark Lewis
Subject Re: Reliability recommendations
Date
Msg-id 1140024736.9076.167.camel@archimedes
In response to Re: Reliability recommendations  ("Craig A. James" <cjames@modgraph-usa.com>)
List pgsql-performance
Machine 1: $2000
Machine 2: $2000
Machine 3: $2000

Knowing how to rig them together and maintain them in a fully fault-
tolerant way: priceless.


(Sorry for the off-topic post, I couldn't resist).

-- Mark Lewis

On Wed, 2006-02-15 at 09:19 -0800, Craig A. James wrote:
> Jeremy Haile wrote:
> > We are a small company looking to put together the most cost effective
> > solution for our production database environment.  Currently in
> > production Postgres 8.1 is running on this machine:
> >
> > Dell 2850
> > 2 x 3.0 GHz Xeon 800 MHz FSB 2 MB Cache
> > 4 GB DDR2 400 MHz
> > 2 x 73 GB 10K SCSI RAID 1 (for xlog and OS)
> > 4 x 146 GB 10K SCSI RAID 10 (for postgres data)
> > Perc4ei controller
> >
> > ... I sent our scenario to our sales team at Dell and they came back with
> > all manner of SAN, DAS, and configuration costing as much as $50k.
>
> Given what you've told us, a $50K machine is not appropriate.
>
> Instead, think about a simple system with several clones of the database and a load-balancing web server, even if one machine could handle your load.  If a machine goes down, the load balancer automatically switches to the other.
>
> Look at the MTBF figures of two hypothetical machines:
>
>  Machine 1: Costs $2,000, MTBF of 2 years, takes two days to fix on average.
>  Machine 2: Costs $50,000, MTBF of 100 years (!), takes one hour to fix on average.
>
> Now go out and buy three of the $2,000 machines.  Use a load-balancer front end web server that can send requests round-robin fashion to a "server farm".  Clone your database.  In fact, clone the load-balancer too so that all three machines have all software and databases installed.  Call these A, B, and C machines.
>
> At any given time, your Machine A is your web front end, serving requests to databases on A, B and C.  If B or C goes down, no problem - the system keeps running.  If A goes down, you switch the IP address of B or C and make it your web front end, and you're back in business in a few minutes.
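
A minimal sketch of the round-robin dispatch with failover described above, assuming the front end keeps a simple in-memory list of back ends. The class and method names (RoundRobinPool, mark_down, next_backend) are hypothetical and chosen only for illustration; the sketch does not cover health checks or the IP switch for the front end itself.

```python
from itertools import cycle


class RoundRobinPool:
    """Hand out back ends in round-robin order, skipping any marked down."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.down = set()
        self._ring = cycle(self.backends)

    def mark_down(self, name):
        # Hypothetical hook: called when a machine stops responding.
        self.down.add(name)

    def mark_up(self, name):
        # Hypothetical hook: called once the machine has been repaired.
        self.down.discard(name)

    def next_backend(self):
        # Try each back end at most once per call so we never loop forever.
        for _ in range(len(self.backends)):
            candidate = next(self._ring)
            if candidate not in self.down:
                return candidate
        raise RuntimeError("all back ends are down")


pool = RoundRobinPool(["A", "B", "C"])
first_pass = [pool.next_backend() for _ in range(3)]     # A, B, C in turn
pool.mark_down("B")                                      # simulate B failing
after_failure = [pool.next_backend() for _ in range(4)]  # B is skipped
```

With B marked down, requests keep alternating between A and C, which is the "no problem - the system keeps running" case.
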
>
> Now compare the reliability -- in order for this system to be disabled, you'd have to have ALL THREE computers fail at the same time.  With the MTBF and repair time of two days, each machine has a 99.726% uptime.  The "MTBF", that is, the expected time until all three machines will fail simultaneously, is well over 100,000 years!  Of course, this is silly, machines don't last that long, but it illustrates the point:  Redundancy beats reliability (which is why RAID is so useful).
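
The per-machine uptime figure can be reproduced with a few lines of arithmetic. This is a rough sketch that assumes uptime is approximated as 1 - repair time / MTBF, a 365-day year, and fully independent failures; the variable names are illustrative. It stops at the probability that all three machines are down at once and does not try to derive the expected time to a total outage, which needs further modeling assumptions.

```python
MTBF_DAYS = 2 * 365   # Machine 1: mean time between failures of 2 years
REPAIR_DAYS = 2       # Machine 1: two days to fix on average

# Approximate fraction of time a single machine is up.
uptime = 1 - REPAIR_DAYS / MTBF_DAYS
uptime_percent = round(uptime * 100, 3)

# Fraction of time a single machine is down.
downtime = 1 - uptime

# Assuming the three machines fail independently, the fraction of time
# all three are down together is the product of the individual downtimes.
all_three_down = downtime ** 3

print(f"single-machine uptime: {uptime_percent}%")
print(f"all three down at once: {all_three_down:.3e}")
```

The independence assumption is the weak point: a shared power feed, switch, or bad software rollout can take out all three machines together, so real-world figures will be less favorable than this calculation suggests.
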
>
> All for $6,000.
>
> Craig
