Re: Hardware estimation - Mailing list pgsql-general

From scott.marlowe
Subject Re: Hardware estimation
Date
Msg-id Pine.LNX.4.33.0211071705150.9047-100000@css120.ihs.com
In response to Re: Hardware estimation  ("Steve Wolfe" <nw@codon.com>)
Responses Re: Hardware estimation  (Lamar Owen <lamar.owen@wgcr.org>)
List pgsql-general
On Thu, 7 Nov 2002, Steve Wolfe wrote:

> > This is also in a rack mount box with dual Hot swappable power supplies,
> > it can map out bad memory on the fly automatically, and can provide REAL
> > 5 9 reliability.  32 MEGS of L3 cache, 6 megs of L2 cache, PER PROCESSOR.
> > A backend connection with GIGs of data bandwidth per second.  An Intel
> > based box isn't even close to being in the same class.
>
>   And I didn't say that it was in the same class.  You said that it was
> about the same price, and I refuted that.  That's all I was saying.  I'm
> not a zealot or a devotee of any one class of server, by any means!

Hold on there. I didn't mean to sound snippy when I asked that.  It's
pretty common for folks to think of intel hardware being equivalent when
it has similar clock rates, and I was only pointing out that there is more
to a server than numbers.  That's all.  For the amount of performance
you're getting, you'd easily spend that much on a Xeon and still not catch
up.

> > Are these white box prices, or from someone like IBM or Dell?  The best
> > price I've seen with that configuration is about $20k. Is that ECC DDR
> > memory?
>
>   White-box, and yes, it's registered/ECC DDR.

Cool.  We use a local builder for all our intel boxen, and get much better
deals than we would from the big boys.  Of course, we had to fight tooth
and nail at first to get them to build quality units (they had really poor
ESD procedures in place, and our return rate was about 25%).

> > Keep in mind, that out of that 8 gigs of ram, only 1.5 or so is gonna be
> > available for Postgresql.  The rest will be system cache.  On a 64 bit
> > machine you can give as much as you want to the database.
>
>    Actually, any one process will only be able to use the ~1.5 gigs - and
> PG forks off new processes for each backend, so you are able to make use
> of all 8 gigs - although I do admit that having a larger address space can
> be advantageous.

Sorry, but that is incorrect.  Postgresql uses a single large memory
segment for all its shared buffers.  While sorts could use the extra
memory, the database itself is limited to the maximum single largest
shared memory segment you can allocate, and on 32 bit intel, that is
something under 2 gig with linux and BSD both.
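
To put rough numbers on that limit, here is a minimal sketch of the arithmetic.
It assumes the default 8 kB PostgreSQL block size; the 2 GiB segment cap and
the 64 MiB overhead figure are illustrative assumptions, not measured values,
and the function names are made up for this example.

```python
# Rough arithmetic only: how many shared buffers fit in one shared memory
# segment. Nothing here talks to a real PostgreSQL server.

BLOCK_SIZE = 8192  # default PostgreSQL block size in bytes (assumed)


def max_shared_buffers(segment_bytes, overhead_bytes=0):
    """Return how many BLOCK_SIZE buffers fit in one segment.

    overhead_bytes is a hypothetical allowance for everything else that
    has to live in the same segment.
    """
    usable = segment_bytes - overhead_bytes
    if usable <= 0:
        return 0
    return usable // BLOCK_SIZE


def buffer_cache_bytes(n_buffers):
    """Return the size in bytes of a buffer cache of n_buffers blocks."""
    return n_buffers * BLOCK_SIZE


if __name__ == "__main__":
    two_gib = 2 * 1024**3       # assumed ceiling on 32 bit intel
    overhead = 64 * 1024**2     # hypothetical overhead, for illustration
    n = max_shared_buffers(two_gib, overhead)
    print("buffers:", n)
    print("cache size in GiB:", buffer_cache_bytes(n) / 1024**3)
```

Whatever the exact overhead turns out to be, the buffer count is capped by the
segment size, not by how much RAM is in the box.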

> > If you do wanna look at 64 bit systems that are Intel based then Dell
> > sells a quad Itanium for a fair price, but by the time you've upped it
> > to 8 gigs and a pair of 36 gig hard drives, and subtracted their gold
> > star on site support, the price is $46k.  For 4 800 MHz CPUs.
> >
> > A Dell quad Xeon 1.6Gig with 8 gig ram is $29k.  IBM is about $20k.
>
>    Several years ago, I was in the market for a quad P3 Xeon, and prices
> from the "big names" were about the same.  I built one based on a
> Supermicro chassis and motherboard for something like $12,000, including a
> fairly decent SCSI RAID array.

Flashback.  Five years ago when I first started working here (ihs), I
built our PDC/BDC pair on a Supermicro Dual PPro-200 motherboard.  Having
long since moved on to web development and such, I never expected to see
them again.

Then, walking down the hall past the equipment cage, there they were.
They were being retired.  The bigger one of the two is now my test server
happily running along under my desk.  Running Linux now instead of Windows
NT.

Supermicro makes kick ass mobos, IMnsvHO

> I've worked with Compaq's servers before,
> and haven't seen much advantage.  Yes, they have all of the fancy features
> that management thinks are necessary for uptime, but when the rubber meets
> the road, the machine I built has run for over two years with absolutely
> *NO* downtime other than a few planned shutdowns for planned hardware or
> kernel upgrades.  Eventually, it was demoted from production DB server to
> developmental server, simply because we needed more horsepower than it
> could provide.  (A dual Athlon filled the spot nicely.)

But I wasn't really talking about the advantages of Compaq or Dell Intel
boxen, I was mainly pointing out the advantage of the bigger iron RISC
boxen running unix or linux.  There, it gets kinda hard to build your own,
but not impossible.  There are some companies that sell Dual USparc clone
motherboards in ATX form factor.

But the reason I asked if it was white box was that I was looking for a
fair dollar comparison of the RISC versus Intel.  Both can be had cheaper
than what I was quoting, but not from a big name.  Plus most companies
usually have some silly policy about buying everything from one or two
companies, so I'd bet the guy asking the question can buy any machine he
wants to, as long as it says IBM on the front. :-)

And, fwiw, I hate compaq boxes.  Dell I can live with, but Compaq gives me
stomach ulcers.  Everything is proprietary, and anything you try to do is
a pain with those things.

>    It should also be noted that simply going to a 64-bit architecture
> isn't a magic cure-all.  Right after I built the machine I just spoke of,
> a Compaq rep tried to win us over, and loaned us a $25,000 dual-CPU Alpha
> for a week.  I ran some PostgreSQL stress tests on it with some of our
> production data, and the Xeon handily kept up with or beat the Alpha, at
> half of the price.  Now if I was doing some raytracing, I'm sure that the
> outcome would have been very different, but for database work, it just
> didn't cut it.

The alpha was one of the very first chips to really focus on floating
point over integer operation.  When it came out we were using HP K class
machines to build database servers (running O****e) with very fast integer
performance.  The early K class machines literally stomped the Alphas into
the ground on database performance.  I think they had 4 integer processors
and one FPU back then.  This was especially true when under simo load.
Say a hundred or so database users at a time.  The intel boxes then were
in the 200 to 300 MHz range, i.e. just after the PPro and at the
beginning of the PII range.  The HP was something like 150MHz.  The intel
boxes were actually a little faster than the 150MHz alphas back then, but
no match for the HPs.  I think we had an RS6000 too, and it was close to
the HPs, but the version of AIX on it was just horrible to administer
(that from the guys who adminned it, I never had to actually touch that
box.)

Modern 64 bit CPUs like the USparc III and Power4 are very fast at both FP
and integer operations.  The big advantage is addressable memory and VERY
large L2/L3 caches.

>    In the end, it's the same argument that gets hashed over in various
> forms:  When it comes to commodity vs. specialized hardware, commodity
> hardware is always going to be a cheaper way to get things done within the
> realm of its capabilities, but you eventually come to a performance level
> where commodity hardware just won't cut it any more.  That's where the
> specialized hardware comes in, be it a high-end server like the Power4, a
> high-end router for an OC192, or a CAD/CAM graphics card.

Yep.

Now, if Postgresql could run in a load balanced cluster, I'd go dual
athlons all the way.  Racks full of them.

But if you're handling a database of 200 Gigabytes like the original
poster was looking at, and you're stuck running it on one machine, fast
IO AND a huge buffer memory really help, and 32 bit intel with postgresql
really does have a serious limit on maximum buffer memory that 64 bit
architectures don't suffer from.

So, even a 2 way 400 MHz box (say a two year old USparc) with 2Megs or
more of L2/3 cache, that can hold say 8 gigs of ram and let postgresql use
most of it for buffer is likely a better choice than a quad Xeon 1.5GHz,
if it can hold more of the data you're accessing in memory and keep you
away from disk IO.  This is especially true if you're gonna have a high
simo load.
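
A back-of-the-envelope sketch of why the bigger buffer can win follows.  It
assumes accesses are spread uniformly over the working set (real workloads
are skewed, so treat this as a crude model), and the memory and disk costs
are made-up placeholder figures, not benchmarks of any of the machines above.

```python
# Crude model: average cost of one block access as a mix of buffer hits
# and disk reads. All numbers used with it are illustrative assumptions.


def hit_ratio_uniform(buffer_bytes, working_set_bytes):
    """Fraction of accesses served from memory, assuming uniform access."""
    if working_set_bytes <= 0:
        raise ValueError("working_set_bytes must be positive")
    return min(1.0, buffer_bytes / working_set_bytes)


def effective_access_time(hit_ratio, mem_cost, disk_cost):
    """Weighted average cost of one access, in the units of the inputs."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * mem_cost + (1.0 - hit_ratio) * disk_cost


if __name__ == "__main__":
    GIB = 1024**3
    working_set = 10 * GIB            # hypothetical hot part of the data
    mem_ms, disk_ms = 0.001, 10.0     # placeholder costs per access, in ms

    for label, buffers in (("32 bit, 1.5 GiB", 1.5 * GIB),
                           ("64 bit, 6 GiB", 6 * GIB)):
        ratio = hit_ratio_uniform(buffers, working_set)
        cost = effective_access_time(ratio, mem_ms, disk_ms)
        print(label, "hit ratio:", ratio, "avg ms per access:", cost)
```

Because a disk read costs so much more than a memory hit, the average is
dominated by the miss fraction, which is why holding more of the data in
buffer can matter more than raw clock speed.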

