Re: Hardware for PostgreSQL - Mailing list pgsql-performance

From: Magnus Hagander
Subject: Re: Hardware for PostgreSQL
Msg-id: 4728D70B.5090407@hagander.net
In response to: Hardware for PostgreSQL (Ketema <ketema@gmail.com>)
List: pgsql-performance

Ketema wrote:
> I am trying to build a very robust DB server that will support 1000+
> concurrent users (have already seen a max of 237, no pooling being
> used).  I have read so many articles now that I am just saturated.  I
> have a general idea but would like feedback from others.
>
> I understand query tuning and table design play a large role in
> performance, but taking that factor away
> and focusing on just hardware, what is the best hardware to get for Pg
> to work at the highest level
> (meaning speed at returning results)?
>
> How does pg utilize multiple processors?  The more the better?

If you have many simultaneous queries, it will use more processors. If
you run just a single query at a time, it'll only use one CPU.

> Are queries spread across multiple processors?

No, not a single query. Max one CPU per query.


> Is Pg 64 bit?

Yes, if your OS and platform are 64-bit.

> If so what processors are recommended?

AFAIK, the latest Intel and AMD CPUs are all good, and fairly equal. Make
sure you turn hyperthreading off. Multicore is fine, but not HT.


> I read this : http://www.postgresql.org/files/documentation/books/aw_pgsql/hw_performance/node12.html
> POSTGRESQL uses a multi-process model, meaning each database
> connection has its own Unix process. Because of this, all multi-cpu
> operating systems can spread multiple database connections among the
> available CPUs. However, if only a single database connection is
> active, it can only use one CPU. POSTGRESQL does not use multi-
> threading to allow a single process to use multiple CPUs.
>
> It's pretty old (2003) but is it still accurate?

Yes.


> if this statement is
> accurate how would it affect connection pooling software like pg_pool?

Not at all, really. What matters is how many queries are running at
once, not how many connections are open. There are other advantages to
pg_pool and friends, such as not having to fork new processes so often,
but pooling doesn't affect the spread over CPUs.
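
To put the one-CPU-per-query point in concrete terms, here is a rough
sketch in Python. It is only a model of the behaviour described above,
not anything PostgreSQL exposes: the function name and the "active" /
"idle" labels are made up for illustration.

```python
from typing import Iterable


def busy_cpus(connection_states: Iterable[str], cores: int) -> int:
    """Upper bound on cores in use when each running query gets at most one core.

    connection_states holds one label per open connection. The labels
    "active" and "idle" are illustrative assumptions, not a real API.
    """
    if cores < 0:
        raise ValueError("cores must be non-negative")
    # Only connections that are currently running a query can occupy a CPU.
    running = sum(1 for state in connection_states if state == "active")
    # Each running query uses at most one core, so the box caps the total.
    return min(running, cores)


# 237 open connections, only 5 of them running a query, on a 16-core box
states = ["active"] * 5 + ["idle"] * 232
print(busy_cpus(states, cores=16))
```

Idle connections drop out of the count entirely, which is why a pooler
changes the number of backend processes but not how the running queries
spread over the CPUs.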


> RAM?  The more the merrier right? Understanding shmmax and the pg
> config file parameters for shared mem has to be adjusted to use it.

Yes. As long as your database doesn't fit entirely in RAM with room left
over for sorting and such, more RAM will make things faster in just about
every case.
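
As a back-of-the-envelope check of that rule, something like the sketch
below would do. The function name and the default reserve for sorting
are assumptions picked for illustration; measure your own workload
before trusting any particular number.

```python
def more_ram_likely_helps(db_size_gb: float, ram_gb: float,
                          work_reserve_gb: float = 4.0) -> bool:
    """Rule of thumb: more RAM helps until the database plus working space fits.

    work_reserve_gb is a hypothetical allowance for sorts and similar
    per-query memory; the default is an assumption, not a recommendation.
    """
    if min(db_size_gb, ram_gb, work_reserve_gb) < 0:
        raise ValueError("sizes must be non-negative")
    # True while the data set plus working space still exceeds physical RAM.
    return db_size_gb + work_reserve_gb > ram_gb


print(more_ram_likely_helps(db_size_gb=800, ram_gb=32))  # database far larger than RAM
print(more_ram_likely_helps(db_size_gb=20, ram_gb=32))   # database fits with room left over
```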


> Disks?  standard Raid rules right?  1 for safety 5 for best mix of
> performance and safety?

RAID-10 for best mix of performance and safety. RAID-5 can give you a
decent compromise between cost and performance/safety.

And get a RAID controller with lots of cache memory with battery backup.
This is *very* important.

And remember: you need lots of spindles (disks) if you want good write
performance, regardless of which RAID level you use.
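
To make the RAID-10 vs RAID-5 trade-off and the spindle count a bit more
tangible, here is a small estimate in Python. It uses the usual
textbook rule of thumb (2 disk I/Os per small random write on RAID-10,
4 on RAID-5) and ignores controller cache entirely, so treat the output
as a rough comparison and not a benchmark. The function name and the
example disk figures are assumptions.

```python
from typing import Tuple


def raid_estimate(level: str, disks: int, disk_gb: float,
                  disk_iops: float) -> Tuple[float, float]:
    """Return (usable_gb, random_write_iops) for a RAID-10 or RAID-5 array.

    Rule-of-thumb model only: no controller cache, identical disks,
    small random writes.
    """
    if level == "10":
        if disks < 4 or disks % 2:
            raise ValueError("RAID-10 needs an even number of disks, at least 4")
        usable_gb = disks * disk_gb / 2      # every block is mirrored
        write_penalty = 2                    # one write per mirror side
    elif level == "5":
        if disks < 3:
            raise ValueError("RAID-5 needs at least 3 disks")
        usable_gb = (disks - 1) * disk_gb    # one disk's worth of parity
        write_penalty = 4                    # read data+parity, write data+parity
    else:
        raise ValueError("only levels '10' and '5' are modelled here")
    # More spindles raise write throughput at either level.
    write_iops = disks * disk_iops / write_penalty
    return usable_gb, write_iops


# Hypothetical example: 8 disks of 300 GB, 150 random IOPS each
for level in ("10", "5"):
    print(level, raid_estimate(level, disks=8, disk_gb=300, disk_iops=150))
```

In this model RAID-5 gives more usable space from the same disks, while
RAID-10 gets about twice the random write rate, and both scale with the
number of spindles.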


> Any preference of SCSI over SATA? What about using a High speed (fibre
> channel) mass storage device?

Absolutely SCSI or SAS, and not SATA. I see no point in plain FC
disks, but if you get a high-end SAN solution with FC between the host
and the controllers, that's what you're going to be using. There are
things to be said for both DAS and SAN - they both have their
advantages.


> Who has built the biggest baddest Pg server out there and what do you
> use?

Probably not me :-) The biggest one I've set up is 16 cores, 32GB RAM
and no more than 800GB disk... But it's very fast :-)

Oh, and I'd absolutely recommend you go for brand-name hardware, like IBM
or HP (or Sun or something if you don't want to go down the Intel path).

//Magnus
