Re: Hardware for PostgreSQL - Mailing list pgsql-performance

From Arjen van der Meijden
Subject Re: Hardware for PostgreSQL
Date
Msg-id 4728D468.30204@tweakers.net
Whole thread Raw
In response to Hardware for PostgreSQL  (Ketema <ketema@gmail.com>)
List pgsql-performance
On 31-10-2007 17:45 Ketema wrote:
> I understand query tuning and table design play a large role in
> performance, but taking that factor away
> and focusing on just hardware, what is the best hardware to get for Pg
> to work at the highest level
> (meaning speed at returning results)?

It really depends on your budget and workload. Will it be read-heavy or
write-heavy? How large will the database be? Are those concurrent users
actively executing queries or is the actual concurrent query load lower
(it normally is)?
You should probably also try to estimate the amount of concurrently
executed queries and how heavy those queries are, as that is normally
more important as a performance measure. And normally its much less than
the amount of concurrently connected users.

> How does pg utilize multiple processors?  The more the better?
> Are queries spread across multiple processors?

It forks a process for a new connection and leaves the multi-cpu
scheduling to the OS. It does not spread a single query across multiple
cpu's. But with many concurrent users, you normally don't want or need
that anyway, it would mainly add extra stress to the scheduling of your
operating system.

> Is Pg 64 bit?
It can be compiled 64-bit and is available pre-compiled as 64-bits as well.

> If so what processors are recommended?

I think the x86-class cpu's deliver the most bang for buck and are the
best tested with postgres. Both AMD and Intel cpu's are pretty good, but
I think currently a system with two intel quad core cpus is in a very
good price/performance-point. Obviously you'll need to match the cpus to
your load, you may need more cpu-cores.

> Its pretty old (2003) but is it still accurate?  if this statement is
> accurate how would it affect connection pooling software like pg_pool?

It just keeps the process alive as long as the connection isn't closed,
nothing fancy or worrisome going on there. That's just the behavior I'd
expect at the connection pool-level.

> RAM?  The more the merrier right? Understanding shmmax and the pg
> config file parameters for shared mem has to be adjusted to use it.

More is better, but don't waste your money on it if you don't need it,
if your (the active part of your) database is smaller than the RAM,
increasing it doesn't do that much. I would be especially careful with
configurations that require those very expensive 4GB-modules.

> Disks?  standard Raid rules right?  1 for safety 5 for best mix of
> performance and safety?

Make sure you have a battery backed controller (or multiple), but you
should consider raid 10 if you have many writes and raid 5 or 50 if you
have a read-heavy environment. There are also people reporting that it's
faster to actually build several raid 1's and use the OS to combine them
to a raid 10.
Be careful with the amount of disks, in performance terms you're likely
better off with 16x 73GB than with 8x 146GB

> Any preference of SCSI over SATA? What about using a High speed (fibre
> channel) mass storage device?

I'd consider only SAS (serial attached scsi, the successor of scsi) for
a relatively small high performance storage array. Fibre channel is so
much more expensive, that you'll likely get much less performance for
the same amount of money. And I'd only use sata in such an environment
if the amount of storage, not its performance, is the main metric. I.e.
for file storage and backups.

Best regards,

Arjen

pgsql-performance by date:

Previous
From: Magnus Hagander
Date:
Subject: Re: Hardware for PostgreSQL
Next
From: Pablo Alcaraz
Date:
Subject: Re: tables with 300+ partitions