Thread: Postgresql Hardware - Recommendations

Postgresql Hardware - Recommendations

From
Date:
Hello,

My company has decided to migrate our Oracle database to postgresql8. We
will aquire a new server for this, and would very much appreciate your
advice.

NOTE: The applications accessing the database are developed and
maintained externally, and unfortunately, the developers have not yet
given us detailed information on their requirements. The only info I can
give so far is that the database size is about 60GB, and that it will be
frequently accessed by multiple users (about 100 will be connected
during business hours). The applications accessing the database are
mostly reporting tools.

I know that the performance question will ultimately boil down to "it
depends what you want to do with it", but at the moment I'm very much
interested if there are any general issues we should look out for.

The questions we are asking us now are:

1) Intel or AMD (or alternate Platform)
Are we better of with Xeons or Opterons? Should we consider the IBM
OpenPower platform?

2) CPUs vs cache
Would you rather have more CPUs or more cache? Eg: 4x Xeon 1MB vs 2x
Xeon 8MB

3) CPUs vs Memory
Would you rather have 4x CPUs and 8GB of memory, or 2x CPUs with 16GB of
memory?

Thanks in advance for all your replies!

Best Regards,
Christian Kastner

Re: Postgresql Hardware - Recommendations

From
"J. Andrew Rogers"
Date:


On 9/5/05 6:50 AM, "Christian.Kastner@gutmann.at"
<Christian.Kastner@gutmann.at> wrote:
> The questions we are asking us now are:
>
> 1) Intel or AMD (or alternate Platform)
> Are we better of with Xeons or Opterons? Should we consider the IBM
> OpenPower platform?


Opteron spanks Xeon for database loads.  Advantage AMD, and you generally
won't have to spend much extra money for the privilege.  I've never used
Postgres on the IBM OpenPower platform, but I would expect that it would
perform quite well, certainly better than the Xeons and probably competitive
with the Opterons in many respects -- I am not sufficiently knowledgeable to
make a definitive recommendation.


> 2) CPUs vs cache
> Would you rather have more CPUs or more cache? Eg: 4x Xeon 1MB vs 2x
> Xeon 8MB


I would expect that cache sizes are relatively unimportant compared to
number of processors, but it would depend on the specifics of your load.
Cache coherence is a significant issue for high concurrency database
applications, and a few megabytes of cache here and there will likely make
little difference for a 60GB database.  Databases spend most of their time
playing in main memory, not in cache.  The biggest advantage I can see to
bigger cache would be connection scaling, in which case you'll probably buy
more mileage with more processors.

There are a lot of architecture dependencies here.  Xeons scale badly to 4
processors, Opterons scale just fine.



> 3) CPUs vs Memory
> Would you rather have 4x CPUs and 8GB of memory, or 2x CPUs with 16GB of
> memory?


Uh, for what purpose?  CPU and memory are not fungible, so how you
distribute them depends very much on your application.  You can never have
too much memory for a large database, but having extra processors on a
scalable architecture is pretty nice too.  What they both buy you is not
really related.

The amount of memory you need is determined by the size of your cache-able
working set and the nature of your queries.  Spend whatever money is left on
the processors; if your database spends all its time waiting for disks, no
quantity of processors will help you unless you are doing a lot of math on
the results.


YMMV, as always.  Recommendations more specific than "Opterons rule, Xeons
suck" depend greatly on what you plan on doing with the database.


Cheers,

J. Andrew Rogers




Re: Postgresql Hardware - Recommendations

From
Matthew Nuzum
Date:
On 9/5/05, Christian.Kastner@gutmann.at <Christian.Kastner@gutmann.at> wrote:
> ... The only info I can
> give so far is that the database size is about 60GB, and that it will be
> frequently accessed by multiple users (about 100 will be connected
> during business hours). The applications accessing the database are
> mostly reporting tools.

Optimizing hardware for mostly selects is different than optimizing
for lots of inserts. You will get good responses from this list if you
can give a little more details. Here are some questions:
How do you get your data into the db? Do you do bullk loads at
periodic intervals during the day? Do you do frequent updates/inserts?

You say reporting, do you use many stored procedures and calculations
on the server side? I've used some reporting apps that simply grab
tons of data from the server and then process it on the client side
(ODBC apps seem to do this), while other applications formulate the
queries and use stored procedures in order to transfer little data.

Of your 60GB, how much of that is active? Does your budget allow you
to buy enough RAM to get your active data into the disk cache? For
reporting, this *might* be your biggest win.

Here are some scenarios:
S1: Bulk uploads once or twice daily of about 250 MB of data. Few
inserts and updates during the day (1-2%). Reporting is largely done
on data from the last 5 business days. In this case you have < 2GB of
active data and your disk cache will hold all of your active data in
RAM (provided your db structure is diskcache friendly). An example of
this I have experienced is a sales application that queries current
inventory. Telephone agents queried, quieried, queried the
instock-inventory.

S2: Same as above but reporting is largely done on data covering 200+
business days. Its doubtful that you will get 50GB of RAM in your
server, you need to focus on disk speed. An example of this I have
experienced was an application that looked at sales trends and
performed commission calculations and projected sales forecasts.

S3: Lots of inserts/updates throughout the day (15 - 25%) - you need
to focus on disk speed. The content management system my employer
develops fits this model.

> 3) CPUs vs Memory
> Would you rather have 4x CPUs and 8GB of memory, or 2x CPUs with 16GB of
> memory?

Very hard to say without knowing your application. I have limited
experience but what I've found is that applications that support
multiple db architectures do not fully utilize the database server and
CPU utilization is low. Disk and network i/o is high. I don't know if
your application supports multiple backeneds, but chances are good
your biggest wins will come from RAM, disk and network investments.

--
Matthew Nuzum
www.bearfruit.org

Re: Postgresql Hardware - Recommendations

From
Date:
Andrew, Matthew, thanks to you both four your advice. I'm sorry I couldn't provide more details to the situation, I
willpost again as soon I get them.  

Time to share your insights with the colleagues :)

Best Regards,
Chris

-----Ursprüngliche Nachricht-----
Von: Paul Ramsey [mailto:pramsey@refractions.net]
Gesendet: Dienstag, 06. September 2005 06:13
An: Kastner Christian; Kastner Christian
Betreff: Re: [PERFORM] Postgresql Hardware - Recommendations

For a database, I would almost always prioritize:
- I/O
- RAM
- CPU

So, fast drives (SCSI 10000RPM or better in a RAID configuration,
more spindles == more throughput), then memory (more memory == more
of the database off disk in cache == faster response), then more CPU
(more concurrent request handling).

Paul

On 5-Sep-05, at 6:50 AM, <Christian.Kastner@gutmann.at>
<Christian.Kastner@gutmann.at> wrote:

> Hello,
>
> My company has decided to migrate our Oracle database to
> postgresql8. We
> will aquire a new server for this, and would very much appreciate your
> advice.
>
> NOTE: The applications accessing the database are developed and
> maintained externally, and unfortunately, the developers have not yet
> given us detailed information on their requirements. The only info
> I can
> give so far is that the database size is about 60GB, and that it
> will be
> frequently accessed by multiple users (about 100 will be connected
> during business hours). The applications accessing the database are
> mostly reporting tools.
>
> I know that the performance question will ultimately boil down to "it
> depends what you want to do with it", but at the moment I'm very much
> interested if there are any general issues we should look out for.
>
> The questions we are asking us now are:
>
> 1) Intel or AMD (or alternate Platform)
> Are we better of with Xeons or Opterons? Should we consider the IBM
> OpenPower platform?
>
> 2) CPUs vs cache
> Would you rather have more CPUs or more cache? Eg: 4x Xeon 1MB vs 2x
> Xeon 8MB
>
> 3) CPUs vs Memory
> Would you rather have 4x CPUs and 8GB of memory, or 2x CPUs with
> 16GB of
> memory?
>
> Thanks in advance for all your replies!
>
> Best Regards,
> Christian Kastner
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
>
>                http://www.postgresql.org/docs/faq
>