Re: Need for speed - Mailing list pgsql-performance

From Ron
Subject Re: Need for speed
Msg-id 6.2.3.4.0.20050817140116.05d317b0@pop.earthlink.net
In response to Need for speed  (Ulrich Wisser <ulrich.wisser@relevanttraffic.se>)
Responses Re: Need for speed
List pgsql-performance
At 05:15 AM 8/17/2005, Ulrich Wisser wrote:
>Hello,
>
>thanks for all your suggestions.
>
>I can see that the Linux system is 90% waiting for disc io.

A clear indication that you need to improve your HD IO subsystem if possible.


>At that time all my queries are *very* slow.

To be more precise, your server performance at that point is
essentially equal to your HD IO subsystem performance.


>  My scsi raid controller and disc are already the fastest available.

Oh, REALLY?  This is the description of the system you gave us:

"We have a box with
Linux Fedora Core 3, Postgres 7.4.2
Intel(R) Pentium(R) 4 CPU 2.40GHz
2 scsi 76GB disks (15.000RPM, 2ms)"


This is far, Far, FAR from "the fastest available" in terms of SW,
OS, CPU host, _or_ HD subsystem.

The "fastest available" means
1= you should be running PostgreSQL 8.0.3
2= you should be running the latest stable 2.6 based kernel
3= you should be running an Opteron based server
4= Fibre Channel HDs are slightly higher performance than SCSI ones.
5= (and this is the big one) YOU NEED MORE SPINDLES AND A HIGHER END
RAID CONTROLLER.

Your description of your workload was:
"one of our services is click counting for on line advertising. We do
this by importing Apache log files every five minutes. This results
in a lot of insert and delete statements. At the same time our
customers shall be able to do on line reporting."

There are two issues here:
1= your primary usage is OLTP-like, but you are also expecting to do
reports against the same schema that is supporting your OLTP-like
usage.  Bad Idea.  Schemas that are optimized for reporting and other
data-mining-like operations are pessimal for OLTP-like applications
and vice versa.  You need two schemas: one optimized for lots of
inserts and deletes (OLTP-like), and one optimized for reporting
(data-mining like).

2= 2 spindles, even 15K rpm spindles, is minuscule.  Real enterprise
class RAID subsystems have at least 10-20x that many spindles,
usually split into 6-12 sets dedicated to different groups of tables
in the DB.  Putting xlog on its own dedicated spindles is just the
first step.
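To make point 1 concrete, here is a minimal sketch of the two-schema
idea.  It uses Python's built-in sqlite3 module only so that it runs
standalone; it is not PostgreSQL, and the table and function names
(clicks_raw, clicks_daily, roll_up) are hypothetical, not taken from
the original poster's system.  One narrow table takes the inserts, and
a separate summary table is what reports read.

```python
import sqlite3

def make_db():
    """Create an in-memory database with one write-side and one read-side table."""
    conn = sqlite3.connect(":memory:")
    # Write-optimized (OLTP-like): narrow rows, no secondary indexes.
    conn.execute("CREATE TABLE clicks_raw (ad_id INTEGER, clicked_at TEXT)")
    # Read-optimized (reporting): one pre-aggregated row per ad per day.
    conn.execute(
        "CREATE TABLE clicks_daily ("
        " ad_id INTEGER, day TEXT, clicks INTEGER,"
        " PRIMARY KEY (ad_id, day))"
    )
    return conn

def import_batch(conn, rows):
    """Append one batch of (ad_id, timestamp) rows, e.g. from a log import."""
    conn.executemany("INSERT INTO clicks_raw VALUES (?, ?)", rows)

def roll_up(conn):
    """Fold the raw rows into the daily summary, then empty the raw table."""
    with conn:  # both statements commit together or not at all
        conn.execute(
            """
            INSERT OR REPLACE INTO clicks_daily (ad_id, day, clicks)
            SELECT r.ad_id, r.day, r.n + COALESCE(d.clicks, 0)
            FROM (SELECT ad_id, substr(clicked_at, 1, 10) AS day, COUNT(*) AS n
                  FROM clicks_raw
                  GROUP BY ad_id, day) AS r
            LEFT JOIN clicks_daily AS d
                   ON d.ad_id = r.ad_id AND d.day = r.day
            """
        )
        conn.execute("DELETE FROM clicks_raw")

def daily_clicks(conn, ad_id, day):
    """Report query: touches only the small summary table."""
    row = conn.execute(
        "SELECT clicks FROM clicks_daily WHERE ad_id = ? AND day = ?",
        (ad_id, day),
    ).fetchone()
    return row[0] if row else 0
```

In a real deployment the two sides would more likely live in separate
schemas or even separate servers; the point of the sketch is only that
reports never scan the insert-heavy table.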

The absolute "top of the line" for RAID controllers is something
based on Fibre Channel from Xyratex (who make the RAID engines for
EMC and NetApp), Engenio (the enterprise division of LSI Logic who
sell mostly to IBM.  Apple has a server based on an Engenio card), or
Dot Hill (who bought Chaparral among others).  I suspect you can't
afford them even if they would do business with you.  The ante for a
FC-based RAID subsystem in this class is in the ~$32K to ~$128K
range, even if you buy direct from the actual RAID HW manufacturer
rather than an OEM like EMC, IBM, or NetApp who will 2x or 4x the
price.  OTOH, these subsystems will provide OLTP or OLTP-like DB apps
with performance that is head-and-shoulders better than anything else
to be found.  Numbers like 50K-200K IOPS.  You get what you pay for.
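For a sense of scale, here is a back-of-the-envelope estimate of
random IOPS per spindle, using the common rule of thumb that one
random IO costs an average seek plus half a rotation.  The 3.5 ms
average seek is my assumption for a 15K rpm drive, not a figure from
this thread, and the model ignores controller cache, queuing, and
sequential access, so treat the result as a rough bound only.

```python
import math

def rotational_latency_ms(rpm):
    """Average rotational latency: half of one revolution, in milliseconds."""
    return 0.5 * 60_000.0 / rpm

def spindle_iops(rpm, avg_seek_ms):
    """Rough random IOPS for one spindle: 1 / (seek + rotational latency)."""
    return 1000.0 / (avg_seek_ms + rotational_latency_ms(rpm))

def spindles_for(target_iops, rpm, avg_seek_ms):
    """Spindles needed to reach target_iops if IOs spread evenly and nothing is cached."""
    service_ms = avg_seek_ms + rotational_latency_ms(rpm)
    return math.ceil(target_iops * service_ms / 1000.0)

if __name__ == "__main__":
    ASSUMED_SEEK_MS = 3.5  # assumption, not from the original post
    per_disk = spindle_iops(15_000, ASSUMED_SEEK_MS)
    print(f"one 15K rpm spindle: ~{per_disk:.0f} random IOPS")
    print(f"two spindles:        ~{2 * per_disk:.0f} random IOPS")
    print(f"spindles for 50K:    {spindles_for(50_000, 15_000, ASSUMED_SEEK_MS)}")
```

Under these assumptions two spindles land in the low hundreds of
random IOPS, which is why the gap to the 50K-200K figures is so large.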

In the retail commodity market where you are more realistically going
to be buying, the current best RAID controllers are probably the
Areca cards ( www.areca.us ).  They come darn close to saturating the
Real World Peak Bandwidth of a 64b 133MHz PCI-X bus and have better
IOPS numbers than their commodity brethren.  However, _none_ of the
commodity RAID cards have IOPS numbers anywhere near as high as those
mentioned above.
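As a reference point for the bus figure, the theoretical peak of a
64b 133MHz PCI-X bus is just width times clock.  The short
calculation below uses the nominal 133 MHz from the text; real world
peak bandwidth is lower because of protocol overhead, so this is an
upper bound and not a measurement of any card.

```python
def bus_peak_mb_per_s(width_bits, clock_hz):
    """Theoretical peak bus bandwidth in MB/s (1 MB = 10**6 bytes)."""
    bytes_per_transfer = width_bits // 8
    return bytes_per_transfer * clock_hz / 1_000_000

if __name__ == "__main__":
    # 64-bit bus, nominal 133 MHz clock, one transfer per clock.
    print(f"PCI-X 64b/133MHz peak: {bus_peak_mb_per_s(64, 133_000_000):.0f} MB/s")
```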


>To avoid aggregating too many rows, I already made some aggregation
>tables which will be updated after the import from the Apache
>logfiles.  That did help, but only to a certain level.
>
>I believe the biggest problem is disc io. Reports for very recent
>data are quite fast, these are used very often and therefore already
>in the cache. But reports can contain (and regularly do) very old
>data. In that case the whole system slows down. To me this sounds
>like the recent data is flushed out of the cache and now all data
>for all queries has to be fetched from disc.

I completely agree.  Hopefully my above suggestions make sense and
are of use to you.
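The cache-flushing effect described in the quoted text can be
reproduced with a toy simulation.  This is a plain LRU cache written
for illustration; it is not how PostgreSQL or the Linux page cache is
actually implemented, and the page counts are made up.  It shows a
"hot" set of recent pages being fully cached, then evicted by one
large scan over old pages.

```python
from collections import OrderedDict

class LRUCache:
    """Toy least-recently-used page cache that counts hits and misses."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()
        self.hits = 0
        self.misses = 0

    def read(self, page):
        if page in self.pages:
            self.pages.move_to_end(page)  # mark as most recently used
            self.hits += 1
        else:
            self.misses += 1              # would be a disk read
            self.pages[page] = True
            if len(self.pages) > self.capacity:
                self.pages.popitem(last=False)  # evict least recently used

def hit_rate(cache, pages):
    """Read the given pages and return the hit rate for just this pass."""
    hits_before, misses_before = cache.hits, cache.misses
    for page in pages:
        cache.read(page)
    hits = cache.hits - hits_before
    misses = cache.misses - misses_before
    return hits / (hits + misses)

if __name__ == "__main__":
    cache = LRUCache(capacity=100)
    recent = range(80)           # hot pages used by most reports
    old = range(1000, 1200)      # one report over very old data
    hit_rate(cache, recent)      # warm the cache
    print("recent, warm cache:  ", hit_rate(cache, recent))
    print("old-data scan:       ", hit_rate(cache, old))
    print("recent, after scan:  ", hit_rate(cache, recent))
```

The last pass has to fetch every recent page again, which matches the
"whole system slows down" behaviour described above.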


>My machine has 2GB memory,

...and while we are at it, OLTP-like apps benefit less from RAM than
data mining ones, but still 2GB of RAM is just not that much for a
real DB server...


Ron Peacetree


