Re: Need for speed - Mailing list pgsql-performance
| From | Ron |
|---|---|
| Subject | Re: Need for speed |
| Date | |
| Msg-id | 6.2.3.4.0.20050817140116.05d317b0@pop.earthlink.net |
| In response to | Need for speed (Ulrich Wisser <ulrich.wisser@relevanttraffic.se>) |
| Responses | Re: Need for speed |
| List | pgsql-performance |
At 05:15 AM 8/17/2005, Ulrich Wisser wrote:

>Hello,
>
>thanks for all your suggestions.
>
>I can see that the Linux system is 90% waiting for disc io.

A clear indication that you need to improve your HD IO subsystem if possible.

>At that time all my queries are *very* slow.

To be more precise, your server performance at that point is essentially equal to the performance of your HD IO subsystem.

>My scsi raid controller and disc are already the fastest available.

Oh, REALLY? This is the description of the system you gave us:

"We have a box with Linux Fedora Core 3, Postgres 7.4.2
Intel(R) Pentium(R) 4 CPU 2.40GHz
2 scsi 76GB disks (15.000RPM, 2ms)"

That is far, Far, FAR from "the fastest available" in terms of SW, OS, CPU host, _or_ HD subsystem. "The fastest available" means:

1= you should be running PostgreSQL 8.0.3
2= you should be running the latest stable 2.6-based kernel
3= you should be running an Opteron-based server
4= Fibre Channel HDs are slightly higher performance than SCSI ones.
5= (and this is the big one) YOU NEED MORE SPINDLES AND A HIGHER-END RAID CONTROLLER.

Your description of your workload was:

"one of our services is click counting for on line advertising. We do this by importing Apache log files every five minutes. This results in a lot of insert and delete statements. At the same time our customers shall be able to do on line reporting."

There are two issues here:

1= Your primary usage is OLTP-like, but you are also expecting to run reports against the same schema that supports that OLTP-like usage. Bad Idea. Schemas that are optimized for reporting and other data-mining-like operations are pessimal for OLTP-like applications, and vice versa. You need two schemas: one optimized for lots of inserts and deletes (OLTP-like), and one optimized for reporting (data-mining-like). There is a rough sketch of this split below.

2= 2 spindles, even 15Krpm spindles, is minuscule. Real enterprise-class RAID subsystems have at least 10-20x that many spindles, usually split into 6-12 sets dedicated to different groups of tables in the DB. Putting xlog on its own dedicated spindles is just the first step; see the tablespace sketch below.

The absolute "top of the line" for RAID controllers is something based on Fibre Channel from Xyratex (who make the RAID engines for EMC and NetApp), Engenio (the enterprise division of LSI Logic, who sell mostly to IBM; Apple has a server based on an Engenio card), or Dot Hill (who bought Chaparral, among others). I suspect you can't afford them even if they would do business with you. The ante for an FC-based RAID subsystem in this class is in the ~$32K to ~$128K range, even if you buy direct from the actual RAID HW manufacturer rather than from an OEM like EMC, IBM, or NetApp, who will 2x or 4x the price. OTOH, these subsystems will give OLTP or OLTP-like DB apps performance that is head-and-shoulders better than anything else to be found: numbers like 50K-200K IOPS. You get what you pay for.

In the retail commodity market where you are more realistically going to be buying, the current best RAID controllers are probably the Areca cards ( www.areca.us ). They come darn close to saturating the Real World Peak Bandwidth of a 64b 133MHz PCI-X bus and have better IOPS numbers than their commodity brethren. However, _none_ of the commodity RAID cards have IOPS numbers anywhere near as high as those mentioned above.
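To make 1= concrete, here is a minimal sketch of the split. All table and column names are invented for illustration; the point is the shape, not the details:

```sql
-- OLTP side: narrow, index-light table that the five-minute Apache
-- log imports can INSERT into and DELETE from cheaply.
CREATE TABLE click_raw (
    ad_id      integer     NOT NULL,
    clicked_at timestamptz NOT NULL
);

-- Reporting side: pre-aggregated per ad and hour, indexed for reads.
-- No unique key on (ad_id, hour): each import batch appends its own
-- rollup rows, and reports simply SUM over them.
CREATE TABLE click_hourly (
    ad_id  integer     NOT NULL,
    hour   timestamptz NOT NULL,
    clicks bigint      NOT NULL
);
CREATE INDEX click_hourly_idx ON click_hourly (ad_id, hour);

-- After each import, roll the new raw rows up and clear the OLTP side.
BEGIN;
INSERT INTO click_hourly (ad_id, hour, clicks)
    SELECT ad_id, date_trunc('hour', clicked_at), count(*)
    FROM click_raw
    GROUP BY ad_id, date_trunc('hour', clicked_at);
DELETE FROM click_raw;
COMMIT;
```

A report then touches only the small reporting table, e.g. SELECT hour, sum(clicks) FROM click_hourly WHERE ad_id = 42 GROUP BY hour; and never competes with the import traffic for the raw table.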
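For 2=, once you are on 8.0 you can use tablespaces to pin each group of tables to its own spindle set. The paths below are placeholders for whatever arrays you end up building; xlog itself is moved separately, e.g. by symlinking $PGDATA/pg_xlog to its own dedicated pair before starting the server:

```sql
-- PostgreSQL 8.0+ only: one more reason to upgrade from 7.4.2.
CREATE TABLESPACE ts_oltp   LOCATION '/raid/oltp';
CREATE TABLESPACE ts_report LOCATION '/raid/report';

-- Put each group of tables on its own set of spindles, so the
-- import churn and the reporting scans stop fighting over heads.
ALTER TABLE click_raw    SET TABLESPACE ts_oltp;
ALTER TABLE click_hourly SET TABLESPACE ts_report;
```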
>To avoid aggregating too many rows, I already made some aggregation
>tables which will be updated after the import from the Apache
>logfiles. That did help, but only to a certain level.
>
>I believe the biggest problem is disc io. Reports for very recent
>data are quite fast; these are used very often and are therefore
>already in the cache. But reports can contain (and regularly do) very
>old data. In that case the whole system slows down. To me this sounds
>like the recent data is flushed out of the cache, and now all data
>for all queries has to be fetched from disc.

I completely agree. Hopefully my suggestions above make sense and are of use to you. (There is also a quick way to sanity-check the cache theory from inside the DB; see the query at the end of this note.)

>My machine has 2GB memory, ...

...and while we are at it: OLTP-like apps benefit less from RAM than data-mining ones do, but even so, 2GB of RAM is just not that much for a real DB server...
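The promised cache check. The pg_statio_user_tables view exists in 7.4, but block-level statistics have to be switched on first (stats_block_level = true in postgresql.conf) before it shows anything useful; the query itself is just one possible way to slice it:

```sql
-- Per-table picture of blocks read from disk/OS vs. found in
-- PostgreSQL's shared buffers. Tables whose reports hit old data
-- should show large heap_blks_read counts and low hit percentages.
SELECT relname,
       heap_blks_read,    -- block fetches that missed shared buffers
       heap_blks_hit,     -- block fetches served from shared buffers
       round(100.0 * heap_blks_hit
             / nullif(heap_blks_hit + heap_blks_read, 0), 1) AS hit_pct
FROM pg_statio_user_tables
ORDER BY heap_blks_read DESC;
```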
Ron Peacetree