Re: Need for speed - Mailing list pgsql-performance
| From | Ron |
|---|---|
| Subject | Re: Need for speed |
| Date | |
| Msg-id | 6.2.3.4.0.20050817130842.05d31a40@pop.earthlink.net |
| In response to | Re: Need for speed (Ulrich Wisser <ulrich.wisser@relevanttraffic.se>) |
| List | pgsql-performance |
At 05:15 AM 8/17/2005, Ulrich Wisser wrote:

>Hello,
>
>thanks for all your suggestions.
>
>I can see that the Linux system is 90% waiting for disc io.

A clear indication that you need to improve your HD IO subsystem.

>At that time all my queries are *very* slow.

To be more precise, your server performance at that point is essentially equal to your HD IO subsystem performance.

>My scsi raid controller and disc are already the fastest available.

Oh, REALLY? This is the description of the system you gave us:

"We have a box with Linux Fedora Core 3, Postgres 7.4.2
Intel(R) Pentium(R) 4 CPU 2.40GHz
2 scsi 76GB disks (15.000RPM, 2ms)"

That is far, Far, FAR from "the fastest available" in terms of SW, OS, CPU host, _or_ HD subsystem.

The "fastest available" means
1= you should be running 8.0.3
2= you should be running the latest stable 2.6 based kernel
3= you should be running an Opteron based server
4= Fibre Channel HDs are higher performance than SCSI ones.
5= (and this is the big one) YOU NEED MORE SPINDLES AND A HIGHER END RAID CONTROLLER.

The absolute "top of the line" for RAID controllers is something based on Fibre Channel from Xyratex (who make the RAID engines for EMC and NetApps), Engenio (the enterprise division of LSI Logic, who sell mostly to IBM; Apple has a server based on an Engenio card), or dot-hill (who bought Chaparral among others). I suspect you can't afford them even if they would do business with you. The ante for a FC-based RAID subsystem in this class is in the ~$32K to ~$128K range, even if you buy direct from the actual RAID HW manufacturer rather than an OEM.

In the retail commodity market, the current best RAID controllers are probably the 16 and 24 port versions of the Areca cards ( www.areca.us ). They come darn close to saturating the Real World Peak Bandwidth of a 64b 133MHz PCI-X bus.

>I did put pg_xlog on another file system on other discs.
>
>The query plan uses indexes and "vacuum analyze" is run once a day.
>
>To avoid aggregating too many rows, I already made some aggregation
>tables which will be updated after the import from the Apache
>logfiles. That did help, but only to a certain level.
>
>I believe the biggest problem is disc io. Reports for very recent
>data are quite fast, these are used very often and therefore already
>in the cache. But reports can contain (and regularly do) very old
>data. In that case the whole system slows down. To me this sounds
>like the recent data is flushed out of the cache and now all data
>for all queries has to be fetched from disc.
>
>My machine has 2GB memory,
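A quick way to confirm the cache-eviction theory above: run the same "old data" report twice in a row while watching the disks with something like iostat -x 5 or vmstat 5 in another terminal. If the second run is much faster and iowait drops off, the first run was paying for physical reads rather than CPU. A rough sketch; the table and column names below are made up, so substitute one of the real report queries:

    -- run this twice in psql; the second execution should come from the
    -- buffer/OS cache if the working set still fits in memory
    EXPLAIN ANALYZE
    SELECT day, sum(hits)
    FROM hits_daily_agg                 -- hypothetical aggregation table
    WHERE day BETWEEN '2004-01-01' AND '2004-01-31'
    GROUP BY day;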