Re: Scaling further up - Mailing list pgsql-performance
From: Anjan Dave
Subject: Re: Scaling further up
Date:
Msg-id: 4BAFBB6B9CC46F41B2AD7D9F4BBAF785098045@vt-pe2550-001.vantage.vantage.com
In response to: Scaling further up ("Anjan Dave" <adave@vantage.com>)
Responses: Re: Scaling further up
List: pgsql-performance
Great response, thanks!

Regarding the 12GB of memory and the 13GB database with almost no I/O: one thing I don't understand is that even though the OS caches most of the memory and PG can use it if it needs it, why would the system swap (not much, and only during peak times)? SHMMAX is set to 512MB, shared_buffers to 150MB, effective_cache_size to 2GB, and sort_mem to 2MB; the rest are default values. It also happens that a large query (reporting type) can hold up the other queries, and the load averages shoot up during peak times.

Regarding a baseline:

- We have docs and monitoring for the frequency of SQL statements, the most expensive ones, etc. (IronEye)
- I am monitoring disk reads/writes using iostat
- How do I measure commit frequency, and system events like checkpoints? (Vacuum is done nightly, during little or no load.) A sketch of one way to sample these follows at the end of this message.

Thanks,
Anjan

-----Original Message-----
From: Aaron W [mailto:aaronwerman@yahoo.com]
Sent: Thursday, March 04, 2004 8:58 AM
To: pgsql-performance@postgresql.org; Anjan Dave
Subject: Re: Scaling further up

.... I'd look at adding more disks first. Depending on what type of query load you get, that box sounds like it will be very much I/O bound....

Given a 13G database on a 12G system, with a low growth rate, it is likely that there is almost no I/O for most activities. The exception is checkpointing.

The first thing I'd do is try to build a spreadsheet model of:

- select frequency, and the number of logical and physical reads involved
- insert/delete/update frequency, and the number of logical and physical reads and writes involved
- commit frequency, etc. (start out with simplistic assumptions, and do it for peak load)
- system events (checkpoints, vacuum)

I assume that the only high I/O you will see will be for logging. The RAID issue there is basically obviated by the sequential-write nature of WAL. If that is the case, EMC is not the most cost-effective or obvious solution, since the value they provide is mostly manageability for disaster recovery. The goal in this case is to write at the application's maximum speed and with minimal latency. Any responsible battery-backed, write-through (mirrored) caching controller can do that for you.

On the other hand, if your requests are not *all* trivial, you are going to test the hardware and the scheduling algorithms of the OS and pg. Even if 0.1% of 3,000 tps take a second, that ends up generating 3 seconds of load (0.1% x 3,000 tps = 3 slow transactions starting every second, each holding resources for a full second, so roughly 3 of them are in flight at any moment). Any even slightly slow transactions will generate enormous queues which slow down everything.

In most systems of this volume I've seen, the mix of activities is constantly invalidating the cache, making L2 caching less important. Memory-to-CPU bus speed is a limiting factor, as well as raw CPU speed in processing the requests. Xeon is not a great architecture for this because of FSB contention; I suspect a 4-way will be completely FSB-bottlenecked, so going beyond 4-way would likely not change performance.

I would try to get a simple model/benchmark going and test against it. You should be talking to the big-iron vendors for their take on your issues and get their capacity benchmarks.
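On measuring commit frequency (the question above): a minimal sketch, assuming the statistics collector is enabled in postgresql.conf; 'mydb' is a placeholder for the actual database name. The counters in pg_stat_database are cumulative, so sample them twice a known interval apart and divide the difference by the elapsed seconds:

    -- Cumulative commit/rollback counters for one database.
    -- 'mydb' is a placeholder; substitute your database name.
    -- Run this twice, e.g. 60 seconds apart; then
    -- (later xact_commit - earlier xact_commit) / 60 = commits per second.
    SELECT now() AS sampled_at,
           xact_commit,
           xact_rollback
    FROM pg_stat_database
    WHERE datname = 'mydb';

Checkpoints do not show up in these views; one low-tech approach is to correlate periodic write bursts in iostat with the checkpoint_timeout setting.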
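For the logical vs. physical reads in the spreadsheet model above, the same view offers a rough split: blks_hit counts block requests satisfied in shared_buffers, while blks_read counts blocks requested from the kernel. On a box whose working set fits in RAM, most of blks_read will still be served from the OS cache rather than disk, so iostat remains the ground truth for real physical I/O. A sketch:

    -- Per-database buffer hit ratio: requests satisfied in
    -- shared_buffers vs. passed through to the operating system.
    SELECT datname,
           blks_hit,
           blks_read,
           round(blks_hit * 100.0 / (blks_hit + blks_read), 1) AS hit_pct
    FROM pg_stat_database
    WHERE blks_hit + blks_read > 0;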