Re: Question on hardware & server capacity - Mailing list pgsql-performance
From: Hannu Krosing
Subject: Re: Question on hardware & server capacity
Msg-id: 1041545254.2176.28.camel@rh72.home.ee
In response to: Question on hardware & server capacity ("Steve Wolfe" <nw@codon.com>)
List: pgsql-performance
Steve Wolfe wrote on Thu, 02.01.2003 at 22:42:

> Well, our current database server is getting tremendously loaded, and
> right now there isn't a clear-cut choice as to an upgrade path - at least
> not within the commodity hardware market.

Have you optimized your queries to the max? Often one or two of the
queries take most of the resources and starve the others.

> The machine is a dual AthlonMP 2000, with 2 gigs of RAM. The loads on
> the machine are getting out of hand, and performance is noticeably slowed.
> 'top' shows the CPUs as being anywhere from 30% to 50% idle, with (on
> average) 5-10 postmasters in the "non-idle" state. 'vmstat' shows bi/bo
> pegged at zero (copious quantities of disk cache, fsync turned off),

Could there be some unnecessary thrashing between the OS and PG caches?
How could this be detected?

> interrupts fluctuating between 200 and 1,000 per second (avg. is approx
> 400), context switches between 1300 and 4500 (avg. is approx 2300). I
> logged some queries, and found that in an average second, the machine
> forks off 10 new backends, and responds to 50 selects and 3 updates.

What are the average times for query responses? Will running the same
queries (the ones from the logs) serially run faster, slower, or at the
same speed?

Do you have some triggers on updates? I have occasionally found them to
be real performance killers.

Also, if memory bandwidth is the issue, you could tweak the parameters
so that PG will prefer index scans more often - there are rumors that
under heavy load it is often better to use more index scans due to
possibly smaller memory/buffer use, even if they would be slower for
only one or two backends.

> My feelings are that the machine is being swamped by both the number of
> context switches and the I/O, most likely the memory bandwidth. I'm
> working on implementing some connection pooling to reduce the number of
> new backends forked off, but there's not much I can do about the sheer
> volume (or cost) of queries.
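To make the parameter tweaking concrete: the knobs involved are planner
cost settings such as random_page_cost and effective_cache_size, and you
can watch the plan flip with EXPLAIN before touching postgresql.conf.
A hypothetical psql session (the table, column, and index names below
are made up for illustration) might look like:

```sql
-- Check the current plan for one of the hot queries.
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;
-- suppose this shows: Seq Scan on orders ...

-- Lower the planner's estimate of random I/O cost (default is 4) and
-- tell it how much OS cache it can count on (value is in 8kB pages).
SET random_page_cost = 2;
SET effective_cache_size = 100000;

-- Re-check: the planner may now choose
-- Index Scan using orders_customer_idx on orders ...
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;
```

If the new plans hold up under load, the same settings can be made
permanent in postgresql.conf.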
You could try to replicate the updates (one master, multiple slaves)
and distribute the selects. I guess this is what the current PostgreSQL
state of the art already lets you do with reasonable effort.

> Now, if quad-Hammers were here, I'd simply throw hardware at it.
> Unfortunately, they're not.

Yes, it's BAD if your business grows faster than Moore's law ;-p

> So far, about the only commodity-level answer
> I can think of would be a dual P4 Xeon, with the 533 MHz bus, and
> dual-channel DDR memory. That would give each processor approximately
> double the memory bandwidth over what we're currently running.
>
> I'm fairly sure that would at least help lower the load, but I'm not
> sure by how much. If anyone has run testing under similar platforms, I'd
> love to hear of the performance difference.

How big is the dataset? What kinds of queries? I could perhaps run some
quick tests on a quad Xeon 1.40GHz with 2GB of RAM before this box goes
to production sometime early next week. It is a Red Hat AS2.1 box with
rh-postgresql-7.2.3-1_as21.

# hdparm -tT /dev/sda

/dev/sda:
 Timing buffer-cache reads:   128 MB in  0.39 seconds = 328.21 MB/sec
 Timing buffered disk reads:   64 MB in  1.97 seconds =  32.49 MB/sec

> If this is going to chop the
> loads in half, I'll do it. If it's only going to improve it by 10% or so,
> I'm not going to waste the money.

--
Hannu Krosing <hannu@tm.ee>
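For quick comparative tests like the ones offered above, contrib's
pgbench (which ships with PostgreSQL 7.2) gives a rough throughput
number on any box. A sketch, assuming pgbench is installed and a server
is running; the scale factor and client/transaction counts here are
arbitrary choices, not a tuned benchmark:

```shell
# Create a scratch database and populate it at scale factor 10
# (roughly 1 million rows in the "accounts" table).
createdb bench
pgbench -i -s 10 bench

# Run 10 concurrent clients, 1000 transactions each, and note the
# reported tps; repeat on each candidate machine to compare.
pgbench -c 10 -t 1000 bench
```

It only exercises pgbench's own simple TPC-B-style transaction mix, so
it measures the hardware more than your workload, but it is enough to
see whether a platform change moves the needle at all.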