Re: UPDATES hang every 5 minutes - Mailing list pgsql-general
From | Greg Smith |
---|---|
Subject | Re: UPDATES hang every 5 minutes |
Date | |
Msg-id | Pine.GSO.4.64.0708101049290.6298@westnet.com Whole thread Raw |
In response to | Re: UPDATES hang every 5 minutes (Marc Rossi <marc_rossi@yahoo.com>) |
List | pgsql-general |
On Fri, 10 Aug 2007, Marc Rossi wrote: > Thanks for the heads up. The box in question is a dual cpu (xeon dual > cores) with 8 gig & a pair of 10k 146gb raid 1 arrays. I have the > pg_xlog dir on one array (along with the OS) & the rest of the data on > the other array by itself. Yeah, that's kinda what I thought. Here's a quick way to think about the background writer parameters. Pages are 8KB, and the default bgwriter_delay will execute every 200ms or 5 times per second. If you set one of the maxpages parameters to 100, that means that component of the background writer might write as much as 8KB*100*5=4MB/sec worth of data to the disk continuously. The config you were copying had the maxpages parameters set to 200 and 600, which meant that in a heavy update situation the background writer might be writing out (2+6)*4=32MB/second *on top of* all the other reads ands writes going on. That's a pretty substantial additional load to add to just two pairs of disks. Also, anything written by the all-scan writer (which was by far the more aggressive one in that setup) has the potential to be a wasted write if that particular page gets changed again before the next checkpoint, which sounds highly likely given how you described your workload. Making those parameters too high will actually make the system less efficient--and that can make checkpoints worse. > A couple of quick questions. On the fly I can change these params and > use 'pg_ctl reload" to put them in effect, correct? I believe all of the ones you'll be touching can be adjusted that way. To confirm a change took, pop into psql and do "show <parameter>" to see the current value. > Also, I have my checkpoint_segments set to 10, if I were to lower this > (say 5) would this possible have the effect of checkpointing a little > more often with less data? (right now I hit the checkpoint_timeout). You're thinking in the right terms here. It's possible that by forcing more frequent checkpoints, the impact of each of them will be reduced. However, the net will be a less efficient system, because checkpointing is intensive. And having twice as many of them doubles how often you get this spike in response times. The other approach is to increase the timeout and the segments, so maybe you're only running into this every 10 minutes or more which makes the problem less annoying, then try to keep the buffers clean between checkpoints using the background writer. What's shared_buffers set to on this system? A third possibility is to reduce that and rely more on the operating system to buffer the data for you. If there's less data in the buffer cache, it will certainly take less time to write things out at checkpoint time; there's obviously downsides to that in terms of how fast regular queries execute. It will take some experimentation here to get this right. Checkpoint problems like you're running into aren't something you knock out in a day if you're new to this. Try not to make too big a step at any time and expect you'll have to dig into this a bit. I would recommend you start by intalling the contrib/pg_buffercache module against your database (read README.pg_buffercache for an intro), which lets you watch what's in the buffer cache at any time, so you can tell the balance of clean vs. dirty pages and what tables they're in. That will let you monitor how effective the background writer is doing and estimate how bad the checkpoint is going to be before it happens. Here's a sample query to get you started: SELECT c.relname, isdirty, count(*) AS buffers FROM pg_class c INNER JOIN pg_buffercache b ON b.relfilenode = c.relfilenode INNER JOIN pg_database d ON (b.reldatabase = d.oid AND d.datname = current_database()) GROUP BY c.relname,isdirty ORDER BY 2 DESC; I have something I'm working on that covers a lot of this topic at http://developer.postgresql.org/index.php/Buffer_Cache%2C_Checkpoints%2C_and_the_BGW but that's probably a little too low-level for you to chew on usefully right now. -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
pgsql-general by date: