Re: WAL Optimisation - configuration and usage - Mailing list pgsql-performance
From | Simon Riggs |
---|---|
Subject | Re: WAL Optimisation - configuration and usage |
Date | |
Msg-id | 003501c3ffed$33422210$5baa87d9@LaptopDellXP Whole thread Raw |
In response to | WAL Optimisation - configuration and usage (Rob Fielding <rob@dsvr.net>) |
Responses |
Re: WAL Optimisation - configuration and usage
Re: WAL Optimisation - configuration and usage |
List | pgsql-performance |
>Rob Fielding wrote: > My focus today has been on WAL - I've not looked at WAL before. By > increasing the settings thus : > > wal_buffers = 64 # need to determin WAL usage > wal_files = 64 # range 0-64 > wal_sync_method = fsync # the default varies across platforms: > wal_debug = 0 # range 0-16 > > # hopefully this should see less LogFlushes per LogInsert - use more WAL > though. > commit_delay = 10000 # range 0-100000 > commit_siblings = 2 # range 1-1000 > checkpoint_segments = 16 # in logfile segments (16MB each), min 1 > checkpoint_timeout = 300 # in seconds, range 30-3600 > fsync = true > But I think we can get more out of this as the above setting were picked > from thin air and my concern here is being able to determin WAL file > usage and if the system is caught out on the other extreme that we're > not commiting fast enough. Currently I've read that WAL files shouldn't > be more than 2*checkpoint_segments+1 however my pg_xlog directory > contains 74 files. This suggests I'm using more logfiles than I should. > Also I'm not sure what wal_buffers really should be set to. As Richard Huxton says, we're all learning...I'm looking at WAL logic for other reasons right now... This is based upon my reading of the code; I think the manual contains at least one confusion that has not assisted your understanding (or mine): The WAL files limit of 2*checkpoint_segments+1 refers to the number of files allocated-in-advance of the current log, not the total number of files in use. pg uses a cycle of logs, reusing older ones when all the transactions in those log files have been checkpointed. The limit is set to allow checkpoint to release segments and have them all be reused at once. Pg stores them up for use again later when workload hots up again. If it cannot recycle a file because there is a still-current txn on the end of the cycle, then it will allocate a new file and use this instead, but still keeping everything in a cycle. Thus if transactions are particularly long running, then the number of files in the cycle will grow. So overall, normal behaviour so far. I don't think there's anything to worry about in having that many files in your xlog cycle. That behaviour is usually seen with occasional long running txns. When a long running transaction is over, pg will try to reduce the number of files in the cycle until its back to target. You seem to be reusing one file in the cycle every 10 mins - this is happening as the result of a checkpoint timeout - "kinda automated" as you say. [A checkpoint is the only time you can get the messages you're getting] At one file per checkpoint, it will take 16*2+1=33 checkpoints*10 mins = 5 hours before it hits the advance allocation file limit and then starts to reduce number of files. That's why they appear to stay constant... If you want to check whether this is correct, manually issue a number of CHECKPOINT statements. The messages should change from "recycled" to "removing" transaction log file once you've got to 33 checkpoints - the number of WAL log files should start to go down also? If so, then there's nothing too strange going on, just pg being a little slow in reducing the number of wal log files. So, it seems that you are running occasional very long transactions. During that period you run up to 60-80 wal files. That's just on the edge of your wal_buffers limit, which means you start to write wal quicker than you'd like past that point. Your checkpoint_timeout is 300 seconds, but a checkpoint will also be called every checkpoint_segments, or currently every 16 wal files. Since you go as high as 60-80 then you are checkpointing 4-5 times during the heavy transaction period - assuming it's all one block of work. In the end, each checkpoint is causing a huge I/O storm, during which not much work happens. I would suggest that you reduce the effect of checkpointing by either: - re-write app to do scan deletes in smaller chunks in quieter periods or - increase checkpoint_segments to 128, though this may effect your recoverability You can of course only do so much with the memory available to you. If you increase one allocation of memory, you may have to reduce another parameter and that may be counter productive. [An alternative view is that you should go for more frequent, not less frequent checkpoints in this situation, smoothing out the effect of the checkpoints, rather than trying to avoid them at all. On the other hand, that approach also increases total WAL log volume, which means you'll make poor use of I/O and memory buffering. I'd stay high.] However, I'm not sure - why checkpoint interval of 300 secs causes them to happen every 10 mins in quieter periods; is that an occaisional update occurring? - why checkpoint only releases single Wal file each time - but that maybe me just reading the code incorrectly. Please set WAL_DEBUG to 1 so we can see a bit more info: thanks. > Can I get any feedback on this ? How to look into pg's WAL usage would > be what I'm looking for. BTW this is an old install I'm afraid 7.2.2 - > it's been impossible to upgrade up until now because it's been too slow. > I have moved the pg_xlog onto the root SCSI disk - it doesn't appear to > have made a huge difference but it could be on the same cable. My advice is don't touch WAL_SYNC_METHOD... I **think** the WAL behaviour is still the same in 7.4.1, so no rush to upgrade on that account - unless you're using temporary tables.... Best Regards, Simon Riggs
pgsql-performance by date: