Thread: archive_timeout?
If archive_timeout is set to non 0, it seems an archive log segment is created every time checkpoint occurs even there's no database updation. This leads to creating 16MB log segment files every 5 minutes (default checkpoint period), which will in turn produce 4.6GB log segments with bogus data. Is this normal? This is PostgreSQL 8.2 beta1. -- Tatsuo Ishii SRA OSS, Inc. Japan
> If archive_timeout is set to non 0, it seems an archive log segment is > created every time checkpoint occurs even there's no database > updation. This leads to creating 16MB log segment files every 5 > minutes (default checkpoint period), which will in turn produce 4.6GB > log segments with bogus data. Is this normal? I mean 4.6GB per day. > This is PostgreSQL 8.2 beta1. > -- > Tatsuo Ishii > SRA OSS, Inc. Japan > > ---------------------------(end of broadcast)--------------------------- > TIP 6: explain analyze is your friend >
Tatsuo Ishii <ishii@sraoss.co.jp> writes: > If archive_timeout is set to non 0, it seems an archive log segment is > created every time checkpoint occurs even there's no database > updation. This leads to creating 16MB log segment files every 5 > minutes (default checkpoint period), which will in turn produce 4.6GB > log segments with bogus data. Is this normal? Yeah, that was intentional, see discussion a few weeks ago. regards, tom lane
On Tue, 2006-10-10 at 22:26 +0900, Tatsuo Ishii wrote: > If archive_timeout is set to non 0, it seems an archive log segment is > created every time checkpoint occurs even there's no database > updation. This leads to creating 16MB log segment files every 5 > minutes (default checkpoint period), which will in turn produce 4.6GB > log segments with bogus data. Is this normal? > > This is PostgreSQL 8.2 beta1. If the WAL is pretty much empty, gzip brings it from 16MB down to about 16KB, which is much more reasonable. I've noticed that even when idle there are a few files that seem to compress only to about 32KB, and some only to 880KB. I don't know exactly why those files are different, perhaps something with the stats collector? Autovacuum was off for this test. There should be a documentation note to let people know that the archive will grow even when idle. Perhaps we should suggest compression in the docs so that people don't get worried about many gigabytes of mostly- empty files filling up their backup storage. Regards,Jeff Davis
Jeff Davis <pgsql@j-davis.com> writes: > There should be a documentation note to let people know that the archive > will grow even when idle. Perhaps we should suggest compression in the > docs so that people don't get worried about many gigabytes of mostly- > empty files filling up their backup storage. Actually, per the previous discussion: if you want to reduce WAL traffic then one of the most important things to do is stretch out checkpoint_timeout. regards, tom lane
On Tue, 2006-10-10 at 13:12 -0400, Tom Lane wrote: > Jeff Davis <pgsql@j-davis.com> writes: > > There should be a documentation note to let people know that the archive > > will grow even when idle. Perhaps we should suggest compression in the > > docs so that people don't get worried about many gigabytes of mostly- > > empty files filling up their backup storage. > > Actually, per the previous discussion: if you want to reduce WAL traffic > then one of the most important things to do is stretch out > checkpoint_timeout. > I assume you refer to this message: http://archives.postgresql.org/pgsql-hackers/2006-08/msg01190.php I understand that stretching the checkpoint timeout is useful if you have steady traffic and want to reduce the WAL volume. Higher checkpoint intervals mean fewer copies of data pages (at least before 8.2), and probably other data necessary at checkpoint. However, if you have a database with long idle times, higher checkpoint intervals combined with archive_timeout can still waste a lot of data (unless you stretch out the checkpoint timeout by orders of magnitude). This situation is also most the most useful situation for archive_timeout. If someone is concerned about idle time eating up gigabytes of backup storage, compression seems like a logical choice. Maybe I just don't understand checkpoint timeout? Could it reasonably be set to something like 12 hours? I can't think why not, but the config default is 5 minutes, so I would be hesitant to change it by that much. Regards,Jeff Davis
Jeff Davis <pgsql@j-davis.com> writes: > Maybe I just don't understand checkpoint timeout? Could it reasonably be > set to something like 12 hours? I can't think why not, but the config > default is 5 minutes, so I would be hesitant to change it by that much. The only constraining factor on it is how much WAL data are you willing to replay in order to recover from a crash. If you've got a low-volume database then replaying up to 12 hours' worth of activity might not be unacceptable. Also, if you have spikes of activity, then checkpoint_segments would kick in after a spike had generated X amount of data. So I don't see any strong reason why it couldn't be set much higher than archive_timeout. Now the other side of the coin is that if you do have a steady low level of activity then a small archive_timeout is still going to result in shipping lots of partially-filled WAL files. Compression might help some, but the bottom line is simply that archive_timeout isn't an efficient mechanism for dealing with low-volume databases. regards, tom lane