BUG #7902: lazy cleanup of extraneous WAL files can cause out of disk issues - Mailing list pgsql-bugs

From	jeff@pgexperts.com
Subject	BUG #7902: lazy cleanup of extraneous WAL files can cause out of disk issues
Date	February 23, 2013 01:55:29
Msg-id	E1U91WW-0006rq-82@wrigleys.postgresql.org
Responses	Re: BUG #7902: lazy cleanup of extraneous WAL files can cause out of disk issues (Rafael Martinez Guerrero <r.m.guerrero@usit.uio.no>) Re: BUG #7902: lazy cleanup of extraneous WAL files can cause out of disk issues (Jeff Janes <jeff.janes@gmail.com>)
List	pgsql-bugs

The following bug has been logged on the website:

Bug reference:      7902
Logged by:          Jeff Frost
Email address:      jeff@pgexperts.com
PostgreSQL version: 9.2.3
Operating system:   Ubuntu 12.04
Description:

While doing acceptance testing on a new Ubuntu 12.04 PostgreSQL server
running 9.2.3, we set checkpoint_segments = 128,
checkpoint_completion_target = 0.9 and placed pg_xlog on a separate 20G
partition. Also, archive_mode = off on this system.

According to the docs, you would expect the system to try to keep the
number of WAL files close to 3 * checkpoint_segments + 1.  Unfortunately,
this does not appear to be the case: a pgbench run fills the pg_xlog
partition until it is out of space.
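
As a rough check of that expectation, the arithmetic can be sketched as
follows. This assumes the default 16 MB WAL segment size, which the report
does not state, and the function name is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: estimate the expected pg_xlog ceiling in MB from
# the 3 * checkpoint_segments + 1 rule of thumb quoted in the report.
# Assumption: 16 MB per WAL segment (the PostgreSQL default).
expected_wal_mb() {
    local checkpoint_segments="$1"
    local segment_mb="${2:-16}"
    echo $(( (3 * checkpoint_segments + 1) * segment_mb ))
}

expected_wal_mb 128   # prints 6160
```

Under that assumption, checkpoint_segments = 128 works out to 385 segments,
or roughly 6G, which is well below the 20G partition.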

The pgbench run script looks like this:

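#!/bin/bash

# Recreate the test database from scratch.
dropdb bench
createdb bench
# Initialize the pgbench tables at scale factor 1000.
pgbench -i -s 1000 bench
vacuumdb -a --analyze-only
# Force a checkpoint before the timed run.
psql -c "checkpoint"
# 64 clients, 16 worker threads, per-statement latencies, 600 seconds.
pgbench -c 64 -j 16 -r -T 600 bench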

While the pgbench run does cause lots of xlog-based checkpoints, they never
seem to remove more than a few files, and pg_xlog often grows to more than
20G, at which point the postgresql service falls over.
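
One way to watch this growth while pgbench is running is sketched below.
It is only an illustration: the helper names are hypothetical, and the
sketch relies on WAL segment files being named with 24 uppercase hex
characters.

```shell
#!/bin/bash
# Hypothetical helper: count WAL segment files in a directory.
# Segment file names are 24 uppercase hex characters; anything else
# (for example the archive_status subdirectory) is skipped.
count_wal_segments() {
    local dir="$1" n=0 f
    for f in "$dir"/*; do
        if [[ -f "$f" && "${f##*/}" =~ ^[0-9A-F]{24}$ ]]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Hypothetical helper: print a timestamped segment count at a fixed
# interval (default 5 seconds). Stop it with Ctrl-C.
watch_wal() {
    local dir="$1" interval="${2:-5}"
    while true; do
        echo "$(date +%T) segments=$(count_wal_segments "$dir")"
        sleep "$interval"
    done
}
```

Running something like `watch_wal "$PGDATA/pg_xlog"` in a second terminal
would show whether each checkpoint removes or recycles segments. The path
is an assumption; the report only says pg_xlog is on its own partition.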

After moving pg_xlog to a larger partition, it seems it peaks at about 22G
in size.

A manual checkpoint after the run always brings it back down to ~ 4G in
size.

Interestingly, I was unable to reproduce this with 9.2.3 on our in-house test
system; however, the in-house system has much less RAM and CPU resources, so
this may only be an issue on larger systems. The system that exhibits the
issue has 128G of RAM and 16 cores (32 with hyperthreading).

I also tested 9.2.2 on the affected system and it acted the same.

Hope to test 9.1.8 in the next few days.
