Thread: Weird WAL problem - 9.0.3

Weird WAL problem - 9.0.3

From
Rafael Martinez
Date:
Hello

Yesterday we had a weird problem with the pg_xlog partition in one of
our servers:

- The amount of WAL files was much higher than (2*checkpoint_segments)+1
(over 360 WAL files)

- The WAL files were not created/recycle time-ordered. Here is an
example:

.....................
16777216 Apr 12 17:49 000000010000000D0000001C
16777216 Apr 12 17:49 000000010000000D0000001D
16777216 Apr 12 17:49 000000010000000D0000001E
16777216 Apr 12 17:52 000000010000000D0000001F
16777216 Apr 12 17:50 000000010000000D00000020
16777216 Apr 12 17:51 000000010000000D00000021
16777216 Apr 12 17:49 000000010000000D00000022
16777216 Apr 12 17:49 000000010000000D00000023
16777216 Apr 12 17:49 000000010000000D00000024
16777216 Apr 12 17:51 000000010000000D00000025
.....................

This is the first time I see this behavior with the result of a full
pg_xlog partition.

This happened when testing an upgrade procedure and moving on the fly
with "pg_dumpall | psql" around 30 databases (ca.140GB) from a 8.3
server to a 9.0 one.

Is this normal? If it is, how can we find out the max.number of WAL
files a 9.0 system can generate in the worst case scenario?

Some relevant information about this system:
------------------------------------------------
PostgreSQL 9.0.3 - ext4 - RHEL5.6 - 2.6.18-238.5.1.el5 - x86_64

checkpoint_segments: 128
wal_buffers: 512kB
wal_level: archive
wal_sync_method: fdatasync
shared_buffers: 10GB
------------------------------------------------

regards,
--
Rafael Martinez Guerrero
Center for Information Technology
University of Oslo, Norway

PGP Public Key: http://folk.uio.no/rafael/


Attachment

Re: Weird WAL problem - 9.0.3

From
Adrian Klaver
Date:

On Wednesday, April 13, 2011 6:09:25 am Rafael Martinez wrote:

> Hello

>

> Yesterday we had a weird problem with the pg_xlog partition in one of

> our servers:

>

> - The amount of WAL files was much higher than (2*checkpoint_segments)+1

> (over 360 WAL files)

>

Might want to take a look at:

http://www.postgresql.org/docs/9.0/interactive/wal-configuration.html

In particular:

"There will always be at least one WAL segment file, and will normally not be more files than the higher of wal_keep_segments or (2 + checkpoint_completion_target) * checkpoint_segments + 1. Each segment file is normally 16 MB (though this size can be altered when building the server). You can use this to estimate space requirements for WAL. Ordinarily, when old log segment files are no longer needed, they are recycled (renamed to become the next segments in the numbered sequence). If, due to a short-term peak of log output rate, there are more than 3 * checkpoint_segments + 1 segment files, the unneeded segment files will be deleted instead of recycled until the system gets back under this limit."

--

Adrian Klaver

adrian.klaver@gmail.com

Re: Weird WAL problem - 9.0.3

From
Rafael Martinez
Date:
On Wed, 2011-04-13 at 06:28 -0700, Adrian Klaver wrote:
> On Wednesday, April 13, 2011 6:09:25 am Rafael Martinez wrote:

>
> Might want to take a look at:
>
[......]
>  sequence). If, due to a short-term peak of log output rate, there are
> more than 3 * checkpoint_segments + 1 segment files, the unneeded
> segment files will be deleted instead of recycled until the system
> gets back under this limit."
>

Thank you.

This explains the number of WAL files generated, I was not aware of the
(3*checkpoint_segments + 1) limit (I will RTBM better next time)

But this doesn't explain the WAL files not been created/recycled
time-ordered. I wonder if this happened because the partition got full
while the WALs were created/recycled?

regards,
--
Rafael Martinez Guerrero
Center for Information Technology
University of Oslo, Norway

PGP Public Key: http://folk.uio.no/rafael/


Attachment

Re: Weird WAL problem - 9.0.3

From
Tom Lane
Date:
Rafael Martinez <r.m.guerrero@usit.uio.no> writes:
> But this doesn't explain the WAL files not been created/recycled
> time-ordered. I wonder if this happened because the partition got full
> while the WALs were created/recycled?

When a checkpoint finishes, it scans the pg_xlog directory to find WAL
files that are no longer needed (because they're before the checkpoint's
WAL replay point).  It will either rename them "forward" to become ready
for future use, or delete them if there are already enough future WAL
files present (as determined by checkpoint_segments).  The order in
which old segments get renamed to be future ones is basically chance,
because it's determined by the order in which readdir() visits them.
So there's no reason to think that their file timestamps will be in
order.

I would expect WAL files that are *behind* the current write point to
have increasing write timestamps.  But not those ahead.

            regards, tom lane