Re: Maximum number of WAL files in the pg_xlog directory - Mailing list pgsql-hackers

From Jeff Janes
Subject Re: Maximum number of WAL files in the pg_xlog directory
Date
Msg-id CAMkU=1yfiJh-UYcujka9dKZKEm+6U=DyvJaM8+ABeBQ+bAdWLQ@mail.gmail.com
In response to Re: Maximum number of WAL files in the pg_xlog directory  (Bruce Momjian <bruce@momjian.us>)
Responses Re: Maximum number of WAL files in the pg_xlog directory
List pgsql-hackers
On Mon, Oct 13, 2014 at 12:11 PM, Bruce Momjian <bruce@momjian.us> wrote:

> I looked into this, and came up with more questions.  Why is
> checkpoint_completion_target involved in the total number of WAL
> segments?  If checkpoint_completion_target is 0.5 (the default), the
> calculation is:
>
>         (2 + 0.5) * checkpoint_segments + 1
>
> while if it is 0.9, it is:
>
>         (2 + 0.9) * checkpoint_segments + 1
>
> Is this trying to estimate how many WAL files are going to be created
> during the checkpoint?  If so, wouldn't it be (1 +
> checkpoint_completion_target), not "2 +".  My logic is you have the old
> WAL files being checkpointed (that's the "1"), plus you have new WAL
> files being created during the checkpoint, which would be
> checkpoint_completion_target * checkpoint_segments, plus one for the
> current WAL file.

WAL is not eligible to be recycled until there have been 2 successful checkpoints.

So at the end of a checkpoint, you have 1 cycle of WAL which has just become eligible for recycling, 1 cycle of WAL which is now expendable but which is kept anyway, and checkpoint_completion_target cycles' worth of WAL which was written while the checkpoint was in progress and is still needed for crash recovery.
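
As a rough sketch of the arithmetic, here is the formula quoted above split into the pieces just described.  The function and key names are made up for illustration and are not anything in the PostgreSQL source.  One "cycle" is taken to be checkpoint_segments segments, and the trailing "+ 1" is read as the current WAL file, as in the quoted message.

```python
def wal_breakdown(checkpoint_segments, checkpoint_completion_target):
    """Split the estimate into its pieces, measured in WAL segments.

    Assumption: one "cycle" of WAL equals checkpoint_segments segments.
    """
    return {
        # cycle that has just become eligible for recycling
        "just_eligible_for_recycling": float(checkpoint_segments),
        # cycle that is expendable but is kept anyway
        "expendable_but_kept": float(checkpoint_segments),
        # WAL written while the checkpoint itself was in progress
        "written_during_checkpoint": checkpoint_completion_target * checkpoint_segments,
        # the WAL file currently being written
        "current_segment": 1.0,
    }


def max_wal_segments(checkpoint_segments, checkpoint_completion_target):
    """The formula from the quoted message: (2 + target) * segments + 1."""
    return (2 + checkpoint_completion_target) * checkpoint_segments + 1


if __name__ == "__main__":
    segments = 10  # arbitrary example value, not a recommendation
    for target in (0.5, 0.9):
        parts = wal_breakdown(segments, target)
        print(target, parts, max_wal_segments(segments, target))
```

The pieces sum to the same value as the closed-form expression, which is where the "2 +" rather than "1 +" comes from under this reading.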

I don't really understand the point of this way of doing things.  I guess it is because the control file contains two redo pointers, one for the last checkpoint and one for the checkpoint before that, and if recovery finds that it can't use the most recent one, it tries the one before that.  Why?  Beats me.  If we are worried about the control file getting a corrupt redo pointer, it seems like we would record the last one twice, rather than recording two different ones once each.  And if the in-memory version got corrupted before being written to the file, I really doubt anything is going to save your bacon at that point.

I've never seen a case where recovery couldn't use the last recorded good checkpoint, fell back to the previous one instead, and succeeded with it.  But then again I haven't seen all possible crashes.

This is based on memory from the last time I looked into this.  I haven't re-verified it, so it could be wrong or obsolete.

Cheers,

Jeff
