Re: commit dfda6ebaec67 versus wal_keep_segments - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: commit dfda6ebaec67 versus wal_keep_segments
Date
Msg-id 515C718A.6060409@vmware.com
In response to Re: commit dfda6ebaec67 versus wal_keep_segments  (Jeff Janes <jeff.janes@gmail.com>)
Responses Re: commit dfda6ebaec67 versus wal_keep_segments
List pgsql-hackers
On 03.04.2013 18:58, Jeff Janes wrote:
> On Tue, Apr 2, 2013 at 10:08 PM, Jeff Janes<jeff.janes@gmail.com>  wrote:
>
>> This commit introduced a problem with wal_keep_segments:
>>
>> commit dfda6ebaec6763090fb78b458a979b558c50b39b
>
> The problem seems to be that the underflow warned about is happening,
> because the check to guard it was checking the wrong thing.  However, I
> don't really understand KeepLogSeg.  It seems like segno, and hence recptr,
> don't actually serve any purpose.

Hmm, the check is actually correct, but the assignment in the
else-branch isn't. The idea of KeepLogSeg is to calculate recptr -
wal_keep_segments, and assign that to *logSegNo. But only if *logSegNo
is not already less than the calculated value. Does the attached look
correct to you?
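
To spell out the intent described above, here is a minimal standalone sketch. It is not the attached patch and not the actual xlog.c code: the type SegNo and the function keep_log_seg_sketch are made-up names, and the segment containing recptr is passed in directly as current_seg rather than derived from an XLogRecPtr.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit WAL segment number. */
typedef uint64_t SegNo;

/*
 * Sketch of the intended logic: compute (current_seg - keep_segments),
 * clamped so it never goes below 1, and lower *log_seg_no to that value.
 * *log_seg_no is never raised, so a caller that already wants to keep
 * older segments is left alone.
 */
void
keep_log_seg_sketch(SegNo current_seg, SegNo keep_segments, SegNo *log_seg_no)
{
    SegNo keep_from;

    /* Nothing extra to retain. */
    if (keep_segments == 0)
        return;

    /* Avoid unsigned underflow: do not go below segment 1. */
    if (current_seg <= keep_segments)
        keep_from = 1;
    else
        keep_from = current_seg - keep_segments;

    /* Only move the cutoff backwards, never forwards. */
    if (keep_from < *log_seg_no)
        *log_seg_no = keep_from;
}
```

The point of the sketch is that both branches derive the result from the same segment number, so the guard and the subtraction always agree.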

> At some point when it is over-pruning and recycling, it recycles the log
> files that are still needed for recovery, and if the database crashes at
> that point it will not recover because it can't find either the primary
> or secondary checkpoint records.

So, KeepLogSeg incorrectly sets *logSegNo to 0, and CreateCheckPoint
decrements it, causing it to underflow to 2^64-1. Now RemoveOldXlogFiles
feels free to remove every WAL segment.
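
The wraparound is easy to reproduce in isolation. The sketch below assumes a simplified model in which segments are compared as plain 64-bit numbers; the helper names are hypothetical, and the real RemoveOldXlogFiles works on segment file names rather than integers.

```c
#include <stdint.h>

/*
 * Hypothetical helper mimicking a caller that decrements the segment
 * number to get the last segment that may be removed.  Unsigned
 * arithmetic wraps modulo 2^64, so an input of 0 yields 2^64-1.
 */
uint64_t
last_removable_segment(uint64_t log_seg_no)
{
    return log_seg_no - 1;
}

/*
 * Simplified removal test (an assumption for illustration): a segment
 * may be removed if its number is at or below the cutoff.
 */
int
segment_is_removable(uint64_t existing_seg, uint64_t cutoff)
{
    return existing_seg <= cutoff;
}
```

With a cutoff of 2^64-1, the test in segment_is_removable is true for every possible segment number, which matches the over-pruning Jeff saw.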

- Heikki

Attachment
