Re: "PANIC: cannot make new WAL entries during recovery" in the wild - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: "PANIC: cannot make new WAL entries during recovery" in the wild
Msg-id 4A7C733E.2010501@enterprisedb.com
In response to Re: "PANIC: cannot make new WAL entries during recovery" in the wild  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: "PANIC: cannot make new WAL entries during recovery" in the wild  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Tom Lane wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
>> Today we got a report in the Spanish list about the message in $subject.
>> The server is 8.4 running on Windows.
> 
> I accidentally managed to reproduce this in HEAD just now, by kill -9'ing
> a backend that was in the midst of a COPY IN operation (I was trying to
> reproduce Neil Best's unrelated issue...)  The server log is

You're lucky. I once tried to trigger the rm_cleanup() code with
repeated "killall -9 postmaster", but failed. IIRC I just put an abort()
at the right place in the end.

> So that confirms my speculation that btree index cleanup is the source
> of the message.  We have two basic approaches to dealing with it:
> 
> 1. Decide that the check added to XLogInsert is wrong and take it out.
> 
> 2. Arrange for some sort of explicit state transition between the
> WAL-reading and cleanup phases of recovery, and make sure XLogInsert
> knows about it.

I'd suggest we temporarily allow XLog insertion by calling
LocalSetXLogInsertAllowed() just before the rm_cleanup() loop, and reset
it with "LocalXLogInsertAllowed = -1" just after the loop, as we do at
the startup checkpoint. The sanity check still feels very useful to me;
I'd hate to lose it.
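
To make the shape of that suggestion concrete, here is a minimal, self-contained sketch. It is a toy model, not the PostgreSQL source: only the names LocalSetXLogInsertAllowed, LocalXLogInsertAllowed, XLogInsert and rm_cleanup come from this thread. Everything else (in_recovery, mock_XLogInsert, MockRmgr, rmgr_table, run_cleanup_phase, and the assumed meaning of the flag values 0 and 1) is a hypothetical stand-in, and the mock returns false where the real check would PANIC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy model only: NOT the PostgreSQL source. */

static bool in_recovery = true;           /* hypothetical stand-in for the recovery state */
static int  LocalXLogInsertAllowed = -1;  /* assumed: -1 = fall back to recovery state, 0 = no, 1 = yes */
static int  wal_records_written = 0;      /* hypothetical counter for the demo */

/* Local override wins; otherwise inserts are allowed only outside recovery. */
bool XLogInsertAllowed(void)
{
    if (LocalXLogInsertAllowed >= 0)
        return LocalXLogInsertAllowed == 1;
    return !in_recovery;
}

void LocalSetXLogInsertAllowed(void)
{
    LocalXLogInsertAllowed = 1;
}

/* Mock of the sanity check: reports the failure and returns false
 * instead of raising a PANIC. */
bool mock_XLogInsert(const char *what)
{
    if (!XLogInsertAllowed())
    {
        fprintf(stderr, "cannot make new WAL entries during recovery (%s)\n", what);
        return false;
    }
    wal_records_written++;
    return true;
}

/* Hypothetical resource-manager table; rm_cleanup may be NULL. */
typedef struct
{
    const char *name;
    bool      (*rm_cleanup)(void);
} MockRmgr;

/* Stand-in for a btree cleanup step that needs to write WAL. */
static bool btree_cleanup(void)
{
    return mock_XLogInsert("btree cleanup");
}

static const MockRmgr rmgr_table[] = {
    {"heap", NULL},
    {"btree", btree_cleanup},
};

/* Allow inserts just before the rm_cleanup() loop, reset just after it. */
bool run_cleanup_phase(void)
{
    bool ok = true;

    LocalSetXLogInsertAllowed();
    for (size_t i = 0; i < sizeof rmgr_table / sizeof rmgr_table[0]; i++)
    {
        if (rmgr_table[i].rm_cleanup != NULL && !rmgr_table[i].rm_cleanup())
            ok = false;
    }
    LocalXLogInsertAllowed = -1;    /* back to "consult recovery state" */

    assert(LocalXLogInsertAllowed == -1);
    return ok;
}
```

In this model, calling mock_XLogInsert() before or after run_cleanup_phase() still fails while in_recovery is true, so the sanity check keeps its value outside the cleanup window.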

--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com

