Re: warning message in standby - Mailing list pgsql-hackers

From	Heikki Linnakangas
Subject	Re: warning message in standby
Date	June 10, 2010 13:01:23
Msg-id	4C110C41.7030607@enterprisedb.com
In response to	Re: warning message in standby (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: warning message in standby Re: warning message in standby Re: warning message in standby
List	pgsql-hackers

On 10/06/10 17:38, Tom Lane wrote:
> Robert Haas<robertmhaas@gmail.com>  writes:
>> On Mon, Jun 7, 2010 at 9:21 AM, Fujii Masao<masao.fujii@gmail.com>  wrote:
>>> When an error is found in the WAL streamed from the master, a warning
>>> message is repeated forever in the standby, with no delay between
>>> repetitions. This consumes a lot of CPU and can interfere with
>>> read-only queries. To fix this problem, should we add a sleep into
>>> emode_for_corrupt_record() or somewhere? Or should we stop walreceiver
>>> and retry reading WAL from pg_xlog or the archive?
>
>> I ran into this problem at one point, too, but was in the middle of
>> trying to investigate a different bug and didn't have time to track
>> down what was causing it.
>
>> I think the basic question here is - if there's an error in the WAL,
>> how do we expect to EVER recover?  Even if we can read from the
>> archive or pg_xlog, presumably it's the same WAL - why should we be
>> any more successful the second time?
>
> What "warning message" are we talking about?  All the error cases I can
> think of in WAL-application are ERROR, or likely even PANIC.

We're talking about a corrupt record (incorrect CRC, incorrect backlink 
etc.), not errors within redo functions. During crash recovery, a 
corrupt record means you've reached end of WAL. In standby mode, when 
streaming WAL from master, that shouldn't happen, and it's not clear 
what to do if it does. PANIC is not a good idea, at least if the server
uses hot standby, because that only makes the situation worse from an
availability point of view. So we log the error as a WARNING, and keep
retrying. It's unlikely that the problem will just go away, but we keep 
retrying anyway in the hope that it does. However, it seems that we're 
too aggressive with the retries.
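
To illustrate what less aggressive retries could look like, here is a minimal sketch in C. Every name in it (next_retry_delay_ms, read_record_with_backoff, the callback types, the delay constants) is hypothetical and not part of PostgreSQL. The capped exponential backoff is only one illustrative choice; the thread itself only suggests adding a sleep somewhere.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names and values throughout; not PostgreSQL's actual API. */
#define RETRY_DELAY_MIN_MS 100   /* assumed delay after the first failure */
#define RETRY_DELAY_MAX_MS 5000  /* assumed upper bound on the delay */

/*
 * Delay to wait after `failures` earlier failed attempts:
 * starts at the minimum, doubles per failure, and is capped at the maximum.
 */
int
next_retry_delay_ms(int failures)
{
    long delay = RETRY_DELAY_MIN_MS;

    assert(failures >= 0);
    for (int i = 0; i < failures && delay < RETRY_DELAY_MAX_MS; i++)
        delay *= 2;
    return (int) (delay > RETRY_DELAY_MAX_MS ? RETRY_DELAY_MAX_MS : delay);
}

/* Caller-supplied hooks, so the sketch stays independent of any real I/O. */
typedef bool (*read_record_fn) (void *ctx);   /* true if a valid record was read */
typedef void (*sleep_fn) (int ms, void *ctx); /* waits for the given time */

/*
 * Keep retrying a corrupt-record read, but sleep between attempts instead
 * of spinning. Returns true on success, false after max_attempts failures.
 */
bool
read_record_with_backoff(read_record_fn read_record, sleep_fn do_sleep,
                         void *ctx, int max_attempts)
{
    for (int failures = 0; failures < max_attempts; failures++)
    {
        if (read_record(ctx))
            return true;
        /* A real server would log the WARNING here, once per attempt. */
        do_sleep(next_retry_delay_ms(failures), ctx);
    }
    return false;
}
```

With a cap on the delay, the standby still notices fairly quickly if the problem does go away, while the warning is emitted at most once per sleep interval instead of in a tight loop.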

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com