On 13.01.2011 10:28, Fujii Masao wrote:
> When the master shuts down or crashes, there seems to be a
> case where walreceiver exits without flushing WAL that it has
> already written. This might lead the startup process to replay
> un-flushed WAL and break the Write-Ahead-Logging rule.
Hmm, that can happen at a crash even with no replication involved. If
you "kill -9 postmaster", and some WAL had been written but not fsync'd,
on crash recovery we will happily recover the unsynced WAL. We could
prevent that by fsyncing all WAL before applying it - presumably
fsyncing a file that has already been flushed is quick. But is it worth
the trouble?
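
To make the "fsync before applying" idea concrete, here is a minimal sketch using plain POSIX calls. The helper name sync_file_before_replay is hypothetical and is not an existing PostgreSQL function; real recovery code would go through the backend's own file-access layer and error reporting instead.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of PostgreSQL): force a WAL segment file
 * to stable storage before any record in it is replayed.
 * Returns 0 on success, -1 if the file cannot be opened, synced or closed.
 */
int
sync_file_before_replay(const char *path)
{
    int     fd;
    int     rc;

    assert(path != NULL);

    fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    /* Expected to be cheap if the kernel has no dirty pages for the file */
    rc = fsync(fd);

    if (close(fd) != 0 && rc == 0)
        rc = -1;

    return rc;
}
```

A caller would invoke this once per segment before reading records from it, and refuse to replay if it returns -1.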
> walreceiver.c
>>     /* Wait a while for data to arrive */
>>     if (walrcv_receive(NAPTIME_PER_CYCLE, &type, &buf, &len))
>>     {
>>         /* Accept the received data, and process it */
>>         XLogWalRcvProcessMsg(type, buf, len);
>>
>>         /* Receive any more data we can without sleeping */
>>         while (walrcv_receive(0, &type, &buf, &len))
>>             XLogWalRcvProcessMsg(type, buf, len);
>>
>>         /*
>>          * If we've written some records, flush them to disk and let the
>>          * startup process know about them.
>>          */
>>         XLogWalRcvFlush();
>>     }
>
> The problematic case happens when the latter walrcv_receive
> emits ERROR. In this case, the WAL received by the former
> walrcv_receive is not guaranteed to have been flushed yet.
>
> The attached patch ensures that all WAL received is flushed to
> disk before walreceiver exits. This patch should be backported
> to 9.0, I think.
Yeah, we probably should do that, even though it doesn't completely
close the window in which unsynced WAL can be replayed.
-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com