Re: BUG #4879: bgwriter fails to fsync the file in recovery mode - Mailing list pgsql-bugs

From Simon Riggs
Subject Re: BUG #4879: bgwriter fails to fsync the file in recovery mode
Date
Msg-id 1245989692.4038.333.camel@ebony.2ndQuadrant
Whole thread Raw
In response to Re: BUG #4879: bgwriter fails to fsync the file in recovery mode  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: BUG #4879: bgwriter fails to fsync the file in recovery mode
List pgsql-bugs
On Thu, 2009-06-25 at 20:25 -0400, Tom Lane wrote:
> Simon Riggs <simon@2ndQuadrant.com> writes:
> > On Thu, 2009-06-25 at 19:16 -0400, Tom Lane wrote:
> >> Simon Riggs <simon@2ndQuadrant.com> writes:
> >>> On the patch, the kluge to set InRecovery is unnecessary and confusing.
> >>
> >> I'll look into a better way to do that tonight.  Did you have any
> >> specific improvement in mind?
>
> > Yes, all already mentioned on this thread.
>
> After looking at this a bit more, the "easy" solution mentioned upthread
> doesn't work.  We want the end-of-recovery checkpoint to be able to
> write WAL (obviously), but it *must not* try to call TruncateMultiXact
> or TruncateSUBTRANS, because those subsystems aren't initialized yet.
> The check in XLogFlush needs to behave as though InRecovery were true
> too.  So the idea of testing RecoveryInProgress() rather than InRecovery
> cannot handle all these cases.

Yes it can, but only if LocalRecoveryInProgress is set immediately
before XLogInsert() in CreateCheckpoint(). Then it is correct

> However, I still agree with your thought that having InRecovery true
> only in the process that's applying WAL records is probably the cleanest
> definition.  And it also seems to me that having crystal-clear
> definitions of these states is essential, because not being clear about
> them is exactly what got us into this mess.

Yes

> What I am thinking is that instead of the hack of clearing
> LocalRecoveryInProgress to allow the current process to write WAL,
> we should have a separate test function WALWriteAllowed() with a
> state variable LocalWALWriteAllowed, and the hack should set that
> state without playing any games with LocalRecoveryInProgress.  Then
> RecoveryInProgress() remains true during the end-of-recovery checkpoint
> and we can condition the TruncateMultiXact and TruncateSUBTRANS calls
> on that.  Meanwhile the various places that check RecoveryInProgress
> to decide if WAL writing is allowed should call WALWriteAllowed()
> instead.

No need.

> I have not yet gone through all the code sites to make sure this is a
> consistent way to do it, but we clearly need more states than we have
> now if we are to avoid weird overloadings of the state meanings.

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support

pgsql-bugs by date:

Previous
From: "Chu, William"
Date:
Subject: Re: BUG #4884: Misleading error message
Next
From: Simon Riggs
Date:
Subject: Re: BUG #4879: bgwriter fails to fsync the file in recovery mode