On Thu, 2009-02-05 at 21:54 +0200, Heikki Linnakangas wrote:
> - If bgwriter is performing a restartpoint when recovery ends, the
> startup checkpoint will be queued up behind the restartpoint. And since
> it uses the same smoothing logic as checkpoints, it can take quite some
> time for that to finish. The original patch had some code to hurry up
> the restartpoint by signaling the bgwriter if
> LWLockConditionalAcquire(CheckPointLock) fails, but there's a race
> condition with that if a restartpoint starts right after that check. We
> could let the bgwriter do the checkpoint too, and wait for it, but
> bgwriter might not be running yet, and we'd have to allow bgwriter to
> write WAL while disallowing it for all other processes, which seems
> quite complex. Seems like we need something like the
> LWLockConditionalAcquire approach, but built into CreateCheckPoint to
> eliminate the race condition.
Seems straightforward? Hold the lock longer.
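A rough sketch of one way to read that, not code from the patch: the
"restartpoint running" flag and the "hurry up" request are both read and
written under the same lock, so the check and the request cannot be
separated by a restartpoint starting in between. Everything named here
(RestartpointState, restartpoint_begin, restartpoint_end, request_hurry)
is hypothetical, and a pthread mutex stands in for the LWLock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical shared state; NOT the real PostgreSQL structures. */
typedef struct
{
    pthread_mutex_t lock;           /* stand-in for an LWLock */
    bool restartpoint_running;      /* set and cleared by the bgwriter */
    bool hurry_requested;           /* set by the startup process */
} RestartpointState;

void
restartpoint_state_init(RestartpointState *s)
{
    pthread_mutex_init(&s->lock, NULL);
    s->restartpoint_running = false;
    s->hurry_requested = false;
}

/*
 * bgwriter side: mark the restartpoint as started. Returns true if it
 * should skip the smoothing and run at full speed because the startup
 * process has already asked for that.
 */
bool
restartpoint_begin(RestartpointState *s)
{
    bool immediate;

    pthread_mutex_lock(&s->lock);
    s->restartpoint_running = true;
    immediate = s->hurry_requested;
    pthread_mutex_unlock(&s->lock);
    return immediate;
}

/* bgwriter side: mark the restartpoint as finished. */
void
restartpoint_end(RestartpointState *s)
{
    pthread_mutex_lock(&s->lock);
    assert(s->restartpoint_running);
    s->restartpoint_running = false;
    pthread_mutex_unlock(&s->lock);
}

/*
 * Startup side, at end of recovery: record the request and, while still
 * holding the lock, find out whether a restartpoint is in progress.
 * Returns true if the caller must also signal the bgwriter.
 */
bool
request_hurry(RestartpointState *s)
{
    bool must_signal;

    pthread_mutex_lock(&s->lock);
    s->hurry_requested = true;
    must_signal = s->restartpoint_running;
    pthread_mutex_unlock(&s->lock);
    return must_signal;
}
```

A restartpoint that starts after request_hurry() sees the flag in
restartpoint_begin(); one that started before it makes request_hurry()
return true, so the caller knows to signal. Either way there is no
window between the check and the request.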
> - If you perform a fast shutdown while startup process is waiting for
> the restore command, startup process sometimes throws a FATAL error
> which escalates into an immediate shutdown. That leads to
> different messages in the logs, and skipping of the shutdown
> restartpoint that we now otherwise perform.
Sometimes?
> - It's not clear to me if the rest of the xlog flushing related
> functions, XLogBackgroundFlush, XLogNeedsFlush and XLogAsyncCommitFlush,
> need to work during recovery, and what they should do.
XLogNeedsFlush should always return false when InRecoveryProcessingMode()
is true. The WAL is already in the WAL files, not in wal_buffers anymore.
XLogAsyncCommitFlush should contain Assert(!InRecoveryProcessingMode())
since it is called during a VACUUM FULL only.
XLogBackgroundFlush should never be called during recovery because the
WALWriter is never active in recovery. That should just be documented.
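As a minimal sketch of the first two behaviours, with simplified
stand-ins rather than the real xlog.c code: XLogPos, in_recovery,
flushed_upto and the *Sketch function names are all assumptions made
for illustration, and the real functions take XLogRecPtr arguments and
do far more work.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t XLogPos;           /* simplified stand-in for XLogRecPtr */

static bool in_recovery = false;    /* hypothetical recovery-mode flag */
static XLogPos flushed_upto = 0;    /* hypothetical "WAL flushed up to here" */

/* Stand-in for the patch's InRecoveryProcessingMode() test. */
static bool
InRecoveryProcessingMode(void)
{
    return in_recovery;
}

/*
 * During recovery the WAL being replayed is already in the WAL files,
 * so nothing can be waiting in wal_buffers: always answer "no".
 */
bool
XLogNeedsFlushSketch(XLogPos record)
{
    if (InRecoveryProcessingMode())
        return false;
    return record > flushed_upto;
}

/*
 * Only reachable from VACUUM FULL, which cannot run during recovery,
 * so being called in recovery is a programming error.
 */
void
XLogAsyncCommitFlushSketch(XLogPos async_commit_lsn)
{
    assert(!InRecoveryProcessingMode());
    if (async_commit_lsn > flushed_upto)
        flushed_upto = async_commit_lsn;    /* pretend we flushed */
}
```

XLogBackgroundFlush gets no code here on purpose: the suggestion is
only to document that it is never called during recovery.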
-- 
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support