Re: END_OF_RECOVERY shutdowns and ResetUnloggedRelations() - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: END_OF_RECOVERY shutdowns and ResetUnloggedRelations() |
Date | |
Msg-id | 20140925204118.GE16581@awork2.anarazel.de Whole thread Raw |
In response to | Re: END_OF_RECOVERY shutdowns and ResetUnloggedRelations() (Abhijit Menon-Sen <ams@2ndQuadrant.com>) |
Responses |
Re: END_OF_RECOVERY shutdowns and ResetUnloggedRelations()
Re: END_OF_RECOVERY shutdowns and ResetUnloggedRelations() |
List | pgsql-hackers |
Hi, On 2014-09-24 17:06:05 +0530, Abhijit Menon-Sen wrote: > Hi Andres, Robert. > > I've attached four patches here. Cool. Will review. > 1. Move the call to ResetUnloggedRelations(UNLOGGED_RELATION_INIT) to > earlier in StartupXLOG. > > 2. Inside that function, issue fsync()s for the main forks we create by > copying the _init fork. These two imo should definitely be backpatched. > 3. A small fixup to add a const to "typedef char *FileName", because the > earlier patch gave me warnings about discarding const-ness. This is > consistent with many other functions in fd.c that take const char *. I'm happy with consider that one for master (although I seem to recall having had a patch for it rejected?), but I don't think we want to do that in the backbranches. > 4. Issue an fsync() on the data directory at startup if we need to > perform crash recovery. I personally think this one should be backpatched too. Does anyone believe differently? > From d8726c06cdf11674661eac1d091cf7edd05c2a0c Mon Sep 17 00:00:00 2001 > From: Abhijit Menon-Sen <ams@2ndQuadrant.com> > Date: Wed, 24 Sep 2014 14:43:18 +0530 > Subject: Call ResetUnloggedRelations(UNLOGGED_RELATION_INIT) earlier > > We need to call this after recovery, but not after the END_OF_RECOVERY > checkpoint is written. If we crash after that checkpoint, for example > because of ENOSPC in PreallocXlogFiles, the checkpoint has already set > the ControlFile->state to DB_SHUTDOWNED, so we don't enter recovery > again at startup. > > Because we did call ResetUnloggedRelations(UNLOGGED_RELATION_CLEANUP) > in the first cleanup, this leaves the database with a bunch of _init > forks for unlogged relations, but no main forks, leading to scary > errors. > > See thread from 20140912112246.GA4984@alap3.anarazel.de for details. With a explanation. Shiny! > --- > src/backend/access/transam/xlog.c | 16 ++++++++-------- > 1 file changed, 8 insertions(+), 8 deletions(-) > > diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c > index 46eef5f..218f7fb 100644 > --- a/src/backend/access/transam/xlog.c > +++ b/src/backend/access/transam/xlog.c > @@ -6863,6 +6863,14 @@ StartupXLOG(void) > ShutdownWalRcv(); > > /* > + * Reset initial contents of unlogged relations. This has to be done > + * AFTER recovery is complete so that any unlogged relations created > + * during recovery also get picked up. > + */ > + if (InRecovery) > + ResetUnloggedRelations(UNLOGGED_RELATION_INIT); > + + * Also recovery shouldn't be regarded successful if the reset fails -* e.g. because of ENOSPC. > + > + /* > + * copy_file() above has already called pg_flush_data() on the > + * files it created. Now we need to fsync those files, because > + * a checkpoint won't do it for us while we're in recovery. > + */ + * Doing this in a separate pass is advantageous for performance reasons * because it allows the kernel to perform all the flushes at once. > + /* > + * If we need to perform crash recovery, we issue an fsync on the > + * data directory to try to ensure that any data written before the > + * crash are flushed to disk. Otherwise a power failure in the near > + * future might mean that earlier unflushed writes are lost, but the > + * more recent data written to disk from here on are persisted. > + */ > + > + if (ControlFile->state != DB_SHUTDOWNED && > + ControlFile->state != DB_SHUTDOWNED_IN_RECOVERY) > + fsync_fname(data_directory, true); > + > if (ControlFile->state == DB_SHUTDOWNED) > { > /* This is the expected case, so don't be chatty in standalone mode */ > -- Unless I miss something this isn't sufficient. We need to fsync the files in the data directory, not just the toplevel directory? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
pgsql-hackers by date: