Re: hot standby startup, visibility map, clog - Mailing list pgsql-hackers

From Robert Haas
Subject Re: hot standby startup, visibility map, clog
Msg-id BANLkTimuRNqooyVn+UgXSe6gQcHaSJr19g@mail.gmail.com
In response to hot standby startup, visibility map, clog  (Daniel Farina <daniel@heroku.com>)
List pgsql-hackers
On Thu, Jun 9, 2011 at 5:14 AM, Daniel Farina <daniel@heroku.com> wrote:
> The first fact is that turning off hot standby will let the cluster
> start up, but only after seeing a spate of messages like these (a dozen
> or dozens, not thousands):
>
> 2011-06-09 08:02:32 UTC  LOG:  restored log file
> "000000020000002C000000C0" from archive
> 2011-06-09 08:02:33 UTC  WARNING:  xlog min recovery request
> 2C/C1F09658 is past current point 2C/C037B278
> 2011-06-09 08:02:33 UTC  CONTEXT:  writing block 0 of relation
> base/16385/16784_vm
>        xlog redo insert: rel 1663/16385/128029; tid 114321/63
> 2011-06-09 08:02:33 UTC  LOG:  restartpoint starting: xlog
>
> Most importantly, *all* such messages are in visibility map forks
> (_vm).  I am reasonably confident that my code does not start reading
> data until pg_start_backup() has returned, and blocks on
> pg_stop_backup() after having read all the data.  Also, the mailing
> list correspondence at
> http://archives.postgresql.org/pgsql-hackers/2010-11/msg02034.php
> suggests that the visibility map is not flushed at checkpoints, so
> perhaps with some poor timing an old page can wander onto disk even
> after a checkpoint barrier that pg_start_backup waits for. (I have not
> yet found the critical section that makes visibilitymap buffers immune
> to checkpoint though).

I don't believe there's anything that makes visibilitymap buffers
immune to checkpoint.  I might be wrong.  You could probably find out
by inserting a few lines of code into the buffer manager to inspect
each buffer tag as the buffer is written out and elog() if the fork
number is VISIBILITYMAP_FORKNUM.  There is some weirdness around how
the LSNs of visibility map pages are set, however.  See
visibilitymap_set() and visibilitymap_clear().  Clearing a visibility map bit is done
without touching the LSN at all; setting the bit advances the LSN to
that of the heap page, if it currently precedes that value.  It's not
obvious to me whether or how that's related.
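
To be concrete about the buffer-manager check I mentioned above, something
along these lines, dropped into the buffer write-out path (FlushBuffer() in
src/backend/storage/buffer/bufmgr.c is probably the spot, but the field names
below are from memory, so double-check them against your tree), is what I have
in mind:

    /*
     * Debugging sketch only, not a polished patch: log every visibility
     * map page that the buffer manager writes out.  "buf" is the buffer
     * descriptor being flushed; its tag carries the relfilenode, fork
     * number, and block number.
     */
    if (buf->tag.forkNum == VISIBILITYMAP_FORKNUM)
        elog(LOG, "writing out visibility map buffer: rel %u/%u/%u blk %u",
             buf->tag.rnode.spcNode,
             buf->tag.rnode.dbNode,
             buf->tag.rnode.relNode,
             buf->tag.blockNum);

If those messages show up during normal running, and not just at shutdown,
that would pretty much settle whether the visibility map is somehow exempt
from checkpoints.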

I'm a bit puzzled by the warning message itself, too.  I think that in
order to generate that warning, you need to try to flush a page to
disk whose LSN is higher than the current minimum recovery point.  And
the minimum recovery point is initially set to point to the redo
pointer from the last checkpoint.  But during an online backup, it's
expected that there may be pages on disk with LSNs higher than the
redo pointer from the last checkpoint.  So actually now I'm not sure
why people don't see this warning whenever a lengthy recovery is
required to reach consistency, especially with small values of
shared_buffers.  I'm probably missing something here...
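
If I recall correctly, the warning itself comes out of
UpdateMinRecoveryPoint() in src/backend/access/transam/xlog.c.  Roughly --
this is my paraphrase, and the variable names are mine rather than the actual
source -- the check looks like this, where "lsn" is the LSN of the page
somebody is trying to flush and "replayedUpTo" stands in for the replay
position kept in shared memory:

    /*
     * Paraphrase of the check that emits the warning Daniel is seeing:
     * if the page being flushed claims an LSN beyond anything recovery
     * has replayed so far, complain but keep going, since the LSN could
     * be bogus (e.g. read from a corrupted page).
     */
    if (!force && XLByteLT(replayedUpTo, lsn))
        elog(WARNING,
             "xlog min recovery request %X/%X is past current point %X/%X",
             lsn.xlogid, lsn.xrecoff,
             replayedUpTo.xlogid, replayedUpTo.xrecoff);

So, if I'm reading it right, the warning fires when a page's LSN is ahead of
everything recovery has replayed so far, which is part of what makes the
timing here interesting.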

One thing that might make it easier to understand what's going on here
is to crank the log level up to DEBUG2 and post the full output from a
recovery that exhibits this problem, rather than just a few lines.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

