Thread: pgsql: rm_cleanup functions need to be allowed to write WAL entries.

pgsql: rm_cleanup functions need to be allowed to write WAL entries.

From
tgl@postgresql.org (Tom Lane)
Date:
Log Message:
-----------
rm_cleanup functions need to be allowed to write WAL entries.  This oversight
appears to explain the recent reports of "PANIC: cannot make new WAL entries
during recovery".

Modified Files:
--------------
    pgsql/src/backend/access/transam:
        xlog.c (r1.345 -> r1.346)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/access/transam/xlog.c?r1=1.345&r2=1.346)

Re: pgsql: rm_cleanup functions need to be allowed to write WAL entries.

From
Simon Riggs
Date:
On Fri, 2009-08-07 at 19:29 +0000, Tom Lane wrote:
> Log Message:
> -----------
> rm_cleanup functions need to be allowed to write WAL entries.  This oversight
> appears to explain the recent reports of "PANIC: cannot make new WAL entries
> during recovery".
>
> Modified Files:
> --------------
>     pgsql/src/backend/access/transam:
>         xlog.c (r1.345 -> r1.346)
>         (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/access/transam/xlog.c?r1=1.345&r2=1.346)

I was just working on a patch after the comments yesterday, hadn't
noticed you'd committed.

The first chunk is exactly as I was going to suggest. Make an explicit
state change in the startup process.

Resetting it back seems fragile, since in crash recovery we call it
again almost immediately during CreateCheckPoint(). That only works if
LocalSetXLogInsertAllowed() has no side effects. I understand Heikki's
wish to have safeguards in place, so we should document that
LocalSetXLogInsertAllowed() can be executed twice without problem.

--
 Simon Riggs           www.2ndQuadrant.com


Re: pgsql: rm_cleanup functions need to be allowed to write WAL entries.

From
Tom Lane
Date:
Simon Riggs <simon@2ndQuadrant.com> writes:
> Resetting it back seems fragile, since in crash recovery we call it
> again almost immediately during CreateCheckPoint(). That only works if
> LocalSetXLogInsertAllowed() has no side effects. I understand Heikki's
> wish to have safeguards in place, so we should document that
> LocalSetXLogInsertAllowed() can be executed twice without problem.

Done.

My initial thought had actually been to get rid of the use of
LocalSetXLogInsertAllowed inside CreateCheckPoint, since with this
patch we are calling it from the same bit of code that is about
to call CreateCheckPoint --- leaving the flag set throughout that
bit would be fine.  However that would only work as intended if
the checkpoint were done locally; if somehow we'd launched the
bgwriter and the checkpoint request got sent over there, it'd fail.
I don't believe this is currently possible during a crash recovery
scenario, but on the whole it seemed more fragile to do it that way
than in the code as-committed.  In principle, at least, it seems
possible that the rm_cleanup and checkpoint actions could be done
in different processes, and this setup preserves the freedom to
let that happen.

            regards, tom lane

Re: pgsql: rm_cleanup functions need to be allowed to write WAL entries.

From
Simon Riggs
Date:
On Sat, 2009-08-08 at 12:46 -0400, Tom Lane wrote:
> Simon Riggs <simon@2ndQuadrant.com> writes:
> > Resetting it back seems fragile, since in crash recovery we call it
> > again almost immediately during CreateCheckPoint(). That only works if
> > LocalSetXLogInsertAllowed() has no side effects. I understand Heikki's
> > wish to have safeguards in place, so we should document that
> > LocalSetXLogInsertAllowed() can be executed twice without problem.
>
> Done.
>
> My initial thought had actually been to get rid of the use of
> LocalSetXLogInsertAllowed inside CreateCheckPoint, since with this
> patch we are calling it from the same bit of code that is about
> to call CreateCheckPoint --- leaving the flag set throughout that
> bit would be fine.  However that would only work as intended if
> the checkpoint were done locally; if somehow we'd launched the
> bgwriter and the checkpoint request got sent over there, it'd fail.
> I don't believe this is currently possible during a crash recovery
> scenario, but on the whole it seemed more fragile to do it that way
> than in the code as-committed.

OK

>  In principle, at least, it seems
> possible that the rm_cleanup and checkpoint actions could be done
> in different processes, and this setup preserves the freedom to
> let that happen.

Good.

I want to move in the direction of having two cleanup routines, one
executed before recovery ends and one done afterwards, so it can write
WAL. Perhaps these would be called rm_makesafe() and rm_repair(). Rough
thinking at this stage.

The rm_repair() would execute in a separate process once we're up.

--
 Simon Riggs           www.2ndQuadrant.com


Re: pgsql: rm_cleanup functions need to be allowed to write WAL entries.

From
Tom Lane
Date:
Simon Riggs <simon@2ndQuadrant.com> writes:
> I want to move in the direction of having two cleanup routines, one
> executed before recovery ends and one done afterwards, so it can write
> WAL. Perhaps these would be called rm_makesafe() and rm_repair(). Rough
> thinking at this stage.

> The rm_repair() would execute in a separate process once we're up.

Er, what's the point of that?  It would make life tremendously harder
for resource managers, which could no longer rely on tracking their
state locally within the startup process.  And AFAICS there is no
benefit to be had, compared to the existing plan of letting backends
run while the startup process is still active.


            regards, tom lane

Re: pgsql: rm_cleanup functions need to be allowed to write WAL entries.

From
Simon Riggs
Date:
On Sun, 2009-08-09 at 11:41 -0400, Tom Lane wrote:
> Simon Riggs <simon@2ndQuadrant.com> writes:
> > I want to move in the direction of having two cleanup routines, one
> > executed before recovery ends and one done afterwards, so it can write
> > WAL. Perhaps these would be called rm_makesafe() and rm_repair(). Rough
> > thinking at this stage.
>
> > The rm_repair() would execute in a separate process once we're up.
>
> Er, what's the point of that?

Rebuilding damaged indexes automatically, rather than barfing. I regard
that as a long term extension of crash recovery to bring a database back
to a usable state.

--
 Simon Riggs           www.2ndQuadrant.com


Re: pgsql: rm_cleanup functions need to be allowed to write WAL entries.

From
Tom Lane
Date:
Simon Riggs <simon@2ndQuadrant.com> writes:
> On Sun, 2009-08-09 at 11:41 -0400, Tom Lane wrote:
>> Er, what's the point of that?

> Rebuilding damaged indexes automatically, rather than barfing. I regard
> that as a long term extension of crash recovery to bring a database back
> to a usable state.

Having crash recovery auto-rebuild indexes it thinks are damaged seems
to me to be a pretty horrid idea.  Just for starters, it would overwrite
forensic evidence about the cause of the damage.  A DBA might not wish
the rebuild to happen *right then* in any case.

            regards, tom lane

Re: pgsql: rm_cleanup functions need to be allowed to write WAL entries.

From
Bruce Momjian
Date:
Tom Lane wrote:
> Simon Riggs <simon@2ndQuadrant.com> writes:
> > On Sun, 2009-08-09 at 11:41 -0400, Tom Lane wrote:
> >> Er, what's the point of that?
>
> > Rebuilding damaged indexes automatically, rather than barfing. I regard
> > that as a long term extension of crash recovery to bring a database back
> > to a usable state.
>
> Having crash recovery auto-rebuild indexes it thinks are damaged seems
> to me to be a pretty horrid idea.  Just for starters, it would overwrite
> forensic evidence about the cause of the damage.  A DBA might not wish
> the rebuild to happen *right then* in any case.

Are hash indexes going to need auto-rebuild, or can we make them
WAL-safe eventually?

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Re: pgsql: rm_cleanup functions need to be allowed to write WAL entries.

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> Are hash indexes going to need auto-rebuild, or can we make them
> WAL-safe eventually?

I see no reason they couldn't be made WAL-safe.  Just no one has
bothered yet.

            regards, tom lane