Re: patch proposal - Mailing list pgsql-hackers

From Stephen Frost
Subject Re: patch proposal
Date
Msg-id 20160816140639.GM4028@tamriel.snowman.net
In response to Re: patch proposal  (Venkata B Nagothi <nag1010@gmail.com>)
Responses Re: patch proposal  (Venkata B Nagothi <nag1010@gmail.com>)
Re: patch proposal  (Michael Paquier <michael.paquier@gmail.com>)
Re: patch proposal  (Jaime Casanova <jaime.casanova@2ndquadrant.com>)
archive restore command failure status [was Re: patch proposal]  (Chapman Flack <chap@anastigmatix.net>)
List pgsql-hackers
Greetings,

* Venkata B Nagothi (nag1010@gmail.com) wrote:
> The above-mentioned parameters can be configured to pause, shut down or prevent
> promotion only after reaching the recovery target point.
> To clarify, I am referring to a scenario where the recovery target point is not
> reached at all (I mean, a half-complete or incomplete recovery) and there
> are lots of WALs still pending to be replayed - in this situation,

PG doesn't know that there are still WALs to be replayed.

> PostgreSQL just completes the archive recovery until the end of the last
> available WAL (WAL file "00000001000000000000001E" in my case) and
> starts up the cluster, generating an error message (saying
> "00000001000000000000001F" not found).

That's not a PG error, that's an error from cp.  From PG's perspective,
your restore command has said that all of the WAL has been replayed.

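If that's not what you want then change your restore command to return
an exit code > 125 when the segment is missing.  PG treats such a status
as a hard failure of the restore command and aborts recovery, instead of
taking the missing file to mean that it has reached the end of WAL.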
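
As a rough illustration of that suggestion, here is a minimal sketch of a
restore command wrapper. Everything in it is an assumption made for the
example, not something from this thread: the script name restore_wal.py,
the choice of 126 as the status, and the rule that only names shaped like
WAL segments (24 hex digits) are treated strictly. Other files PG may ask
for, such as timeline history files, still get an ordinary "not found"
status of 1.

```python
#!/usr/bin/env python3
"""Hypothetical restore_command wrapper (restore_wal.py), a sketch only."""
import os
import re
import shutil
import sys

# Any status above 125 would do; 126 is an arbitrary choice for this sketch.
HARD_FAILURE = 126
# Ordinary "file not available" status, as a failing cp would return.
SOFT_FAILURE = 1

# Assumption: WAL segment file names are exactly 24 uppercase hex digits.
SEGMENT_NAME = re.compile(r"[0-9A-F]{24}")


def restore_segment(src: str, dst: str) -> int:
    """Copy one archived file to dst and return the exit status to report."""
    if os.path.isfile(src):
        shutil.copyfile(src, dst)
        return 0
    print(f"restore_wal: {src} not found in archive", file=sys.stderr)
    if SEGMENT_NAME.fullmatch(os.path.basename(src)):
        # A missing WAL segment is reported as a hard failure.
        return HARD_FAILURE
    # Anything else (e.g. a .history file) is reported as simply missing.
    return SOFT_FAILURE


# With no arguments the module does nothing, so it can also be imported.
if __name__ == "__main__" and len(sys.argv) > 1:
    if len(sys.argv) != 3:
        print("usage: restore_wal.py <archived-file> <destination>",
              file=sys.stderr)
        sys.exit(HARD_FAILURE)
    sys.exit(restore_segment(sys.argv[1], sys.argv[2]))
```

It could be wired in with something like
restore_command = 'python3 /path/to/restore_wal.py /path/to/archive/%f "%p"'
(both paths are placeholders). Note the trade-off: with a wrapper like
this, running off the end of the archive stops recovery with an error
rather than starting the cluster, so it only suits cases where ending
short of the target must not go unnoticed.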

> It would be nice if PostgreSQL pauses the recovery in case it's not complete
> (because of missing or corrupt WAL), shuts down the cluster and allows the
> DBA to restart the replay of the remaining WAL archive files to continue
> recovery (from where it stopped previously) until the recovery target point
> is reached.

Reaching the end of WAL isn't an error and I don't believe it makes any
sense to treat it like it is.  You can specify any recovery target point
you wish, including ones that don't exist, and that's not an error
either.

I could see supporting an additional "pause" option that means "pause at
the end of WAL if you don't reach the recovery target point".  I'd also
be happy with a warning being emitted in the log if the recovery target
point isn't reached before reaching the end of WAL, but I don't think it
makes sense to change the existing behavior.

Thanks!

Stephen
