Re: Usability improvements for pg_stop_backup() - Mailing list pgsql-hackers

From Kevin Grittner
Subject Re: Usability improvements for pg_stop_backup()
Date
Msg-id 1407081347.79376.YahooMailNeo@web122303.mail.ne1.yahoo.com
In response to Usability improvements for pg_stop_backup()  (Josh Berkus <josh@agliodbs.com>)
List pgsql-hackers
Josh Berkus <josh@agliodbs.com> wrote:

> Currently, if archive_command is failing, pg_stop_backup() will hang
> forever.  The only way to figure out what's wrong with pg_stop_backup()
> is to tail the PostgreSQL logs.  This is difficult for users to
> troubleshoot, and strongly resists any kind of automation.

That is bad.

> Yes, we can work around this by setting statement_timeout, but that has
> two issues (a) the user has to remember to do it before the problem
> occurs, and (b) it won't differentiate between archive failure and other
> reasons it might time out.

Clearly not a long-term solution.
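
For anyone following along, the workaround looks roughly like the
sketch below. It assumes a Python DB-API style connection (psycopg2
would do); the helper name and the 60-second default are made up for
illustration. Note that the except branch has no way to tell why the
statement was cancelled, which is exactly issue (b).

```python
def stop_backup_with_timeout(conn, timeout_ms=60000):
    """Run pg_stop_backup() under a statement_timeout (hypothetical helper).

    `conn` is assumed to be a DB-API connection, e.g. one from psycopg2.
    Returns whatever pg_stop_backup() returns (the ending WAL location).
    """
    cur = conn.cursor()
    try:
        # An integer without units is interpreted as milliseconds.
        cur.execute("SET statement_timeout = %d" % int(timeout_ms))
        cur.execute("SELECT pg_stop_backup()")
        return cur.fetchone()[0]
    except Exception as exc:
        # With psycopg2 a timeout would surface as a query-cancelled error,
        # but nothing in it says whether a failing archive_command was the
        # cause or the call was simply slow.
        conn.rollback()
        raise RuntimeError("pg_stop_backup() did not complete: %s" % exc) from exc
    finally:
        cur.close()
```

The SET stays in effect for the rest of the session unless it is rolled
back, so a real script would want to reset it afterwards.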

> As such, I propose that pg_stop_backup() should error with an
> appropriate error message ("Could not archive WAL segments") after
> three archiving attempts.  We could also add an optional parameter to
> raise the number of attempts from the default of three.

That sounds sane to me.
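
To make sure we mean the same semantics, here is a small model of the
proposed behavior. It is only a sketch in Python, not backend code:
try_archive, wait_for_archive and ArchiveFailedError are hypothetical
names, and try_archive stands in for "one archiving attempt on the last
required segment".

```python
class ArchiveFailedError(Exception):
    """Raised when WAL segments could not be archived (illustrative only)."""


def wait_for_archive(try_archive, max_attempts=3):
    """Call try_archive() until it succeeds or max_attempts is used up.

    try_archive is a hypothetical callable that returns True once the
    last required WAL segment has been archived and False on a failed
    attempt. Returns the number of attempts that were needed.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        if try_archive():
            return attempt
    # Every attempt failed: report it instead of waiting forever.
    raise ArchiveFailedError(
        "Could not archive WAL segments after %d attempts" % max_attempts
    )
```

The optional parameter from the proposal maps to max_attempts here; a
caller who expects a flaky archive target could pass a larger value.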

> An alternative, if we were doing this from scratch, would be for
> pg_stop_backup to return false or -1 or something if it couldn't
> archive; there are reasons why a user might not care that
> archive_command was failing (shared storage comes to mind).  However,
> that would be a surprising break with backwards compatibility, since
> currently users don't check the result value of pg_stop_backup().

Some might, which is a stronger argument against changing what gets
returned.  Even in a green field though, I would argue that
pg_stop_backup() should return information about the minimum range
of WAL files needed to perform a consistent recovery -- or possibly
duplicate everything in the backup history file.  An error seems
much more appropriate to indicate that the user does not have a
valid backup.
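
As a rough illustration of that split (useful data in the return value,
failure as an error), something shaped like the sketch below is what I
have in mind. All names here are hypothetical and the field set is only
a guess at what "minimum range of WAL files" would look like.

```python
from dataclasses import dataclass


class InvalidBackupError(Exception):
    """Signals that the backup cannot be used for recovery (illustrative)."""


@dataclass(frozen=True)
class BackupResult:
    """Hypothetical return value: the WAL range needed for recovery."""
    first_wal_file: str
    last_wal_file: str


def finish_backup(first_wal_file, last_wal_file, archived_ok):
    """Return the required WAL range, or raise if archiving failed."""
    if not archived_ok:
        # No sentinel such as False or -1: the caller cannot ignore this.
        raise InvalidBackupError("Could not archive WAL segments")
    return BackupResult(first_wal_file, last_wal_file)
```

A script that never looks at the result still stops on the exception,
which is the point of preferring an error over a status value.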

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
