Thread: Remove Deprecated Exclusive Backup Mode

Remove Deprecated Exclusive Backup Mode

From
David Steele
Date:
Hackers,

I propose we remove the deprecated exclusive backup mode of 
pg_start_backup()/pg_stop_backup() for Postgres 12.

The exclusive backup mode has a serious known issue.  If Postgres 
terminates ungracefully during a backup (due to hardware, kernel, 
Postgres issues, etc.) then Postgres may refuse to restart.

The reason is that the backup_label file will usually reference a 
checkpoint LSN that is older than the WAL available in pg_wal.  Postgres 
does emit a helpful error message while PANIC'ing but that's cold 
comfort to an admin who must manually intervene to get their cluster 
operational again.

The deprecated exclusive mode promises to make a difficult problem 
simple but doesn't live up to that promise. That's why it was replaced 
externally in 9.6 and why pg_basebackup has not used exclusive backups 
since it was introduced in 9.2.

Non-exclusive backups have been available since 9.6 and several 
third-party solutions support this mode, in addition to pg_basebackup.

The recently introduced recovery changes will break current automated 
solutions so this seems like a good opportunity to make improvements on 
the backup side as well.

I'll submit a patch for the 2019-01 commitfest.

Regards,
-- 
-David
david@pgmasters.net


Re: Remove Deprecated Exclusive Backup Mode

From
Michael Paquier
Date:
On Mon, Nov 26, 2018 at 10:13:34PM -0500, David Steele wrote:
> Non-exclusive backups have been available since 9.6 and several third-party
> solutions support this mode, in addition to pg_basebackup.

I think that two releases is not actually that much time to just nuke
the exclusive backup interface.  I would be fine if the docs show the
deprecation more aggressively using a warning section, and we could add
an explicit WARNING message about the matter directly when calling
pg_start_backup and pg_stop_backup.

My 2c.
--
Michael

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
Andres Freund
Date:
Hi,

On 2018-11-27 12:20:13 +0900, Michael Paquier wrote:
> On Mon, Nov 26, 2018 at 10:13:34PM -0500, David Steele wrote:
> > Non-exclusive backups have been available since 9.6 and several third-party
> > solutions support this mode, in addition to pg_basebackup.
> 
> I think that two releases is not actually that much time to just nuke
> the exclusive backup interface.  I would be fine if the docs show the
> deprecation more aggressively using a warning section, and we could add
> an explicit WARNING message about the matter directly when calling
> pg_start_backup and pg_stop_backup.

That was my gut reaction as well, but I think David's argument about
existing scripts / workflows being broken due the the recovery.conf
is a good reason to be more aggressive here.

Greetings,

Andres Freund


Re: Remove Deprecated Exclusive Backup Mode

From
Robert Haas
Date:
On Mon, Nov 26, 2018 at 10:13 PM David Steele <david@pgmasters.net> wrote:
> I propose we remove the deprecated exclusive backup mode of
> pg_start_backup()/pg_stop_backup() for Postgres 12.

-1. I don't have a problem with deprecating exclusive backup mode
eventually, but I don't see any good reason to do it this soon.

It's not like the problems with exclusive backup are so serious that
you can't work around them.  If you know which machine is your master,
you can arrange to remove backup_label on reboot (only) on the master
(only). Sure, a lot of people probably do this wrong, but it's
infeasible to disallow all the things that people might use
incorrectly without making the system useless.

There must be hundreds or thousands of backup scripts written by
individual users that still use exclusive mode floating around out
there.  Forcing all of those to be updated or scrapped will annoy
users to no benefit.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Remove Deprecated Exclusive Backup Mode

From
Magnus Hagander
Date:
On Tue, Nov 27, 2018 at 4:46 AM Andres Freund <andres@anarazel.de> wrote:
Hi,

On 2018-11-27 12:20:13 +0900, Michael Paquier wrote:
> On Mon, Nov 26, 2018 at 10:13:34PM -0500, David Steele wrote:
> > Non-exclusive backups have been available since 9.6 and several third-party
> > solutions support this mode, in addition to pg_basebackup.
>
> I think that two releases is not actually that much time to just nuke
> the exclusive backup interface.  I would be fine if the docs show the
> deprecation more aggressively using a warning section, and we could add
> an explicit WARNING message about the matter directly when calling
> pg_start_backup and pg_stop_backup.

That was my gut reaction as well, but I think David's argument about
existing scripts / workflows being broken due the the recovery.conf
is a good reason to be more aggressive here.


Yeah, I'm in the same boat here -- it feels like it's a bit too short, but since we're breaking things for people *anyway*, it's probably better to break both at once than to have all those people have their things broken multiple times.

--

Re: Remove Deprecated Exclusive Backup Mode

From
David Steele
Date:
On 11/26/18 11:04 PM, Robert Haas wrote:
> On Mon, Nov 26, 2018 at 10:13 PM David Steele <david@pgmasters.net> wrote:
>> I propose we remove the deprecated exclusive backup mode of
>> pg_start_backup()/pg_stop_backup() for Postgres 12.
> 
> -1. I don't have a problem with deprecating exclusive backup mode
> eventually, but I don't see any good reason to do it this soon.
> 
> It's not like the problems with exclusive backup are so serious that
> you can't work around them.  If you know which machine is your master,
> you can arrange to remove backup_label on reboot (only) on the master
> (only). 

I do not recommend that users write automated scripts to delete files
out of $PGDATA.  Accidents happen that way.

If this is our recommendation on how the mitigate the problem with
exclusive backup then I think it proves the point that it is broken and
should be removed.

> Sure, a lot of people probably do this wrong, but it's
> infeasible to disallow all the things that people might use
> incorrectly without making the system useless.

But we have already fixed this problem -- it only remains to remove the
old method.

> There must be hundreds or thousands of backup scripts written by
> individual users that still use exclusive mode floating around out
> there.  Forcing all of those to be updated or scrapped will annoy
> users to no benefit.

But we are OK with forcing them to scrap their recovery scripts?

And there is a benefit -- Postgres will be able to restart if it crashes
during a backup.

Regards,
-- 
-David
david@pgmasters.net


Re: Remove Deprecated Exclusive Backup Mode

From
Peter Eisentraut
Date:
On 27/11/2018 04:46, Andres Freund wrote:
> That was my gut reaction as well, but I think David's argument about
> existing scripts / workflows being broken due the the recovery.conf
> is a good reason to be more aggressive here.

But backup scripts are not affected by the recovery.conf changes.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Remove Deprecated Exclusive Backup Mode

From
Michael Paquier
Date:
On Tue, Nov 27, 2018 at 02:21:49PM +0100, Peter Eisentraut wrote:
> On 27/11/2018 04:46, Andres Freund wrote:
>> That was my gut reaction as well, but I think David's argument about
>> existing scripts / workflows being broken due the the recovery.conf
>> is a good reason to be more aggressive here.
>
> But backup scripts are not affected by the recovery.conf changes.

In any of my own backup scripts (yeah!), I don't have any dependency to
that either.  Or perhaps pgBackRest has a dependency in this area?
--
Michael

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
Simon Riggs
Date:
On Tue, 27 Nov 2018 at 03:13, David Steele <david@pgmasters.net> wrote:
 
The deprecated exclusive mode promises to make a difficult problem
simple but doesn't live up to that promise. That's why it was replaced
externally in 9.6 and why pg_basebackup has not used exclusive backups
since it was introduced in 9.2.

Non-exclusive backups have been available since 9.6 and several
third-party solutions support this mode, in addition to pg_basebackup.

The recently introduced recovery changes will break current automated
solutions so this seems like a good opportunity to make improvements on
the backup side as well.

I'll submit a patch for the 2019-01 commitfest.

-1 for removal, in this release

It's not there because anyone likes it, it's there because removing it is a risk

Recent changes are the reason to keep it, not a reason to remove.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Remove Deprecated Exclusive Backup Mode

From
Stephen Frost
Date:
Greetings,

* Michael Paquier (michael@paquier.xyz) wrote:
> On Tue, Nov 27, 2018 at 02:21:49PM +0100, Peter Eisentraut wrote:
> > On 27/11/2018 04:46, Andres Freund wrote:
> >> That was my gut reaction as well, but I think David's argument about
> >> existing scripts / workflows being broken due the the recovery.conf
> >> is a good reason to be more aggressive here.
> >
> > But backup scripts are not affected by the recovery.conf changes.
>
> In any of my own backup scripts (yeah!), I don't have any dependency to
> that either.  Or perhaps pgBackRest has a dependency in this area?

If you don't consider your recovery scripts and your backup scripts to
be related then I've really got to wonder how you're regularly testing
your backups to make sure that they're actually valid.

If you aren't regularly testing your backups then I've got little
sympathy.

To be clear, pgbackrest doesn't have any dependency here- but it, like
all of the other 3rd party backup solutions and any restore solution
that a user has come up with, are going to have to be changed to deal
with the changes in how recovery works, so this is a good time to make
these changes.

Thanks!

Stephen

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
Stephen Frost
Date:
Greetings,

* Robert Haas (robertmhaas@gmail.com) wrote:
> There must be hundreds or thousands of backup scripts written by
> individual users that still use exclusive mode floating around out
> there.  Forcing all of those to be updated or scrapped will annoy
> users to no benefit.

How about automated recovery scripts?

We've decided it's fine to break all of those which exist out there.

I'm concerned, seriously, that people don't have anywhere near the
concern about the recovery side of things as they do about the backup
side of things and that's really concerning.

Thanks!

Stephen

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
Peter Eisentraut
Date:
On 27/11/2018 15:45, Stephen Frost wrote:
>>> But backup scripts are not affected by the recovery.conf changes.
>> In any of my own backup scripts (yeah!), I don't have any dependency to
>> that either.  Or perhaps pgBackRest has a dependency in this area?
> If you don't consider your recovery scripts and your backup scripts to
> be related then I've really got to wonder how you're regularly testing
> your backups to make sure that they're actually valid.

The sort of installations that continue to use the exclusive backup mode
probably have the following tooling: a 20-line shell script to make the
backup and either a 10-line shell script or a similarly sized README or
wiki page to do the recovery.  Changing the latter for the recovery.conf
changes is probably a 3-line change.  Changing the former for the
removal of exclusive backups would require major changes.  (Try writing
a shell script that keeps a psql session open while it takes the backup
from the file system.  It's possible, but it requires significantly more
logic.)

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Remove Deprecated Exclusive Backup Mode

From
Stephen Frost
Date:
Greetings,

* Peter Eisentraut (peter.eisentraut@2ndquadrant.com) wrote:
> On 27/11/2018 15:45, Stephen Frost wrote:
> >>> But backup scripts are not affected by the recovery.conf changes.
> >> In any of my own backup scripts (yeah!), I don't have any dependency to
> >> that either.  Or perhaps pgBackRest has a dependency in this area?
> > If you don't consider your recovery scripts and your backup scripts to
> > be related then I've really got to wonder how you're regularly testing
> > your backups to make sure that they're actually valid.
>
> The sort of installations that continue to use the exclusive backup mode
> probably have the following tooling: a 20-line shell script to make the
> backup and either a 10-line shell script or a similarly sized README or
> wiki page to do the recovery.  Changing the latter for the recovery.conf
> changes is probably a 3-line change.  Changing the former for the
> removal of exclusive backups would require major changes.  (Try writing
> a shell script that keeps a psql session open while it takes the backup
> from the file system.  It's possible, but it requires significantly more
> logic.)

They're also the sort of installations which don't have reliable backups
and don't have any clue about the danger they are in due to the current
bug/issue/whatever we have with exclusive backups.

No, I don't agree that it's sensible to continue to march on as if
nothing is wrong.

Thanks!

Stephen

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
Andreas Karlsson
Date:
On 11/27/18 3:46 PM, Stephen Frost wrote:
> I'm concerned, seriously, that people don't have anywhere near the
> concern about the recovery side of things as they do about the backup
> side of things and that's really concerning.

I agree with your larger point, but in this case the two breakages do 
not seem equal. As far as I gather the removal of recovery.conf will in 
practice result in a longer downtime when your recovery scripts breaks 
but not any data loss. While if we remove the exclusive backup mode then 
some people's backup scripts will break and if they do not properly 
monitor their backups they risk data loss.

I think we should use more caution when data loss is at stake rather 
than "just" downtime. So personally I am in favor of updating the manual 
with warnings (right now it does not even say if exclusive or 
non-exclusive is the default) and adding a deprecation warning when 
people use the exclusive mode.

Best regards,
Andreas


Re: Remove Deprecated Exclusive Backup Mode

From
Stephen Frost
Date:
Greetings,

* Andreas Karlsson (andreas@proxel.se) wrote:
> On 11/27/18 3:46 PM, Stephen Frost wrote:
> >I'm concerned, seriously, that people don't have anywhere near the
> >concern about the recovery side of things as they do about the backup
> >side of things and that's really concerning.
>
> I agree with your larger point, but in this case the two breakages do not
> seem equal. As far as I gather the removal of recovery.conf will in practice
> result in a longer downtime when your recovery scripts breaks but not any
> data loss. While if we remove the exclusive backup mode then some people's
> backup scripts will break and if they do not properly monitor their backups
> they risk data loss.

Let's please try to remember that this is across a major version upgrade
and is removing a broken capability that's already been deprecated for a
couple years.

If they aren't monitoring their backup scripts today, and aren't
regularly performing restores of those backups, they're already risking
data loss.

> I think we should use more caution when data loss is at stake rather than
> "just" downtime. So personally I am in favor of updating the manual with
> warnings (right now it does not even say if exclusive or non-exclusive is
> the default) and adding a deprecation warning when people use the exclusive
> mode.

Waiting isn't going to change any of these factors, nor will throwing
warnings about the exclusive mode if people aren't monitoring their
scripts.

If we really want to keep the exclusive backup mode around, then, as
Magnus said on a nearby thread, it needs to be fixed.  If it's fixed and
just works and everyone's scripts work, then great, then we can
un-deprecate it and move on.

If we aren't going to fix it then let's remove it.

Thanks!

Stephen

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
Andreas Karlsson
Date:
On 11/27/18 4:11 PM, Stephen Frost wrote:
>> I agree with your larger point, but in this case the two breakages do not
>> seem equal. As far as I gather the removal of recovery.conf will in practice
>> result in a longer downtime when your recovery scripts breaks but not any
>> data loss. While if we remove the exclusive backup mode then some people's
>> backup scripts will break and if they do not properly monitor their backups
>> they risk data loss.
> 
> Let's please try to remember that this is across a major version upgrade
> and is removing a broken capability that's already been deprecated for a
> couple years.

I know that and you know that, but even our own manual use the exclusive 
mode in some of the examples, e.g: "pg_start_backup('label_goes_here')"[1].

Your point about that people who do not monitor their backups wont 
notice warnings is fair, but even so I think we should give people more 
time to migrate when even our own documentation isn't always clear about 
the exclusive mode being deprecated.

1. 
https://www.postgresql.org/docs/11/functions-admin.html#FUNCTIONS-ADMIN-BACKUP

Andreas


Re: Remove Deprecated Exclusive Backup Mode

From
Peter Eisentraut
Date:
On 27/11/2018 16:02, Stephen Frost wrote:
> They're also the sort of installations which don't have reliable backups
> and don't have any clue about the danger they are in due to the current
> bug/issue/whatever we have with exclusive backups.
> 
> No, I don't agree that it's sensible to continue to march on as if
> nothing is wrong.

It's fair to argue that this facility is broken and needs to be removed.
 But that needs to be its own argument.

The argument in this subthread is that since we're already making
changes in an adjacent area, removing this feature will have a low impact.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Remove Deprecated Exclusive Backup Mode

From
David Steele
Date:
On 11/27/18 9:54 AM, Peter Eisentraut wrote:
> On 27/11/2018 15:45, Stephen Frost wrote:
>>>> But backup scripts are not affected by the recovery.conf changes.
>>> In any of my own backup scripts (yeah!), I don't have any dependency to
>>> that either.  Or perhaps pgBackRest has a dependency in this area?
>> If you don't consider your recovery scripts and your backup scripts to
>> be related then I've really got to wonder how you're regularly testing
>> your backups to make sure that they're actually valid.
> 
> The sort of installations that continue to use the exclusive backup mode
> probably have the following tooling: a 20-line shell script to make the
> backup and either a 10-line shell script or a similarly sized README or
> wiki page to do the recovery.  Changing the latter for the recovery.conf
> changes is probably a 3-line change.  

Really?  We have gone from a file that can be safely overwritten
(recovery.conf) to a file that might need to be parsed to figure out if
the settings already exists (postgresql.auto.conf).  Of course, you can
just continue to append to the file but that seems pretty grotty to me.

This will also require changes to all HA solutions or backup solutions
that deal with recovery.  I think you underestimate how big a change
this is.  I've been thinking about how to write code for it and it is
significantly more complex -- also more flexible so I'm happy about that.

> Changing the former for the
> removal of exclusive backups would require major changes.  (Try writing
> a shell script that keeps a psql session open while it takes the backup
> from the file system.  It's possible, but it requires significantly more
> logic.)

We provide pg_basebackup with core and it is a solid tool for doing
backups.  Do we really want to encourage the use of bash scripts to do
what we already have a tool for?  If users want to do something more
complex than pg_basebackup then bash is certainly not the tool for that.

Regards,
-- 
-David
david@pgmasters.net


Re: Remove Deprecated Exclusive Backup Mode

From
David Steele
Date:
On 11/27/18 10:29 AM, Peter Eisentraut wrote:
> On 27/11/2018 16:02, Stephen Frost wrote:
>> They're also the sort of installations which don't have reliable backups
>> and don't have any clue about the danger they are in due to the current
>> bug/issue/whatever we have with exclusive backups.
>>
>> No, I don't agree that it's sensible to continue to march on as if
>> nothing is wrong.
> 
> It's fair to argue that this facility is broken and needs to be removed.
>  But that needs to be its own argument.

I believe I made this argument in the OP.

> The argument in this subthread is that since we're already making
> changes in an adjacent area, removing this feature will have a low impact.

Fair enough, but I think the argument against exclusive backups stands
on its own.  Also, I don't see backup and restore as adjacent.  I see
them as thoroughly intertwined.

-- 
-David
david@pgmasters.net


Re: Remove Deprecated Exclusive Backup Mode

From
David Steele
Date:
On 11/27/18 8:56 AM, Simon Riggs wrote:
> On Tue, 27 Nov 2018 at 03:13, David Steele <david@pgmasters.net
> <mailto:david@pgmasters.net>> wrote:
>  
> 
>     The deprecated exclusive mode promises to make a difficult problem
>     simple but doesn't live up to that promise. That's why it was replaced
>     externally in 9.6 and why pg_basebackup has not used exclusive backups
>     since it was introduced in 9.2.
> 
>     Non-exclusive backups have been available since 9.6 and several
>     third-party solutions support this mode, in addition to pg_basebackup.
> 
>     The recently introduced recovery changes will break current automated
>     solutions so this seems like a good opportunity to make improvements on
>     the backup side as well.
> 
>     I'll submit a patch for the 2019-01 commitfest.
> 
> 
> -1 for removal, in this release
> 
> It's not there because anyone likes it, it's there because removing it
> is a risk
> 
> Recent changes are the reason to keep it, not a reason to remove.

How so?

-- 
-David
david@pgmasters.net


Re: Remove Deprecated Exclusive Backup Mode

From
Stephen Frost
Date:
Greetings,

* David Steele (david@pgmasters.net) wrote:
> On 11/27/18 10:29 AM, Peter Eisentraut wrote:
> > On 27/11/2018 16:02, Stephen Frost wrote:
> >> They're also the sort of installations which don't have reliable backups
> >> and don't have any clue about the danger they are in due to the current
> >> bug/issue/whatever we have with exclusive backups.
> >>
> >> No, I don't agree that it's sensible to continue to march on as if
> >> nothing is wrong.
> >
> > It's fair to argue that this facility is broken and needs to be removed.
> >  But that needs to be its own argument.

We made that argument, and had agreement on it, and that's why it's
deprecated today.  The only question here is how long we keep this
deprecated and broken capability around and given that we're completely
breaking everything around recovery, now is as good a time as any since
users should be looking at how they do backup and restore when they
migrate to v12 due to these changes.

> > The argument in this subthread is that since we're already making
> > changes in an adjacent area, removing this feature will have a low impact.
>
> Fair enough, but I think the argument against exclusive backups stands
> on its own.  Also, I don't see backup and restore as adjacent.  I see
> them as thoroughly intertwined.

Agreed.

Thanks!

Stephen

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
Simon Riggs
Date:
On Tue, 27 Nov 2018 at 14:45, Stephen Frost <sfrost@snowman.net> wrote:
Greetings,

* Michael Paquier (michael@paquier.xyz) wrote:
> On Tue, Nov 27, 2018 at 02:21:49PM +0100, Peter Eisentraut wrote:
> > On 27/11/2018 04:46, Andres Freund wrote:
> >> That was my gut reaction as well, but I think David's argument about
> >> existing scripts / workflows being broken due the the recovery.conf
> >> is a good reason to be more aggressive here.
> >
> > But backup scripts are not affected by the recovery.conf changes.
>
> In any of my own backup scripts (yeah!), I don't have any dependency to
> that either.  Or perhaps pgBackRest has a dependency in this area?

If you don't consider your recovery scripts and your backup scripts to
be related then I've really got to wonder how you're regularly testing
your backups to make sure that they're actually valid.

If you aren't regularly testing your backups then I've got little
sympathy.

To be clear, pgbackrest doesn't have any dependency here- but it, like
all of the other 3rd party backup solutions and any restore solution
that a user has come up with, are going to have to be changed to deal
with the changes in how recovery works, so this is a good time to make
these changes.

If those tools are updated and the changes all work, that will be good.

That isn't the same thing as an argument to remove things in this release.

I propose waiting until next release. 

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Remove Deprecated Exclusive Backup Mode

From
Stephen Frost
Date:
Greetings,

* Andreas Karlsson (andreas@proxel.se) wrote:
> On 11/27/18 4:11 PM, Stephen Frost wrote:
> >>I agree with your larger point, but in this case the two breakages do not
> >>seem equal. As far as I gather the removal of recovery.conf will in practice
> >>result in a longer downtime when your recovery scripts breaks but not any
> >>data loss. While if we remove the exclusive backup mode then some people's
> >>backup scripts will break and if they do not properly monitor their backups
> >>they risk data loss.
> >
> >Let's please try to remember that this is across a major version upgrade
> >and is removing a broken capability that's already been deprecated for a
> >couple years.
>
> I know that and you know that, but even our own manual use the exclusive
> mode in some of the examples, e.g: "pg_start_backup('label_goes_here')"[1].

Thankfully, that'll get nicely cleaned up by removing the exclusive
backup mode.

> Your point about that people who do not monitor their backups wont notice
> warnings is fair, but even so I think we should give people more time to
> migrate when even our own documentation isn't always clear about the
> exclusive mode being deprecated.

I don't buy off on this argument- we also have things like this:

https://www.postgresql.org/docs/11/backup-file.html

Where we show a simple tar command as a way to backup the database, but
then we caveat it.  Similairly, we make it clear that users shouldn't be
using the exclusive mode backups:

https://www.postgresql.org/docs/11/continuous-archiving.html#BACKUP-LOWLEVEL-BASE-BACKUP

If users are only reading the system functions description and thinking
that's enough to figure out how to do backups, and didn't read the
chapters of documentation we have about how to perform a backup, then
chances are very good their existing backup systems are broken, and
that's all the more reason to remove the exclusive mode because at least
then it won't look like things are working when they aren't.

Thanks!

Stephen

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
Andres Freund
Date:
Hi,

On 2018-11-26 23:04:35 -0500, Robert Haas wrote:
> It's not like the problems with exclusive backup are so serious that
> you can't work around them.  If you know which machine is your master,
> you can arrange to remove backup_label on reboot (only) on the master
> (only). Sure, a lot of people probably do this wrong, but it's
> infeasible to disallow all the things that people might use
> incorrectly without making the system useless.

That doesn't protect against postgres, not machine, [crash-]restarts
though.

Greetings,

Andres Freund


Re: Remove Deprecated Exclusive Backup Mode

From
Michael Paquier
Date:
On Tue, Nov 27, 2018 at 09:45:04AM -0500, Stephen Frost wrote:
> If you don't consider your recovery scripts and your backup scripts to
> be related then I've really got to wonder how you're regularly testing
> your backups to make sure that they're actually valid.

Base backups can be perfectly self-contained as long as they include all
the WAL segments needed to recover up to the end-of-backup record.
That's what pg_basebackup does with its default options
(--wal-method=stream in particular).

> If you aren't regularly testing your backups then I've got little
> sympathy.

Fortunately they do, hundreds of time on a daily basis ;)

> To be clear, pgbackrest doesn't have any dependency here- but it, like
> all of the other 3rd party backup solutions and any restore solution
> that a user has come up with, are going to have to be changed to deal
> with the changes in how recovery works, so this is a good time to make
> these changes.

My point is that base backups do not have a mandatory dependency with
recovery.conf all the time as they can perfectly be restored if they are
standalone backups.  I can see a dependency with recovery.conf once you
have a base backup which needs to be fed with WAL segments from an
external archive, or when using a base backup to create a standby.
--
Michael

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
David Steele
Date:
On 11/27/18 4:25 PM, Michael Paquier wrote:
> On Tue, Nov 27, 2018 at 09:45:04AM -0500, Stephen Frost wrote:
>> If you don't consider your recovery scripts and your backup scripts to
>> be related then I've really got to wonder how you're regularly testing
>> your backups to make sure that they're actually valid.
>
> Base backups can be perfectly self-contained as long as they include all
> the WAL segments needed to recover up to the end-of-backup record.
> That's what pg_basebackup does with its default options
> (--wal-method=stream in particular).

This is true, of course, but it misses one of the major benefits of
file-level backups which is PITR.

>> If you aren't regularly testing your backups then I've got little
>> sympathy.
>
> Fortunately they do, hundreds of time on a daily basis ;)

Nice!

>> To be clear, pgbackrest doesn't have any dependency here- but it, like
>> all of the other 3rd party backup solutions and any restore solution
>> that a user has come up with, are going to have to be changed to deal
>> with the changes in how recovery works, so this is a good time to make
>> these changes.
>
> My point is that base backups do not have a mandatory dependency with
> recovery.conf all the time as they can perfectly be restored if they are
> standalone backups.  I can see a dependency with recovery.conf once you
> have a base backup which needs to be fed with WAL segments from an
> external archive, or when using a base backup to create a standby.

If you want to do PITR -- which is the default in most situations --
then some interaction with recovery.conf is needed.  I think you would
be hard-pressed to find a prominent HA or backup solution that doesn't
do so.

--
-David
david@pgmasters.net


Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
Stephen Frost
Date:
Greetings,

* Michael Paquier (michael@paquier.xyz) wrote:
> On Tue, Nov 27, 2018 at 09:45:04AM -0500, Stephen Frost wrote:
> > If you don't consider your recovery scripts and your backup scripts to
> > be related then I've really got to wonder how you're regularly testing
> > your backups to make sure that they're actually valid.
>
> Base backups can be perfectly self-contained as long as they include all
> the WAL segments needed to recover up to the end-of-backup record.
> That's what pg_basebackup does with its default options
> (--wal-method=stream in particular).

This really doesn't change my opinion at all.  Sure, some custom scripts
might have been written to operate in this way and now they'll need to
be adjusted to work the exact same way pg_basebackup works today and use
the non-exclusive mode, but they'll be better off for it and won't have
the concern about what happens if the system is inadvertantly restarted
during a backup.

I'd also say it's likely that places which know enough to build such a
solution aren't ones that we really need to worry about having an issue
adjusting their scripts.

Thanks!

Stephen

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
David Steele
Date:
On 11/26/18 10:13 PM, David Steele wrote:
> Hackers,
> 
> I propose we remove the deprecated exclusive backup mode of
> pg_start_backup()/pg_stop_backup() for Postgres 12.
> 
> The exclusive backup mode has a serious known issue.  If Postgres
> terminates ungracefully during a backup (due to hardware, kernel,
> Postgres issues, etc.) then Postgres may refuse to restart.
> 
> The reason is that the backup_label file will usually reference a
> checkpoint LSN that is older than the WAL available in pg_wal.  Postgres
> does emit a helpful error message while PANIC'ing but that's cold
> comfort to an admin who must manually intervene to get their cluster
> operational again.
> 
> The deprecated exclusive mode promises to make a difficult problem
> simple but doesn't live up to that promise. That's why it was replaced
> externally in 9.6 and why pg_basebackup has not used exclusive backups
> since it was introduced in 9.2.
> 
> Non-exclusive backups have been available since 9.6 and several
> third-party solutions support this mode, in addition to pg_basebackup.
> 
> The recently introduced recovery changes will break current automated
> solutions so this seems like a good opportunity to make improvements on
> the backup side as well.
> 
> I'll submit a patch for the 2019-01 commitfest.

Attached is the patch.

I was a bit surprised by how much code went away.  There was a lot of
code dedicated to making sure that backup_label got renamed on shutdown,
that there was not an exclusive backup running, etc.

There were no tests for exclusive backup so the test changes were minimal.

I did have to replace one "hot backup" in
010_logical_decoding_timelines.pl.  I'm not sure why the backup was
being done this way, or why the standby needs a copy of pg_replslot
(which is not copied by pg_basebackup).  It might be worth looking at
but it seemed out of scope for this patch.

Regards,
-- 
-David
david@pgmasters.net

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
Robert Haas
Date:
On Wed, Dec 12, 2018 at 7:24 AM David Steele <david@pgmasters.net> wrote:
> Attached is the patch.
>
> I was a bit surprised by how much code went away.  There was a lot of
> code dedicated to making sure that backup_label got renamed on shutdown,
> that there was not an exclusive backup running, etc.
>
> There were no tests for exclusive backup so the test changes were minimal.
>
> I did have to replace one "hot backup" in
> 010_logical_decoding_timelines.pl.  I'm not sure why the backup was
> being done this way, or why the standby needs a copy of pg_replslot
> (which is not copied by pg_basebackup).  It might be worth looking at
> but it seemed out of scope for this patch.

I wish to point out that there currently seem to be more votes against
this proposal than for it, and that nobody should commit a patch
unless there is a consensus that it should be committed, whether or
not the committer personally agrees with the arguments against it.

As for my vote, I do not buy the idea that because we're changing some
stuff about recovery.conf we should go ahead and do this too.  Yes,
they are related, but just because you adjust your backup/restore
script/tool to cope with one change doesn't mean that you don't have
to adjust it some more to cope with the other change.

I also think that the idea that supporting the exclusive backup
interface is hurting anything is greatly exaggerated.  Whether we keep
it or not, we're not forcing anyone to use it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Remove Deprecated Exclusive Backup Mode

From
Michael Paquier
Date:
On Wed, Dec 12, 2018 at 11:05:36AM +0900, Robert Haas wrote:
> I wish to point out that there currently seem to be more votes against
> this proposal than for it, and that nobody should commit a patch
> unless there is a consensus that it should be committed, whether or
> not the committer personally agrees with the arguments against it.
>
> As for my vote, I do not buy the idea that because we're changing some
> stuff about recovery.conf we should go ahead and do this too.  Yes,
> they are related, but just because you adjust your backup/restore
> script/tool to cope with one change doesn't mean that you don't have
> to adjust it some more to cope with the other change.
>
> I also think that the idea that supporting the exclusive backup
> interface is hurting anything is greatly exaggerated.  Whether we keep
> it or not, we're not forcing anyone to use it.

+1 on all that.  Per the trend of this thread, I see a bunch of
committers and contributors commenting about *not* removing this code,
so sending a patch to actually remove it looks like a throw into an
abysmal void.
--
Michael

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
Andres Freund
Date:
Hi,

On 2018-12-12 11:05:36 +0900, Robert Haas wrote:
> As for my vote, I do not buy the idea that because we're changing some
> stuff about recovery.conf we should go ahead and do this too.  Yes,
> they are related, but just because you adjust your backup/restore
> script/tool to cope with one change doesn't mean that you don't have
> to adjust it some more to cope with the other change.

Sure, but you have to touch it, and it's not that big of an additional
change.  Call the other form of pg_stop_backup(), and write the contents
of the two returned textblobs into backup_label/tablespace_map.  As a
reward you get a backup that break crash-safety for the database, and
not having to do the migration later.


> I also think that the idea that supporting the exclusive backup
> interface is hurting anything is greatly exaggerated.  Whether we keep
> it or not, we're not forcing anyone to use it.

It's hurting users.  I've seen numerous outages due to the backup_label
file being around in the wrong moment.

Greetings,

Andres Freund


Re: Remove Deprecated Exclusive Backup Mode

From
Andres Freund
Date:
On 2018-12-12 11:32:55 +0900, Michael Paquier wrote:
> On Wed, Dec 12, 2018 at 11:05:36AM +0900, Robert Haas wrote:
> > I wish to point out that there currently seem to be more votes against
> > this proposal than for it, and that nobody should commit a patch
> > unless there is a consensus that it should be committed, whether or
> > not the committer personally agrees with the arguments against it.
> > 
> > As for my vote, I do not buy the idea that because we're changing some
> > stuff about recovery.conf we should go ahead and do this too.  Yes,
> > they are related, but just because you adjust your backup/restore
> > script/tool to cope with one change doesn't mean that you don't have
> > to adjust it some more to cope with the other change.
> > 
> > I also think that the idea that supporting the exclusive backup
> > interface is hurting anything is greatly exaggerated.  Whether we keep
> > it or not, we're not forcing anyone to use it.
> 
> +1 on all that.  Per the trend of this thread, I see a bunch of
> committers and contributors commenting about *not* removing this code,
> so sending a patch to actually remove it looks like a throw into an
> abysmal void.

I don't think it's as clear as you make it out to be:

In favor:
- David Steele
- Andres Freund
- Stephen Frost, I think
Against:
- Simon Riggs
- Robert Haas
- Michael Paquier
- Andreas Karlsson
- Peter Eisentraut, I think

While that's clearly lopsided, I still think it's pretty darn absurd to
call such a situation "a throw into an abysmal void".

Greetings,

Andres Freund


Re: Remove Deprecated Exclusive Backup Mode

From
David Steele
Date:
On 12/11/18 9:05 PM, Robert Haas wrote:
> On Wed, Dec 12, 2018 at 7:24 AM David Steele <david@pgmasters.net> wrote:
>> Attached is the patch.
> 
> As for my vote, I do not buy the idea that because we're changing some
> stuff about recovery.conf we should go ahead and do this too.  Yes,
> they are related, but just because you adjust your backup/restore
> script/tool to cope with one change doesn't mean that you don't have
> to adjust it some more to cope with the other change.

I think the patch stands on its own.  Exclusive backup mode is broken
and is know to cause problems in the field.

We deprecated it three years ago and we have a very capable core backup
tool that people can use instead.

> I also think that the idea that supporting the exclusive backup
> interface is hurting anything is greatly exaggerated.  Whether we keep
> it or not, we're not forcing anyone to use it.

There are maintenance advantages to getting rid of it.  The code has
become simpler and the documentation has become much simpler.  Trying to
explain the subtle differences between backup methods is difficult and
only makes the documentation more confusing.

Removing the extra complexity from shutdown also seems like a win.  The
gyrations we went through to try and ensure that backup_label was
renamed before shutdown were truly agonizing.  None of that is necessary
with the non-exclusive mode.

-- 
-David
david@pgmasters.net


Re: Remove Deprecated Exclusive Backup Mode

From
Magnus Hagander
Date:
On Wed, Dec 12, 2018 at 11:44 AM Andres Freund <andres@anarazel.de> wrote:
On 2018-12-12 11:32:55 +0900, Michael Paquier wrote:
> On Wed, Dec 12, 2018 at 11:05:36AM +0900, Robert Haas wrote:
> > I wish to point out that there currently seem to be more votes against
> > this proposal than for it, and that nobody should commit a patch
> > unless there is a consensus that it should be committed, whether or
> > not the committer personally agrees with the arguments against it.
> >
> > As for my vote, I do not buy the idea that because we're changing some
> > stuff about recovery.conf we should go ahead and do this too.  Yes,
> > they are related, but just because you adjust your backup/restore
> > script/tool to cope with one change doesn't mean that you don't have
> > to adjust it some more to cope with the other change.
> >
> > I also think that the idea that supporting the exclusive backup
> > interface is hurting anything is greatly exaggerated.  Whether we keep
> > it or not, we're not forcing anyone to use it.
>
> +1 on all that.  Per the trend of this thread, I see a bunch of
> committers and contributors commenting about *not* removing this code,
> so sending a patch to actually remove it looks like a throw into an
> abysmal void.

I don't think it's as clear as you make it out to be:

In favor:
- David Steele
- Andres Freund
- Stephen Frost, I think

You forgot me, also in this camp :P

 
Against:
- Simon Riggs
- Robert Haas
- Michael Paquier
- Andreas Karlsson
- Peter Eisentraut, I think

While that's clearly lopsided, I still think it's pretty darn absurd to
call such a situation "a throw into an abysmal void".

I'd say it's close enough to an even split to say it doesn't go either way. But we can clearly say it's not consensus for removing it (nor for keeping it, but that would be the default..) 

--

Re: Remove Deprecated Exclusive Backup Mode

From
Magnus Hagander
Date:
On Wed, Dec 12, 2018 at 1:09 PM David Steele <david@pgmasters.net> wrote:
On 12/11/18 9:05 PM, Robert Haas wrote:
> On Wed, Dec 12, 2018 at 7:24 AM David Steele <david@pgmasters.net> wrote:
>> Attached is the patch.
>
> As for my vote, I do not buy the idea that because we're changing some
> stuff about recovery.conf we should go ahead and do this too.  Yes,
> they are related, but just because you adjust your backup/restore
> script/tool to cope with one change doesn't mean that you don't have
> to adjust it some more to cope with the other change.

I think the patch stands on its own.  Exclusive backup mode is broken
and is know to cause problems in the field.

Yes, this is a *far* too common thing to be seen in the field IMHO.

Is it super-common? No. But it's by far too common not to be a concern.


We deprecated it three years ago and we have a very capable core backup
tool that people can use instead.

+1.

If we can't get consensus on removing it now, then how about actually setting a *plan* for when to remove it?

For example, in 12 we could more aggressively deprecate it (such as actually emitting a WARNING message whenever used?), and then announce already now that it will be removed in version 13? That will then cover all those people who didn't read the documentation and find the existing deprecation notice?


> I also think that the idea that supporting the exclusive backup
> interface is hurting anything is greatly exaggerated.  Whether we keep
> it or not, we're not forcing anyone to use it.

There are maintenance advantages to getting rid of it.  The code has
become simpler and the documentation has become much simpler.  Trying to
explain the subtle differences between backup methods is difficult and
only makes the documentation more confusing.

Removing the extra complexity from shutdown also seems like a win.  The
gyrations we went through to try and ensure that backup_label was
renamed before shutdown were truly agonizing.  None of that is necessary
with the non-exclusive mode.

That, too! 

--

Re: Remove Deprecated Exclusive Backup Mode

From
David Steele
Date:
On 11/27/18 10:05 AM, Andreas Karlsson wrote:
> On 11/27/18 3:46 PM, Stephen Frost wrote:
>> I'm concerned, seriously, that people don't have anywhere near the
>> concern about the recovery side of things as they do about the backup
>> side of things and that's really concerning.
> 
> I think we should use more caution when data loss is at stake rather
> than "just" downtime. So personally I am in favor of updating the manual
> with warnings (right now it does not even say if exclusive or
> non-exclusive is the default) and adding a deprecation warning when
> people use the exclusive mode.

The documentation says (since 9.6):

Low level base backups can be made in a non-exclusive or an exclusive
way. The non-exclusive method is recommended and the exclusive one is
deprecated and will eventually be removed.

That seems clear enough to me.

-- 
-David
david@pgmasters.net


Re: Remove Deprecated Exclusive Backup Mode

From
David Steele
Date:
On 12/11/18 9:44 PM, Andres Freund wrote:
> On 2018-12-12 11:32:55 +0900, Michael Paquier wrote:
>>
>> +1 on all that.  Per the trend of this thread, I see a bunch of
>> committers and contributors commenting about *not* removing this code,
>> so sending a patch to actually remove it looks like a throw into an
>> abysmal void.
> 
> I don't think it's as clear as you make it out to be:
> 
> In favor:
> - David Steele
> - Andres Freund
> - Stephen Frost, I think
> Against:
> - Simon Riggs
> - Robert Haas
> - Michael Paquier
> - Andreas Karlsson
> - Peter Eisentraut, I think
> 
> While that's clearly lopsided, I still think it's pretty darn absurd to
> call such a situation "a throw into an abysmal void".

I didn't get the impression that Peter was against, he just thought that
it needed to stand on its own, rather than be justified by the
recovery.conf changes, which I agree with.

Simon rather clearly said that he thinks we should wait until the next
release, which I don't see as being entirely against.

-- 
-David
david@pgmasters.net


Re: Remove Deprecated Exclusive Backup Mode

From
Robert Haas
Date:
On Wed, Dec 12, 2018 at 1:23 PM David Steele <david@pgmasters.net> wrote:
> I didn't get the impression that Peter was against, he just thought that
> it needed to stand on its own, rather than be justified by the
> recovery.conf changes, which I agree with.
>
> Simon rather clearly said that he thinks we should wait until the next
> release, which I don't see as being entirely against.

Well, nobody is saying that we should NEVER remove this.  The
discussion is about what to do in v12.

Most of the features I've been involved in removing have been
deprecated for 5+ years.  The first release where this one was
deprecated was only 2 years ago.  So it feels dramatically faster to
me than what I think we have typically done.

Actually, I hadn't realized until this discussion that the exclusive
backup interface was actually deprecated -- I thought we were just
recommending the new non-exclusive backup interface should be used.
If we're in a rush to remove this (and apparently many of us are), I
think we should make that warning a lot more prominent, maybe copy it
into a few more places, and back-patch the changes.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Remove Deprecated Exclusive Backup Mode

From
Michael Paquier
Date:
On Wed, Dec 12, 2018 at 01:31:53PM +0900, Robert Haas wrote:
> Most of the features I've been involved in removing have been
> deprecated for 5+ years.  The first release where this one was
> deprecated was only 2 years ago.  So it feels dramatically faster to
> me than what I think we have typically done.
>
> Actually, I hadn't realized until this discussion that the exclusive
> backup interface was actually deprecated -- I thought we were just
> recommending the new non-exclusive backup interface should be used.
> If we're in a rush to remove this (and apparently many of us are), I
> think we should make that warning a lot more prominent, maybe copy it
> into a few more places, and back-patch the changes.

Indeed, I definitely agree that we should move away from the exclusive
backup APIs at some point, still it feels very early to do this move as
of now.  Showing a bigger warning in the docs would be nice as a first
step, and then we could wait a couple of extra years before doing the
actual move.

By the way, David and others, my apologies for my latest email, I hope
you did not take that badly, my impression of this thread about the
opinions gathered was wrong.
--
Michael

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
Andres Freund
Date:

On December 11, 2018 10:10:00 PM PST, Michael Paquier <michael@paquier.xyz> wrote:
>By the way, David and others, my apologies for my latest email, I hope
>you did not take that badly, my impression of this thread about the
>opinions gathered was wrong.

Even if it had been 10:1 it doesn't seem like it'd have been a good approach of an answer.
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.


Re: Remove Deprecated Exclusive Backup Mode

From
Laurenz Albe
Date:
On Wed, 2018-12-12 at 13:31 +0900, Robert Haas wrote:
> On Wed, Dec 12, 2018 at 1:23 PM David Steele <david@pgmasters.net> wrote:
> > I didn't get the impression that Peter was against, he just thought that
> > it needed to stand on its own, rather than be justified by the
> > recovery.conf changes, which I agree with.
> > 
> > Simon rather clearly said that he thinks we should wait until the next
> > release, which I don't see as being entirely against.
> 
> Well, nobody is saying that we should NEVER remove this.  The
> discussion is about what to do in v12.
> 
> Most of the features I've been involved in removing have been
> deprecated for 5+ years.  The first release where this one was
> deprecated was only 2 years ago.  So it feels dramatically faster to
> me than what I think we have typically done.
> 
> Actually, I hadn't realized until this discussion that the exclusive
> backup interface was actually deprecated -- I thought we were just
> recommending the new non-exclusive backup interface should be used.
> If we're in a rush to remove this (and apparently many of us are), I
> think we should make that warning a lot more prominent, maybe copy it
> into a few more places, and back-patch the changes.

+1

I too only learned about this recently, while the problem with exclusive
backups has been known at least since 2008 (c979a1fe), and nobody felt
this to be a terrible problem back then.

I believe that the danger is greatly overrated.  It is not like you end
up with a corrupted database after a crash, and you get a pretty helpful
error message.  Many people are happy enough to live with that.

I'm on board with deprecating and removing it eventually, but I see no
problem in waiting for the customary 5 years.
And yes, a prominent warning in the next major release notes would be
a good thing.

Yours,
Laurenz Albe



Re: Remove Deprecated Exclusive Backup Mode

From
David Steele
Date:
On 12/11/18 11:31 PM, Robert Haas wrote:
> On Wed, Dec 12, 2018 at 1:23 PM David Steele <david@pgmasters.net> wrote:
>> I didn't get the impression that Peter was against, he just thought that
>> it needed to stand on its own, rather than be justified by the
>> recovery.conf changes, which I agree with.
>>
>> Simon rather clearly said that he thinks we should wait until the next
>> release, which I don't see as being entirely against.
> 
> Well, nobody is saying that we should NEVER remove this.  The
> discussion is about what to do in v12.

Fair enough.

> Most of the features I've been involved in removing have been
> deprecated for 5+ years.  The first release where this one was
> deprecated was only 2 years ago.  So it feels dramatically faster to
> me than what I think we have typically done.

I think this case is different because exclusive backups have known issues.

I also think that having two similar but different methods stifles
development in this area and makes the docs harder to improve.  There's
a lot of "if this then that else that" in the docs which make them hard
to maintain and harder to read.

Add the fact that we have *zero* tests for exclusive backups.  I only
had to refactor one exclusive backup in the tests and since it did not
archive, exclude pg_wal, postmaster.pid, or do anything else our docs
recommend I wouldn't say it qualifies as a real test.  Also, it wasn't
even trying to test exclusive backups -- it was a test for logical
replication following timelines.

So yeah, we'd be removing it in three years instead of five, but there
has been a safe in-core alternative since 9.1 (eight years).

Regards,
-- 
-David
david@pgmasters.net


Re: Remove Deprecated Exclusive Backup Mode

From
Michael Paquier
Date:
On Wed, Dec 12, 2018 at 08:22:10AM -0500, David Steele wrote:
> Add the fact that we have *zero* tests for exclusive backups.  I only
> had to refactor one exclusive backup in the tests and since it did not
> archive, exclude pg_wal, postmaster.pid, or do anything else our docs
> recommend I wouldn't say it qualifies as a real test.  Also, it wasn't
> even trying to test exclusive backups -- it was a test for logical
> replication following timelines.

This point is really right.  The TAP tests rely heavily on pg_basebackup
when taking base backups, still there is an interface to be able to take
filesystem-level backups with the exclusive SQL interface.  The test
David is referring to here is backup_fs_hot in PostgresNode.pm.
--
Michael

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
Peter Eisentraut
Date:
On 12/12/2018 05:31, Robert Haas wrote:
> Most of the features I've been involved in removing have been
> deprecated for 5+ years.  The first release where this one was
> deprecated was only 2 years ago.  So it feels dramatically faster to
> me than what I think we have typically done.

I was just looking this up as well, and I find it too fast.  The
nonexclusive backups were introduced in 9.6.  So I'd say that we could
remove the exclusive ones when 9.5 goes EOL.  (That would mean this
patch could be submitted for PostgreSQL 13, since 9.5 will go out of
support around the time PG13 would be released.)

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Remove Deprecated Exclusive Backup Mode

From
Peter Eisentraut
Date:
On 11/12/2018 23:24, David Steele wrote:
> @@ -845,7 +838,7 @@ test ! -f /mnt/server/archivedir/00000001000000A900000065 && cp pg_wal/0
>       rights to run pg_start_backup (superuser, or a user who has been granted
>       EXECUTE on the function) and issue the command:
>  <programlisting>
> -SELECT pg_start_backup('label', false, false);
> +SELECT pg_start_backup('label', false);
>  </programlisting>
>       where <literal>label</literal> is any string you want to use to uniquely
>       identify this backup operation. The connection

Is it good to change the meaning of an existing interface like that?

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Remove Deprecated Exclusive Backup Mode

From
Peter Eisentraut
Date:
On 12/12/2018 05:09, David Steele wrote:
> I think the patch stands on its own.  Exclusive backup mode is broken
> and is know to cause problems in the field.

Is this breakage documented anywhere for users?

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Remove Deprecated Exclusive Backup Mode

From
David Steele
Date:
On 12/13/18 2:02 PM, Peter Eisentraut wrote:
> On 12/12/2018 05:09, David Steele wrote:
>> I think the patch stands on its own.  Exclusive backup mode is broken
>> and is know to cause problems in the field.
> 
> Is this breakage documented anywhere for users?

Yes:

https://www.postgresql.org/docs/11/continuous-archiving.html

"Note that if the server crashes during the backup it may not be
possible to restart until the backup_label file has been manually
deleted from the PGDATA directory."

Seems like the wording could be stronger.

-- 
-David
david@pgmasters.net


Re: Remove Deprecated Exclusive Backup Mode

From
David Steele
Date:
On 12/13/18 2:00 PM, Peter Eisentraut wrote:
> On 11/12/2018 23:24, David Steele wrote:
>> @@ -845,7 +838,7 @@ test ! -f /mnt/server/archivedir/00000001000000A900000065 && cp pg_wal/0
>>       rights to run pg_start_backup (superuser, or a user who has been granted
>>       EXECUTE on the function) and issue the command:
>>  <programlisting>
>> -SELECT pg_start_backup('label', false, false);
>> +SELECT pg_start_backup('label', false);
>>  </programlisting>
>>       where <literal>label</literal> is any string you want to use to uniquely
>>       identify this backup operation. The connection
> 
> Is it good to change the meaning of an existing interface like that?

Good question.

We could leave the third parameter (changing the default to false) and
error if it has any value but false.  It's a bit ugly but it does
maintain compatibility with the current non-exclusive backup interface.
 And, the third parameter would never need to be specified unless we add
a fourth.

Or we could add a new function and deprecate this one -- but what to
call it?

I agree it's not very cool to break code for users who actually *did*
migrate to non-exclusive backups.

-- 
-David
david@pgmasters.net


Re: Remove Deprecated Exclusive Backup Mode

From
Robert Haas
Date:
On Thu, Dec 13, 2018 at 2:29 PM David Steele <david@pgmasters.net> wrote:
> Good question.
>
> We could leave the third parameter (changing the default to false) and
> error if it has any value but false.  It's a bit ugly but it does
> maintain compatibility with the current non-exclusive backup interface.
>  And, the third parameter would never need to be specified unless we add
> a fourth.
>
> Or we could add a new function and deprecate this one -- but what to
> call it?
>
> I agree it's not very cool to break code for users who actually *did*
> migrate to non-exclusive backups.

Uh, yeah.  If you do that, you're not just forcing people off the old
interface -- you're making *everybody* update their stuff again.  And
having to have conditional logic to handle different releases presents
its own set of dangers. IMHO, the biggest enemy in this area BY FAR is
human error, not the details of how the interface works. No amount of
providing a better-designed interface will compensate for the
confusion factor involved in changing it.

My vote is ... when we actually deprecate this, change nothing at the
SQL level initially, but just throw an error if the "exclusive"
argument isn't false:

ERROR: exclusive backup mode is no longer supported

That will require everyone to use the three-argument form of
pg_start_backup(), but I think that's good, because if we accept the
one and two argument forms, how do we actually know that the user
updated their code?  If you want to actually force people to switch to
the non-exclusive mode, you have to make sure that anything that might
be a residual request for exclusive backup mode errors out.  If the
same SQL just does something different, the user might fail to notice.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Remove Deprecated Exclusive Backup Mode

From
Peter Eisentraut
Date:
On 13/12/2018 20:17, David Steele wrote:
> On 12/13/18 2:02 PM, Peter Eisentraut wrote:
>> On 12/12/2018 05:09, David Steele wrote:
>>> I think the patch stands on its own.  Exclusive backup mode is broken
>>> and is know to cause problems in the field.
>>
>> Is this breakage documented anywhere for users?
> 
> Yes:
> 
> https://www.postgresql.org/docs/11/continuous-archiving.html
> 
> "Note that if the server crashes during the backup it may not be
> possible to restart until the backup_label file has been manually
> deleted from the PGDATA directory."
> 
> Seems like the wording could be stronger.

I think that is a pretty liberal interpretation of the word "broken".
All this says is "if you do A, and B happens, then you need to do C".

Clearly, not having to do that at all is better, but if this is all
there is to it, then I'm confused by the characterizations of how awful
and terrible this feature is and how we must rush to remove it.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Remove Deprecated Exclusive Backup Mode

From
David Steele
Date:
On 12/14/18 3:01 AM, Peter Eisentraut wrote:
> On 13/12/2018 20:17, David Steele wrote:
>> On 12/13/18 2:02 PM, Peter Eisentraut wrote:
>>> On 12/12/2018 05:09, David Steele wrote:
>>>> I think the patch stands on its own.  Exclusive backup mode is broken
>>>> and is know to cause problems in the field.
>>>
>>> Is this breakage documented anywhere for users?
>>
>> Yes:
>>
>> https://www.postgresql.org/docs/11/continuous-archiving.html
>>
>> "Note that if the server crashes during the backup it may not be
>> possible to restart until the backup_label file has been manually
>> deleted from the PGDATA directory."
>>
>> Seems like the wording could be stronger.
> 
> I think that is a pretty liberal interpretation of the word "broken".
> All this says is "if you do A, and B happens, then you need to do C".
> 
> Clearly, not having to do that at all is better, but if this is all
> there is to it, then I'm confused by the characterizations of how awful
> and terrible this feature is and how we must rush to remove it.

If the server crashes during a backup, the server probably won't
restart.  We say "may" here but if your backup runs more than a few
checkpoints it's more like won't.  The remedy is to go manually delete a
file the user may have never heard of.  They are often hesitant to do so
-- for good reason.

The downtime stretches as they call their support company or consult
Google to determine if they should really do this.

The main thing is that *manual* intervention is required to get the
cluster running again.  Sure, you could automate it, but now we have
users writing scripts to delete files out of PGDATA (may as well get
tablespace_map as well) -- but hopefully not in the case of a restore
where you actually need those files.  And hopefully not anything else
important.

That seems pretty broken to me.

-- 
-David
david@pgmasters.net


Re: Remove Deprecated Exclusive Backup Mode

From
Robert Haas
Date:
On Fri, Dec 14, 2018 at 8:27 AM David Steele <david@pgmasters.net> wrote:
> If the server crashes during a backup, the server probably won't
> restart.  We say "may" here but if your backup runs more than a few
> checkpoints it's more like won't.  The remedy is to go manually delete a
> file the user may have never heard of.  They are often hesitant to do so
> -- for good reason.

Sure.  So it's broken in the sense that if you don't read and
carefully follow the directions, you will have problems.  But that's
also true of an automobile, a power saw, the DELETE command,
antibiotics, and the Internal Revenue Code.  Many things in life are
complicated and require due caution, and yet people regularly navigate
things that are VASTLY more complicated than "if the server crashes
during a hot backup, remove this file."  Especially because if we
subsequently fail for this reason, we give you a very precise hint
that tells you exactly what to do:

HINT: If you are not restoring from a backup, try removing the file
"<path to $PGDATA goes here>/backup_label"

Now, I grant that emitting a HINT telling you exactly what to do is
not as good as not requiring manual user invention in the first place,
but it's not like we're requiring you to carry out some byzantine
process that nobody can possible understand.  The "if" condition here
is one that any DBA should be able to understand: am I, or am I not,
in the process of restoring a backup?  The system does not know, but
the DBA should.  And once you know that the answer is "no, I'm not
restoring a backup", then you just remove the exact pathname indicated
and try it again and everything works fine.

Peter's complaint -- that this is a pretty liberal reading of the word
"broken" -- is IMHO very well put.  Saying that something is broken
means it doesn't work.  But the exclusive backup mode works fine if
used according to the directions.  The fact that people often get
confused and don't follow the directions, and that the directions tend
to end up requiring manual intervention after a system crash, is
unfortunate, and it is why we have a new API.  But "that thing isn't
as well designed as it could've been" is not the same thing as "that
thing does not work when used as designed," no matter how much you
keep insisting the contrary.  It does work.  It has worked for a long
time.  Many people have used it successfully.  It was the ONLY way of
doing this for many years.  There is no necessary reason why it has to
be a huge problem, and I don't think it is.

In my experience with EnterpriseDB customers, this IS an occasional
source of confusion, but there are other things that are a problem a
lot more frequently, like work_mem, which you often can't set high
enough to get good query performance without causing OOM events when
the system gets busy.  I'd hesitate to call work_mem broken, but a GUC
for which no single value both adequately protects against OOM and
gives workable performance is closer to broken than an exclusive
backup mode, which actually does work - albeit with occasional manual
intervention - if you read and follow the directions.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Remove Deprecated Exclusive Backup Mode

From
David Steele
Date:
On 12/12/18 4:44 AM, Andres Freund wrote:
> On 2018-12-12 11:32:55 +0900, Michael Paquier wrote:
>>
>> +1 on all that.  Per the trend of this thread, I see a bunch of
>> committers and contributors commenting about *not* removing this code,
>> so sending a patch to actually remove it looks like a throw into an
>> abysmal void.
> 
> I don't think it's as clear as you make it out to be:
> 
> In favor:
> - David Steele
> - Andres Freund
> - Stephen Frost, I think
> Against:
> - Simon Riggs
> - Robert Haas
> - Michael Paquier
> - Andreas Karlsson
> - Peter Eisentraut, I think
> 
> While that's clearly lopsided, I still think it's pretty darn absurd to
> call such a situation "a throw into an abysmal void".

The way I read the discussion, there is no consensus for getting this 
into PG12, but there does seem to be one for PG13.

I'll push this to the first post-PG12 CF when one appears.  Can we just 
go ahead and create a 2019-07 CF now?  I know we generally discuss this 
at PGCon, but we can always rename it, right?

-- 
-David
david@pgmasters.net


Re: Remove Deprecated Exclusive Backup Mode

From
Magnus Hagander
Date:
On Wed, Dec 19, 2018 at 5:38 PM David Steele <david@pgmasters.net> wrote:
On 12/12/18 4:44 AM, Andres Freund wrote:
> On 2018-12-12 11:32:55 +0900, Michael Paquier wrote:
>>
>> +1 on all that.  Per the trend of this thread, I see a bunch of
>> committers and contributors commenting about *not* removing this code,
>> so sending a patch to actually remove it looks like a throw into an
>> abysmal void.
>
> I don't think it's as clear as you make it out to be:
>
> In favor:
> - David Steele
> - Andres Freund
> - Stephen Frost, I think
> Against:
> - Simon Riggs
> - Robert Haas
> - Michael Paquier
> - Andreas Karlsson
> - Peter Eisentraut, I think
>
> While that's clearly lopsided, I still think it's pretty darn absurd to
> call such a situation "a throw into an abysmal void".

The way I read the discussion, there is no consensus for getting this
into PG12, but there does seem to be one for PG13.

I'll push this to the first post-PG12 CF when one appears.  Can we just
go ahead and create a 2019-07 CF now?  I know we generally discuss this
at PGCon, but we can always rename it, right?

We can, I guess. Are you saying you're likely to forget? :P

For 12, it would probably be good to have a more clear deprecation notice into the documentations. Is that something on your list of things to work on, or should someone else (like, uh, me)? 

--

Re: Remove Deprecated Exclusive Backup Mode

From
David Steele
Date:
On 12/19/18 9:02 PM, Magnus Hagander wrote:
> On Wed, Dec 19, 2018 at 5:38 PM David Steele <david@pgmasters.net 
> 
>     I'll push this to the first post-PG12 CF when one appears.  Can we just
>     go ahead and create a 2019-07 CF now?  I know we generally discuss this
>     at PGCon, but we can always rename it, right?
> 
> 
> We can, I guess. Are you saying you're likely to forget? :P

Not likely to forget, but I figured it's not good to have it in at the 
start of the next CF.  People might think they can review it!

> > For 12, it would probably be good to have a more clear deprecation 
> notice into the documentations. Is that something on your list of things 
> to work on, or should someone else (like, uh, me)?

I'm planning to do that -- I meant to mention that.  I may not have time 
until after the new year, but I doubt it's something that will cause a 
big fuss.

-- 
-David
david@pgmasters.net


Re: Remove Deprecated Exclusive Backup Mode

From
Michael Paquier
Date:
On Wed, Dec 19, 2018 at 06:38:00PM +0200, David Steele wrote:
> I'll push this to the first post-PG12 CF when one appears.  Can we just go
> ahead and create a 2019-07 CF now?  I know we generally discuss this at
> PGCon, but we can always rename it, right?

The schedule gets decided at the developer meeting, still the commit
fests created can have their name and start/end dates modified as needed
So, I think that it is not an issue to add one ahead now.  And here you
go:
https://commitfest.postgresql.org/23/

This request was going to pop out at some point anyway in the follow-up
months.
--
Michael

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
David Steele
Date:
On 12/20/18 1:35 AM, Michael Paquier wrote:
> On Wed, Dec 19, 2018 at 06:38:00PM +0200, David Steele wrote:
>> I'll push this to the first post-PG12 CF when one appears.  Can we just go
>> ahead and create a 2019-07 CF now?  I know we generally discuss this at
>> PGCon, but we can always rename it, right?
> 
> The schedule gets decided at the developer meeting, still the commit
> fests created can have their name and start/end dates modified as needed
> So, I think that it is not an issue to add one ahead now.  And here you
> go:
> https://commitfest.postgresql.org/23/
> 
> This request was going to pop out at some point anyway in the follow-up
> months.

Thanks, Michael.  Getting this error, not sure what it means:

Cannot move patch to the same commitfest, and multiple future 
commitfests exist!

-- 
-David
david@pgmasters.net


Re: Remove Deprecated Exclusive Backup Mode

From
Michael Paquier
Date:
On Thu, Dec 20, 2018 at 12:29:48PM +0200, David Steele wrote:
> Cannot move patch to the same commitfest, and multiple future commitfests
> exist!

I am not sure what it means either.  What if you just mark the existing
entry as returned with feedback, and create a new one ahead?
--
Michael

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
David Steele
Date:
On 12/21/18 2:01 AM, Michael Paquier wrote:
> On Thu, Dec 20, 2018 at 12:29:48PM +0200, David Steele wrote:
>> Cannot move patch to the same commitfest, and multiple future commitfests
>> exist!
> 
> I am not sure what it means either.  What if you just mark the existing
> entry as returned with feedback, and create a new one ahead?

Yeah, I can do that, but it would be nice to know why this is not 
working.  It would also be nice to preserve the history.

Magnus?

-- 
-David
david@pgmasters.net


Re: Remove Deprecated Exclusive Backup Mode

From
Robert Haas
Date:
On Fri, Dec 21, 2018 at 1:18 AM David Steele <david@pgmasters.net> wrote:
> On 12/21/18 2:01 AM, Michael Paquier wrote:
> > On Thu, Dec 20, 2018 at 12:29:48PM +0200, David Steele wrote:
> >> Cannot move patch to the same commitfest, and multiple future commitfests
> >> exist!
> >
> > I am not sure what it means either.  What if you just mark the existing
> > entry as returned with feedback, and create a new one ahead?
>
> Yeah, I can do that, but it would be nice to know why this is not
> working.  It would also be nice to preserve the history.
>
> Magnus?

I think what it means is that it doesn't know which CommitFest is the
"next" one.  If the patch were currently in the in-progress
CommitFest, the Open one (if any) would be the next one.  Otherwise,
if there were only one Future CommitFest, that would be the next one.
But right now you're moving a patch that is already in the Open
CommitFest and there are two 2 Future CommitFests, so it doesn't know
which one to pick.

I think the old CommitFest application let you set the CommitFest via
the Edit screen also, so you could pick a specific CommitFest to
target.  But that doesn't seem to exist in Magnus's version.

If you just wait until the Open CommitFest is marked In Progress and
the next one is marked Open, it should work in any case.

I think.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Remove Deprecated Exclusive Backup Mode

From
David Steele
Date:
On 12/21/18 6:43 PM, Robert Haas wrote:
> 
> If you just wait until the Open CommitFest is marked In Progress and
> the next one is marked Open, it should work in any case.

I'll give that a try since it's only a few more days.

> I think.

Well, it's an uncertain world.

-- 
-David
david@pgmasters.net


Re: Remove Deprecated Exclusive Backup Mode

From
Magnus Hagander
Date:


On Fri, Dec 21, 2018 at 5:43 PM Robert Haas <robertmhaas@gmail.com> wrote:
On Fri, Dec 21, 2018 at 1:18 AM David Steele <david@pgmasters.net> wrote:
> On 12/21/18 2:01 AM, Michael Paquier wrote:
> > On Thu, Dec 20, 2018 at 12:29:48PM +0200, David Steele wrote:
> >> Cannot move patch to the same commitfest, and multiple future commitfests
> >> exist!
> >
> > I am not sure what it means either.  What if you just mark the existing
> > entry as returned with feedback, and create a new one ahead?
>
> Yeah, I can do that, but it would be nice to know why this is not
> working.  It would also be nice to preserve the history.
>
> Magnus?

I think what it means is that it doesn't know which CommitFest is the
"next" one.  If the patch were currently in the in-progress
CommitFest, the Open one (if any) would be the next one.  Otherwise,
if there were only one Future CommitFest, that would be the next one.
But right now you're moving a patch that is already in the Open
CommitFest and there are two 2 Future CommitFests, so it doesn't know
which one to pick.

I think the old CommitFest application let you set the CommitFest via
the Edit screen also, so you could pick a specific CommitFest to
target.  But that doesn't seem to exist in Magnus's version.

That was a specifically requested change, and IIRC has been discussed before whenever people run into it :)

The basics is, there is not supposed to be more than one Future commitfest, in order to keep the workflow as simple as possible. When there is, all sorts of bad things happen. What we really should have at that point is a blocker so you can't *create* more than one of them, but right now it naively assumes nobody does...

--

Re: Remove Deprecated Exclusive Backup Mode

From
David Steele
Date:
On 12/27/18 6:53 PM, Magnus Hagander wrote:
> On Fri, Dec 21, 2018 at 5:43 PM Robert Haas <robertmhaas@gmail.com 
> 
>     I think the old CommitFest application let you set the CommitFest via
>     the Edit screen also, so you could pick a specific CommitFest to
>     target.  But that doesn't seem to exist in Magnus's version.
> 
> The basics is, there is not supposed to be more than one Future 
> commitfest, in order to keep the workflow as simple as possible. When 
> there is, all sorts of bad things happen. What we really should have at 
> that point is a blocker so you can't *create* more than one of them, but 
> right now it naively assumes nobody does...

Which is ironic since I specifically requested that one be created.

-- 
-David
david@pgmasters.net


Re: Remove Deprecated Exclusive Backup Mode

From
Stephen Frost
Date:
Greetings,

* Laurenz Albe (laurenz.albe@cybertec.at) wrote:
> I too only learned about this recently, while the problem with exclusive
> backups has been known at least since 2008 (c979a1fe), and nobody felt
> this to be a terrible problem back then.

My recollection was that back then it wasn't clear what to *do* about
the problem.  Once we had a solution (non-exclusive backups) we
deprecated the exclusive backup mechanism.

> I believe that the danger is greatly overrated.  It is not like you end
> up with a corrupted database after a crash, and you get a pretty helpful
> error message.  Many people are happy enough to live with that.

Unfortunately, people get corrupted databases all the time because the
backups are corrupted.  The problem with databases not restarting
correctly is another issue with the exclusive backup method but it's
certainly not the only one.

> I'm on board with deprecating and removing it eventually, but I see no
> problem in waiting for the customary 5 years.

We don't actually have a 'customary 5 years' rule of any sort.

> And yes, a prominent warning in the next major release notes would be
> a good thing.

This is two years too late, at best.  That is, maybe it would have made
sense to highlight that the exclusive backup method was deprecated when
it was made so, but that ship has sailed.  Further, we don't put out
announcements saying "we're going to remove X in the next release" and I
don't see it making sense to start now.

Thanks!

Stephen

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
Stephen Frost
Date:
Greetings,

* Peter Eisentraut (peter.eisentraut@2ndquadrant.com) wrote:
> On 12/12/2018 05:31, Robert Haas wrote:
> > Most of the features I've been involved in removing have been
> > deprecated for 5+ years.  The first release where this one was
> > deprecated was only 2 years ago.  So it feels dramatically faster to
> > me than what I think we have typically done.
>
> I was just looking this up as well, and I find it too fast.  The
> nonexclusive backups were introduced in 9.6.  So I'd say that we could
> remove the exclusive ones when 9.5 goes EOL.  (That would mean this
> patch could be submitted for PostgreSQL 13, since 9.5 will go out of
> support around the time PG13 would be released.)

I don't agree with either the notion that we have to wait 5 years in
this case or that we've only had a good alternative to the exclusive
backup mode since 9.5 as we've had pg_basebackup since 9.1.

Thanks,

Stephen

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
Stephen Frost
Date:
Greetings,

* Peter Eisentraut (peter.eisentraut@2ndquadrant.com) wrote:
> Clearly, not having to do that at all is better, but if this is all
> there is to it, then I'm confused by the characterizations of how awful
> and terrible this feature is and how we must rush to remove it.

It's not all there is to it though.

This issue leads to extended downtime regularly and is definitely a huge
'gotcha' for users, even if you don't want to call it outright broken,
but the other issue is that our documentation is ridiculously
complicated around how to do a backup properly because of this and that
also leads to the reality that it's difficult to make improvements to it
because every sentence has to consider both methods, and that's really
assuming that users actively read through the detailed documentation
instead of just looking at those nice simple 'pg_start_backup' and
'pg_stop_backup' methods and use them thinking that's all that's
required.

We see this *all* the time, on the lists, in blog posts (even those from
somewhat reputable companies...), and in other places.  The exclusive
backup method is a huge foot-gun for new users to PostgreSQL and leads
to downtime, corrupt backups, and then corrupt databases when backups
get restored.

It really does need to go, and the sooner, the better.

Thanks,

Stephen

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
Stephen Frost
Date:
Greetings,

* Robert Haas (robertmhaas@gmail.com) wrote:
> On Thu, Dec 13, 2018 at 2:29 PM David Steele <david@pgmasters.net> wrote:
> > We could leave the third parameter (changing the default to false) and
> > error if it has any value but false.  It's a bit ugly but it does
> > maintain compatibility with the current non-exclusive backup interface.
> >  And, the third parameter would never need to be specified unless we add
> > a fourth.
> >
> > Or we could add a new function and deprecate this one -- but what to
> > call it?

... and would people really realize that they have to do more than just
change the function name?  I'm not sure.

> > I agree it's not very cool to break code for users who actually *did*
> > migrate to non-exclusive backups.

This bothers me much less.  We change our APIs between major releases
from time-to-time and people need to update their code to handle that.

> Uh, yeah.  If you do that, you're not just forcing people off the old
> interface -- you're making *everybody* update their stuff again.  And
> having to have conditional logic to handle different releases presents
> its own set of dangers. IMHO, the biggest enemy in this area BY FAR is
> human error, not the details of how the interface works. No amount of
> providing a better-designed interface will compensate for the
> confusion factor involved in changing it.

I don't agree with the argument that conditional logic for different
releases is an issue- we already require people to have that all over
the place and, while it's a bit annoying, it's part and parcel for
working with multiple major versions of PG.

> My vote is ... when we actually deprecate this, change nothing at the
> SQL level initially, but just throw an error if the "exclusive"
> argument isn't false:

> ERROR: exclusive backup mode is no longer supported

> That will require everyone to use the three-argument form of
> pg_start_backup(), but I think that's good, because if we accept the
> one and two argument forms, how do we actually know that the user
> updated their code?  If you want to actually force people to switch to
> the non-exclusive mode, you have to make sure that anything that might
> be a residual request for exclusive backup mode errors out.  If the
> same SQL just does something different, the user might fail to notice.

This is an argument that I can agree with though.  We certainly don't
want existing users of the exclusive mode to think "everything is fine!"
after it's been ripped out.  I hate having to leave old API cruft around
like an argument that's required to be passed in and required to have a
specific value but it does seem the best way to make it extremely clear
to users of the exclusive mode that they really have to adjust their
code.

Perhaps in some future version we'll redefine the API more broadly and
be able to rip out this API entirely, throwing away such old cruft as we
do it, but until then I think we can live with it.

Thanks!

Stephen

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
Laurenz Albe
Date:
Stephen Frost wrote:
> * Peter Eisentraut (peter.eisentraut@2ndquadrant.com) wrote:
> > Clearly, not having to do that at all is better, but if this is all
> > there is to it, then I'm confused by the characterizations of how awful
> > and terrible this feature is and how we must rush to remove it.
> 
> It's not all there is to it though.
> 
> This issue leads to extended downtime regularly and is definitely a huge
> 'gotcha' for users, even if you don't want to call it outright broken,

Only if PostgreSQL crashes regularly, right?

Yours,
Laurenz Albe



Re: Remove Deprecated Exclusive Backup Mode

From
Stephen Frost
Date:
Greetings,

* Laurenz Albe (laurenz.albe@cybertec.at) wrote:
> Stephen Frost wrote:
> > * Peter Eisentraut (peter.eisentraut@2ndquadrant.com) wrote:
> > > Clearly, not having to do that at all is better, but if this is all
> > > there is to it, then I'm confused by the characterizations of how awful
> > > and terrible this feature is and how we must rush to remove it.
> >
> > It's not all there is to it though.
> >
> > This issue leads to extended downtime regularly and is definitely a huge
> > 'gotcha' for users, even if you don't want to call it outright broken,
>
> Only if PostgreSQL crashes regularly, right?

This happens anytime a backup is in progress and the system crashes in
any way- PostgreSQL going down, the container Postgres is running in,
the server Postgres is running on, or if the filesystem that Postgres is
running on goes away, etc.

There's certainly no shortage of cases where this can happen.

Thanks!

Stephen

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
Andres Freund
Date:
On 2019-01-05 13:19:20 -0500, Stephen Frost wrote:
> Greetings,
> 
> * Peter Eisentraut (peter.eisentraut@2ndquadrant.com) wrote:
> > On 12/12/2018 05:31, Robert Haas wrote:
> > > Most of the features I've been involved in removing have been
> > > deprecated for 5+ years.  The first release where this one was
> > > deprecated was only 2 years ago.  So it feels dramatically faster to
> > > me than what I think we have typically done.
> > 
> > I was just looking this up as well, and I find it too fast.  The
> > nonexclusive backups were introduced in 9.6.  So I'd say that we could
> > remove the exclusive ones when 9.5 goes EOL.  (That would mean this
> > patch could be submitted for PostgreSQL 13, since 9.5 will go out of
> > support around the time PG13 would be released.)
> 
> I don't agree with either the notion that we have to wait 5 years in
> this case or that we've only had a good alternative to the exclusive
> backup mode since 9.5 as we've had pg_basebackup since 9.1.

I don't agree with a general 5 year deprecation window either. But it
seems pretty clear that there's no majority for removing exclusive
backups in v12.   I think it'd be good to make the warning about its
impending death more explicit, but otherwise mark this CF entry either
as rejected or returned with feedback.

Greetings,

Andres Freund


Re: Remove Deprecated Exclusive Backup Mode

From
David Steele
Date:
On 2/16/19 5:57 AM, Andres Freund wrote:
> On 2019-01-05 13:19:20 -0500, Stephen Frost wrote:
>> Greetings,
>>
>> * Peter Eisentraut (peter.eisentraut@2ndquadrant.com) wrote:
>>> On 12/12/2018 05:31, Robert Haas wrote:
>>>> Most of the features I've been involved in removing have been
>>>> deprecated for 5+ years.  The first release where this one was
>>>> deprecated was only 2 years ago.  So it feels dramatically faster to
>>>> me than what I think we have typically done.
>>>
>>> I was just looking this up as well, and I find it too fast.  The
>>> nonexclusive backups were introduced in 9.6.  So I'd say that we could
>>> remove the exclusive ones when 9.5 goes EOL.  (That would mean this
>>> patch could be submitted for PostgreSQL 13, since 9.5 will go out of
>>> support around the time PG13 would be released.)
>>
>> I don't agree with either the notion that we have to wait 5 years in
>> this case or that we've only had a good alternative to the exclusive
>> backup mode since 9.5 as we've had pg_basebackup since 9.1.
> 
> I don't agree with a general 5 year deprecation window either. But it
> seems pretty clear that there's no majority for removing exclusive
> backups in v12.   I think it'd be good to make the warning about its
> impending death more explicit, but otherwise mark this CF entry either
> as rejected or returned with feedback.

I think there is support for the patch in PG13 so I was planning to move 
it out of the March CF to the first PG13 CF as soon as the app will 
allow it, i.e., once there is only a single open CF.

Thanks,
-- 
-David
david@pgmasters.net


Re: Remove Deprecated Exclusive Backup Mode

From
Magnus Hagander
Date:
On Mon, Feb 18, 2019 at 6:13 AM David Steele <david@pgmasters.net> wrote:
On 2/16/19 5:57 AM, Andres Freund wrote:
> On 2019-01-05 13:19:20 -0500, Stephen Frost wrote:
>> Greetings,
>>
>> * Peter Eisentraut (peter.eisentraut@2ndquadrant.com) wrote:
>>> On 12/12/2018 05:31, Robert Haas wrote:
>>>> Most of the features I've been involved in removing have been
>>>> deprecated for 5+ years.  The first release where this one was
>>>> deprecated was only 2 years ago.  So it feels dramatically faster to
>>>> me than what I think we have typically done.
>>>
>>> I was just looking this up as well, and I find it too fast.  The
>>> nonexclusive backups were introduced in 9.6.  So I'd say that we could
>>> remove the exclusive ones when 9.5 goes EOL.  (That would mean this
>>> patch could be submitted for PostgreSQL 13, since 9.5 will go out of
>>> support around the time PG13 would be released.)
>>
>> I don't agree with either the notion that we have to wait 5 years in
>> this case or that we've only had a good alternative to the exclusive
>> backup mode since 9.5 as we've had pg_basebackup since 9.1.
>
> I don't agree with a general 5 year deprecation window either. But it
> seems pretty clear that there's no majority for removing exclusive
> backups in v12.   I think it'd be good to make the warning about its
> impending death more explicit, but otherwise mark this CF entry either
> as rejected or returned with feedback.

I think there is support for the patch in PG13 so I was planning to move
it out of the March CF to the first PG13 CF as soon as the app will
allow it, i.e., once there is only a single open CF.


Agreed, and I think we should also update the documentation for 12 with the suggested more explicit mention of the deprecation. 


--

Re: Remove Deprecated Exclusive Backup Mode

From
Stephen Frost
Date:
Greetings,

* Magnus Hagander (magnus@hagander.net) wrote:
> On Mon, Feb 18, 2019 at 6:13 AM David Steele <david@pgmasters.net> wrote:
> > On 2/16/19 5:57 AM, Andres Freund wrote:
> > > On 2019-01-05 13:19:20 -0500, Stephen Frost wrote:
> > >> * Peter Eisentraut (peter.eisentraut@2ndquadrant.com) wrote:
> > >>> On 12/12/2018 05:31, Robert Haas wrote:
> > >>>> Most of the features I've been involved in removing have been
> > >>>> deprecated for 5+ years.  The first release where this one was
> > >>>> deprecated was only 2 years ago.  So it feels dramatically faster to
> > >>>> me than what I think we have typically done.
> > >>>
> > >>> I was just looking this up as well, and I find it too fast.  The
> > >>> nonexclusive backups were introduced in 9.6.  So I'd say that we could
> > >>> remove the exclusive ones when 9.5 goes EOL.  (That would mean this
> > >>> patch could be submitted for PostgreSQL 13, since 9.5 will go out of
> > >>> support around the time PG13 would be released.)
> > >>
> > >> I don't agree with either the notion that we have to wait 5 years in
> > >> this case or that we've only had a good alternative to the exclusive
> > >> backup mode since 9.5 as we've had pg_basebackup since 9.1.
> > >
> > > I don't agree with a general 5 year deprecation window either. But it
> > > seems pretty clear that there's no majority for removing exclusive
> > > backups in v12.   I think it'd be good to make the warning about its
> > > impending death more explicit, but otherwise mark this CF entry either
> > > as rejected or returned with feedback.
> >
> > I think there is support for the patch in PG13 so I was planning to move
> > it out of the March CF to the first PG13 CF as soon as the app will
> > allow it, i.e., once there is only a single open CF.
>
> Agreed, and I think we should also update the documentation for 12 with the
> suggested more explicit mention of the deprecation.

+1.

Thanks!

Stephen

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
David Steele
Date:
On 2/22/19 6:32 PM, Stephen Frost wrote:
> * Magnus Hagander (magnus@hagander.net) wrote:
>> On Mon, Feb 18, 2019 at 6:13 AM David Steele <david@pgmasters.net> wrote:
>>> On 2/16/19 5:57 AM, Andres Freund wrote:

>>>> I don't agree with a general 5 year deprecation window either. But it
>>>> seems pretty clear that there's no majority for removing exclusive
>>>> backups in v12.   I think it'd be good to make the warning about its
>>>> impending death more explicit, but otherwise mark this CF entry either
>>>> as rejected or returned with feedback.
>>>
>>> I think there is support for the patch in PG13 so I was planning to move
>>> it out of the March CF to the first PG13 CF as soon as the app will
>>> allow it, i.e., once there is only a single open CF.
>>
>> Agreed, and I think we should also update the documentation for 12 with the
>> suggested more explicit mention of the deprecation.
> 
> +1.

I'm working on this for the March CF.  Ideally we would back-patch it to 
9.6.

-- 
-David
david@pgmasters.net


Re: Remove Deprecated Exclusive Backup Mode

From
Fujii Masao
Date:
On Tue, Nov 27, 2018 at 12:13 PM David Steele <david@pgmasters.net> wrote:
>
> Hackers,
>
> I propose we remove the deprecated exclusive backup mode of
> pg_start_backup()/pg_stop_backup() for Postgres 12.

-1 for the removal. I think that there are still many users of an exclusive
backup API, and it's not so easy to migrate their backup scripts so that
only non-exclusive one is used because of the reason I wrote in another thread.
https://postgr.es/m/CAHGQGwHUkEbkVexVfWNLjmq2rzOS_SHYMiECt+KBn-cBPq5Arg@mail.gmail.com

Regards,

-- 
Fujii Masao


Re: Remove Deprecated Exclusive Backup Mode

From
Robert Haas
Date:
On Fri, Feb 22, 2019 at 12:35 PM Fujii Masao <masao.fujii@gmail.com> wrote:
> -1 for the removal. I think that there are still many users of an exclusive
> backup API, and it's not so easy to migrate their backup scripts so that
> only non-exclusive one is used because of the reason I wrote in another thread.
> https://postgr.es/m/CAHGQGwHUkEbkVexVfWNLjmq2rzOS_SHYMiECt+KBn-cBPq5Arg@mail.gmail.com

Yeah, those are interesting points.  Unfortunately, I think something
as simple as your proposed...

psql -c "SELECT pg_start_backup()"
rsync, cp, tar, storage backup, or something
psql -c "SELECT pg_stop_backup()"

...doesn't have much chance of being correct.  If you aren't checking
whether pg_start_backup() throws an error, for example, you have got a
very failure-prone set-up.  That being said, it's possible to fix that
problem using some shell logic, whereas the problem of keeping a
connection open for the duration of the backup from the shell is much
harder.  I recently ran across a case where someone attempted it but
did not get the logic entirely right, with rather painful
consequences.

Now you might say that people should not write their own tools and
just use professionally produced ones.  But I don't really agree with
that, and whatever we think people SHOULD do, there are in fact a lot
of people using their own tools.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Remove Deprecated Exclusive Backup Mode

From
Stephen Frost
Date:
Greetings,

* Robert Haas (robertmhaas@gmail.com) wrote:
> On Fri, Feb 22, 2019 at 12:35 PM Fujii Masao <masao.fujii@gmail.com> wrote:
> > -1 for the removal. I think that there are still many users of an exclusive
> > backup API, and it's not so easy to migrate their backup scripts so that
> > only non-exclusive one is used because of the reason I wrote in another thread.
> > https://postgr.es/m/CAHGQGwHUkEbkVexVfWNLjmq2rzOS_SHYMiECt+KBn-cBPq5Arg@mail.gmail.com
>
> Yeah, those are interesting points.  Unfortunately, I think something
> as simple as your proposed...
>
> psql -c "SELECT pg_start_backup()"
> rsync, cp, tar, storage backup, or something
> psql -c "SELECT pg_stop_backup()"
>
> ...doesn't have much chance of being correct.  If you aren't checking
> whether pg_start_backup() throws an error, for example, you have got a
> very failure-prone set-up.  That being said, it's possible to fix that
> problem using some shell logic, whereas the problem of keeping a
> connection open for the duration of the backup from the shell is much
> harder.  I recently ran across a case where someone attempted it but
> did not get the logic entirely right, with rather painful
> consequences.
>
> Now you might say that people should not write their own tools and
> just use professionally produced ones.  But I don't really agree with
> that, and whatever we think people SHOULD do, there are in fact a lot
> of people using their own tools.

Yes, they are- but I'd argue that's specifically because the
old/deprecated/broken API makes it *look* safe, even though it isn't.

Getting a solid and resiliant backup to work from a shell script is, imv
anyway (though I might have a bit of experience, having tried numerous
times myself and then realizing that it just isn't practical...), a
downright fool's errand.

Perhaps there's some way we can fix that so that people can write a lazy
rsync-based backup script and use it, but I don't see anyone seriously
working on that and I don't think that's because they just really like
to make things difficult for people but rather because it really is
*hard* to do, if not downright impossible.

What we need is a backup tool included in core that users feel
comfortable using instead of trying to write their own.  I don't think
they write their own because it's somehow better than the existing tools
or because they really think it's a good idea- it's just what looks like
the "thing to do" based on our documentation and the API we provide
today.  Sure, pg_basebackup is good for smaller environments and I
always encourage people to use it when they want something that's in
core that is safe to use and that does things correctly, but it's got
some pretty serious limitations due to using the replication protocol.

Thanks!

Stephen

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
Christophe Pettus
Date:

> On Feb 22, 2019, at 15:18, Stephen Frost <sfrost@snowman.net> wrote:
> Getting a solid and resiliant backup to work from a shell script is, imv
> anyway (though I might have a bit of experience, having tried numerous
> times myself and then realizing that it just isn't practical...), a
> downright fool's errand.

The reality, though, is that there are a lot of organizations who have invested time and effort into getting a backup
strategyworking using the existing APIs, and there will be quite a bit of pushback against the version in which the
existingexclusive API is removed. 

Some of those will be able to move to non-exclusive backups easily; others won't.  For the ones that can't move easily,
thereaction will not be, "PostgreSQL version x has a safer backup API"; it will be "PostgreSQL version x broke our
backups,so we're not upgrading to it." 

Rather than deprecate the existing API, I'd rather see the documentation updated to discuss the danger cases.
--
-- Christophe Pettus
   xof@thebuild.com



Re: Remove Deprecated Exclusive Backup Mode

From
Stephen Frost
Date:
Greetings,

* Christophe Pettus (xof@thebuild.com) wrote:
> > On Feb 22, 2019, at 15:18, Stephen Frost <sfrost@snowman.net> wrote:
> > Getting a solid and resiliant backup to work from a shell script is, imv
> > anyway (though I might have a bit of experience, having tried numerous
> > times myself and then realizing that it just isn't practical...), a
> > downright fool's errand.
>
> The reality, though, is that there are a lot of organizations who have invested time and effort into getting a backup
strategyworking using the existing APIs, and there will be quite a bit of pushback against the version in which the
existingexclusive API is removed. 

Do they realize how that existing backup strategy is flawed?  I doubt
it, and when they discover how it's broken, I don't think they're going
to be thanking us for letting them continue to use it.

> Some of those will be able to move to non-exclusive backups easily; others won't.  For the ones that can't move
easily,the reaction will not be, "PostgreSQL version x has a safer backup API"; it will be "PostgreSQL version x broke
ourbackups, so we're not upgrading to it." 

We don't cater to this line of argument when it comes to breaking
changes in the backend, or when we break monitoring scripts, and I don't
see a reason why we should do so here.  They aren't required to upgrade
immediately- we provide 5 *years* of support for major versions which we
release, and we're going to give advance warning of this change that's
even beyond that 5 years.  I have essentially zero sympathy for
organizations which refuse to allocate even the bit of time necessary to
address this change for over 5+ years.

> Rather than deprecate the existing API, I'd rather see the documentation updated to discuss the danger cases.

Ok, then please do so, and please be prepared to continue to maintain
the documentation of both methods moving forward, because others have
tried and have (rightfully, in my opinion) decided that it's frankly not
worth the effort and ultimately just terribly confusing for users that
we have these two different backup methods and even just updating the
documentation for one or the other is downright painful (to the point
that people litterally give up on it).  That really isn't a good place
to be in.

Thanks!

Stephen

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
Andres Freund
Date:
Hi,,

On 2019-02-24 11:52:54 -0800, Christophe Pettus wrote:
> > On Feb 22, 2019, at 15:18, Stephen Frost <sfrost@snowman.net> wrote:
> > Getting a solid and resiliant backup to work from a shell script is, imv
> > anyway (though I might have a bit of experience, having tried numerous
> > times myself and then realizing that it just isn't practical...), a
> > downright fool's errand.
> 
> The reality, though, is that there are a lot of organizations who have
> invested time and effort into getting a backup strategy working using
> the existing APIs, and there will be quite a bit of pushback against
> the version in which the existing exclusive API is removed.
> 
> Some of those will be able to move to non-exclusive backups easily;
> others won't.  For the ones that can't move easily, the reaction will
> not be, "PostgreSQL version x has a safer backup API"; it will be
> "PostgreSQL version x broke our backups, so we're not upgrading to
> it."

I think it also depends on how we remove exclusive backups. If we do it
by adding a 'pg_run_hot_backup --conection-options -- shell_command_to_copy'
it ought to be pretty simple to fix scripts. Whereas right now, with the
requirement to manually have a postgres connection alive, which is
annoying to do from shell, is not the case.

I think if we'd introduced a command like that, we'd have seen a higher
adoption of non-exclusive backups.


> Rather than deprecate the existing API, I'd rather see the documentation updated to discuss the danger cases.

You can't work around some of danger cases (crash in the wrong moment,
your database doesn't come up properly), so I don't think docs really
address the problem.

Greetings,

Andres Freund


Re: Remove Deprecated Exclusive Backup Mode

From
Christophe Pettus
Date:

> On Feb 24, 2019, at 12:00, Stephen Frost <sfrost@snowman.net> wrote:
>
> Do they realize how that existing backup strategy is flawed?

Undoubtedly, some do, some don't.  However, given that it has been the *only* backup API for a very long time, many
organizationshave spent a lot of time closing all of the holes. 

It's not impossible to do safe backups with the existing API.  Unquestionably, there are installations doing backups
thatmight end up with a silently badly one, but it's entirely possible to do that unsafely (in many of the same ways)
withthe non-exclusively API. 

The installations that need to fix the scripts are also exactly the ones that can't use pg_basebackup or another
pre-packagedsolution, usually because they have a specific way of taking the file system copy (SAN snapshot, etc.) that
thosedon't support. 

> We don't cater to this line of argument when it comes to breaking
> changes in the backend, or when we break monitoring scripts, and I don't
> see a reason why we should do so here.

Those cases are not analogous.

1. Backend APIs are declared non-stable, and do not have a wide audience compared to backing up the database.
2. Monitoring scripts, while important, are not as critical as the backup system.  (And, in fact, I didn't agree with
breakingthose views either, but that's another discussion.) 

> Ok, then please do so, and please be prepared to continue to maintain
> the documentation of both methods moving forward, because others have
> tried and have (rightfully, in my opinion) decided that it's frankly not
> worth the effort and ultimately just terribly confusing for users that
> we have these two different backup methods and even just updating the
> documentation for one or the other is downright painful (to the point
> that people litterally give up on it).

We're going to have to do that anyway.  For as long as we are maintaining the documentation on a version that has both
APIs,we're going to have to say, "Don't use this one, and *here's why*."  Saying, "Don't use this one because we said
so"when it is an API of long standing that works just as it always did isn't going to cut it. 

--
-- Christophe Pettus
   xof@thebuild.com



Re: Remove Deprecated Exclusive Backup Mode

From
Stephen Frost
Date:
Greetings,

* Christophe Pettus (xof@thebuild.com) wrote:
> > On Feb 24, 2019, at 12:00, Stephen Frost <sfrost@snowman.net> wrote:
> >
> > Do they realize how that existing backup strategy is flawed?
>
> Undoubtedly, some do, some don't.  However, given that it has been the *only* backup API for a very long time, many
organizationshave spent a lot of time closing all of the holes. 

No, it *hasn't* been the only backup API for a long time- that was only
true up until 9.6 was released, since then we've had both, and it's made
everything a downright mess because of exactly these arguments.

> It's not impossible to do safe backups with the existing API.  Unquestionably, there are installations doing backups
thatmight end up with a silently badly one, but it's entirely possible to do that unsafely (in many of the same ways)
withthe non-exclusively API. 

Yes, it *is* impossible to do safe backups with the existing API.  There
is an unquestionable race condition where a system restart will cause
your system to not come back up without you going in and removing the
backup_label file- and the only way you make that race window small is
to remove the backup_label file right after you run pg_start_backup and
copy it, and then PUT IT BACK at the end before you call pg_stop_backup,
which is insane, but otherwise the 'race window' is the ENTIRE length of
the backup.

This is the reason the non-exclusive backup mode *exists*.  All of the
effort that went into building the non-exclusive backup mode wasn't done
just for the fun of it.

> The installations that need to fix the scripts are also exactly the ones that can't use pg_basebackup or another
pre-packagedsolution, usually because they have a specific way of taking the file system copy (SAN snapshot, etc.) that
thosedon't support. 

I'm not going to bother going into the issues that SAN snapshots have
here, because it's largely orthogonal to this discussion.  What I'll say
is that they'll have over 5 years to adjust those scripts and I don't
see an issue with that.

> > We don't cater to this line of argument when it comes to breaking
> > changes in the backend, or when we break monitoring scripts, and I don't
> > see a reason why we should do so here.
>
> Those cases are not analogous.

I disagree *entirely*.

> 1. Backend APIs are declared non-stable, and do not have a wide audience compared to backing up the database.

*ALL* of our APIs are declared non-stable between major releases.
That's why we have *major* and *minor* releases.

> 2. Monitoring scripts, while important, are not as critical as the backup system.  (And, in fact, I didn't agree with
breakingthose views either, but that's another discussion.) 

Agreed, we should care even more about making sure that the tools we
provide to perform a backup are reliable, which is a good reason to
remove the broken and deprecated exclusive API.

> > Ok, then please do so, and please be prepared to continue to maintain
> > the documentation of both methods moving forward, because others have
> > tried and have (rightfully, in my opinion) decided that it's frankly not
> > worth the effort and ultimately just terribly confusing for users that
> > we have these two different backup methods and even just updating the
> > documentation for one or the other is downright painful (to the point
> > that people litterally give up on it).
>
> We're going to have to do that anyway.  For as long as we are maintaining the documentation on a version that has
bothAPIs, we're going to have to say, "Don't use this one, and *here's why*."  Saying, "Don't use this one because we
saidso" when it is an API of long standing that works just as it always did isn't going to cut it. 

We have existing documentation for the versions that have both APIs.  It
sucks, it's horribly complicated, and no one wants to work on it because
of that.

Sure, it'd be nice if someone would fix it to list out the problems with
the exclusive API, but that requires someone wanting to spend time on
it.  If that's not going to happen, I certainly don't think that means
we should *keep* it.

THanks!

Stephen

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
Laurenz Albe
Date:
Andres Freund wrote:
> On 2019-02-24 11:52:54 -0800, Christophe Pettus wrote:
> > > On Feb 22, 2019, at 15:18, Stephen Frost <sfrost@snowman.net> wrote:
> > > Getting a solid and resiliant backup to work from a shell script is, imv
> > > anyway (though I might have a bit of experience, having tried numerous
> > > times myself and then realizing that it just isn't practical...), a
> > > downright fool's errand.

Here is my run with the fool's cap on:
https://github.com/cybertec-postgresql/safe-backup

> > The reality, though, is that there are a lot of organizations who have
> > invested time and effort into getting a backup strategy working using
> > the existing APIs, and there will be quite a bit of pushback against
> > the version in which the existing exclusive API is removed.
> > 
> > Some of those will be able to move to non-exclusive backups easily;
> > others won't.  For the ones that can't move easily, the reaction will
> > not be, "PostgreSQL version x has a safer backup API"; it will be
> > "PostgreSQL version x broke our backups, so we're not upgrading to
> > it."

From the reactions I see in the field, I'd agree.

> I think it also depends on how we remove exclusive backups. If we do it
> by adding a 'pg_run_hot_backup --conection-options -- shell_command_to_copy'
> it ought to be pretty simple to fix scripts. Whereas right now, with the
> requirement to manually have a postgres connection alive, which is
> annoying to do from shell, is not the case.
> 
> I think if we'd introduced a command like that, we'd have seen a higher
> adoption of non-exclusive backups.

I think that is a good idea and will do the trick for many people.

It will be harder for those whose backup solution is driven by
a central backup software that backs up the file system and just offers
"pre-backup" and "post-backup" hooks.

Yours,
Laurenz Albe



Re: Remove Deprecated Exclusive Backup Mode

From
Laurenz Albe
Date:
Stephen Frost wrote:
> Yes, it *is* impossible to do safe backups with the existing API.  There
> is an unquestionable race condition where a system restart will cause
> your system to not come back up without you going in and removing the
> backup_label file- and the only way you make that race window small is
> to remove the backup_label file right after you run pg_start_backup and
> copy it, and then PUT IT BACK at the end before you call pg_stop_backup,
> which is insane, but otherwise the 'race window' is the ENTIRE length of
> the backup.

I just have an idea:

What about an option to keep WAL around for the duration of an exclusive backup?

That way PostgreSQL can still restart after a crash.  It will take longer than
expected, but it will work.  But then, perhaps the long recovery time is only
marginally better than having to manually delete the backup_label file...

Yours,
Laurenz Albe



Re: Remove Deprecated Exclusive Backup Mode

From
Christophe Pettus
Date:

> On Feb 24, 2019, at 13:00, Stephen Frost <sfrost@snowman.net> wrote:
>
> No, it *hasn't* been the only backup API for a long time- that was only
> true up until 9.6 was released, since then we've had both, and it's made
> everything a downright mess because of exactly these arguments.

9.6 has been out for about 2.5 years.  How long was the old version of pg_start_backup() the only interface?
Considerablylonger. 

> Yes, it *is* impossible to do safe backups with the existing API.

That's an overstatement.  "The existing interface has failure conditions that are hard to work around" is true.  Saying
"it'simpossible to do safe backups" implies that no pre-9.6 system has ever been backed up.  If we take that line, it's
justgoing to create a "Oh, come *on*" reaction from users. 

> What I'll say
> is that they'll have over 5 years to adjust those scripts and I don't
> see an issue with that.

That's not quite the situation.  What they will have is a choice between upgrading, and or their existing scripts
continuingto work.  That choice will start ticking the instant a release without the old interface comes out.  The
back-pressureon upgrading is already very high; this is going to put a lot of installations into a bind, and a lot will
makethe decision not to upgrade because of that. 

> *ALL* of our APIs are declared non-stable between major releases.
> That's why we have *major* and *minor* releases.

I note that a word game is being played here.  We're calling pg_start_backup() "an API", as if it is the same as, say,
theGiST C API, but it's a SQL level construct.  SQL level constructs are not promises we need to keep forever, of
course,but they're not the same class of promise as a C API. 

> Sure, it'd be nice if someone would fix it to list out the problems with
> the exclusive API, but that requires someone wanting to spend time on
> it.

I'll take a turn at it.

--
-- Christophe Pettus
   xof@thebuild.com



Re: Remove Deprecated Exclusive Backup Mode

From
Stephen Frost
Date:
Greetings,

* Laurenz Albe (laurenz.albe@cybertec.at) wrote:
> Stephen Frost wrote:
> > Yes, it *is* impossible to do safe backups with the existing API.  There
> > is an unquestionable race condition where a system restart will cause
> > your system to not come back up without you going in and removing the
> > backup_label file- and the only way you make that race window small is
> > to remove the backup_label file right after you run pg_start_backup and
> > copy it, and then PUT IT BACK at the end before you call pg_stop_backup,
> > which is insane, but otherwise the 'race window' is the ENTIRE length of
> > the backup.
>
> I just have an idea:
>
> What about an option to keep WAL around for the duration of an exclusive backup?
>
> That way PostgreSQL can still restart after a crash.  It will take longer than
> expected, but it will work.  But then, perhaps the long recovery time is only
> marginally better than having to manually delete the backup_label file...

I'm afraid that we'd end up with many, many complaints about people
running out of disk space on WAL when they are trying to take a backup..

I do applaud your efforts to think of a better solution but I'm afraid
that isn't really workable.  While crashing with a backup_label in place
definitely sucks and makes recovering from that not fun, it's probably
better than having people run out of disk space and having the system
PANIC from that during what would otherwise be perfectly normal
operation.

That would also seem like a bit of an odd difference between the
exclusive and non-exclusive backup methods... and another things we'd
have to write up documentation for if we kept both methods around to try
and explain to users and that is just not a pleasant thought.

Thanks!

Stephen

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
Stephen Frost
Date:
Greetings,

* Christophe Pettus (xof@thebuild.com) wrote:
> > On Feb 24, 2019, at 13:00, Stephen Frost <sfrost@snowman.net> wrote:
> >
> > No, it *hasn't* been the only backup API for a long time- that was only
> > true up until 9.6 was released, since then we've had both, and it's made
> > everything a downright mess because of exactly these arguments.
>
> 9.6 has been out for about 2.5 years.  How long was the old version of pg_start_backup() the only interface?
Considerablylonger. 

Right, and PG12 will be out for another *5* years beyond that, meaning
people will have had 8.5 years to move from the exclusive API to the
non-exclusive one.

> > Yes, it *is* impossible to do safe backups with the existing API.
>
> That's an overstatement.  "The existing interface has failure conditions that are hard to work around" is true.
Saying"it's impossible to do safe backups" implies that no pre-9.6 system has ever been backed up.  If we take that
line,it's just going to create a "Oh, come *on*" reaction from users. 

Every one of the exclusive backups that was taken wasn't *safe*, and a
crash during any of them (which we've certainly heard plenty about
through various support channels over the years...) would have resulted
in a failure to properly restart.

No, I'm not saying that such backups are corrupted, *that* is an
overstatement, but it isn't actually what I'm saying, just a strawman
you've come up with to argue against.

> > What I'll say
> > is that they'll have over 5 years to adjust those scripts and I don't
> > see an issue with that.
>
> That's not quite the situation.  What they will have is a choice between upgrading, and or their existing scripts
continuingto work.  That choice will start ticking the instant a release without the old interface comes out.  The
back-pressureon upgrading is already very high; this is going to put a lot of installations into a bind, and a lot will
makethe decision not to upgrade because of that. 

That's fine- they can push off upgrading, for as long as 5 years, and
we'll still support them.  That's the model we operate under for
everything, this isn't any different in that regard and I don't have
sympathy for people who insist on saying that *this*, just *this* one
API of *all* the ones we have simply *can't* be changed *ever*.

> > *ALL* of our APIs are declared non-stable between major releases.
> > That's why we have *major* and *minor* releases.
>
> I note that a word game is being played here.  We're calling pg_start_backup() "an API", as if it is the same as,
say,the GiST C API, but it's a SQL level construct.  SQL level constructs are not promises we need to keep forever, of
course,but they're not the same class of promise as a C API. 

I'm not following here, at all...  We're actually better, historically,
about not changing SQL-level constructs than C-level APIs, but we can
and do still change them, regularly, with every major release- including
making new reserved keywords and other very clearly potentially breaking
changes.

> > Sure, it'd be nice if someone would fix it to list out the problems with
> > the exclusive API, but that requires someone wanting to spend time on
> > it.
>
> I'll take a turn at it.

Just to be clear- I don't agree that simply documenting the issues with
it is a sufficient solution to make us keep it.  I'll certainly continue
to advocate for its removal, but it would be nice for people who are
currently using it to be aware of the issues, so I do think it'd be good
to have the documentation improved in that regard, just make sure to not
invalidate the documentation around non-exclusive backup while you're at
it.

Thanks!

Stephen

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
Christophe Pettus
Date:

> On Feb 24, 2019, at 13:44, Stephen Frost <sfrost@snowman.net> wrote:
>
> Right, and PG12 will be out for another *5* years beyond that, meaning
> people will have had 8.5 years to move from the exclusive API to the
> non-exclusive one.

The thing is that for 90% of installations, the clock will start ticking when the deprecation.  Until then, most of
themare not going to feel any pressure to make the change.  Most installations have a lot of other things on their
minds. They have the option of not upgrading, and that will be the option a lot of them take. 

> Every one of the exclusive backups that was taken wasn't *safe*, and a
> crash during any of them (which we've certainly heard plenty about
> through various support channels over the years...) would have resulted
> in a failure to properly restart.

Most installations don't routinely encounter this.  I know it happens, and it is definitely a problem, but the field is
notstrewn with corpses here.  The new interface is unquestionably an improvement, but let's not get our messaging amped
upto the point that it hurts our credibility. 

> No, I'm not saying that such backups are corrupted, *that* is an
> overstatement, but it isn't actually what I'm saying, just a strawman
> you've come up with to argue against.

That may not be what you are saying, but statements like "it *is* impossible to do safe backups with the existing API"
willbe heard that way.  Let's keep the messaging we give users accurate. 

> this isn't any different in that regard and I don't have
> sympathy for people who insist on saying that *this*, just *this* one
> API of *all* the ones we have simply *can't* be changed *ever*.

Sure, we can change anything.  It's whether this particular deprecation is a good idea.

> I'm not following here, at all...  We're actually better, historically,
> about not changing SQL-level constructs than C-level APIs

That's my point.  Calling this an "API" makes the change sound more routine than it is.  It's an "API" that a *lot* of
installationsdepend on. 

> I don't agree that simply documenting the issues with
> it is a sufficient solution to make us keep it.

Understood, but I think we need to document it no matter what.
--
-- Christophe Pettus
   xof@thebuild.com



Re: Remove Deprecated Exclusive Backup Mode

From
Stephen Frost
Date:
Greetings,

* Christophe Pettus (xof@thebuild.com) wrote:
> > On Feb 24, 2019, at 13:44, Stephen Frost <sfrost@snowman.net> wrote:
> > Right, and PG12 will be out for another *5* years beyond that, meaning
> > people will have had 8.5 years to move from the exclusive API to the
> > non-exclusive one.
>
> The thing is that for 90% of installations, the clock will start ticking when the deprecation.  Until then, most of
themare not going to feel any pressure to make the change.  Most installations have a lot of other things on their
minds. They have the option of not upgrading, and that will be the option a lot of them take. 

Just to be clear- it was *already* deprecated, we are just talking here
about making that more clear/obvious.

And, again, they're welcome to not upgrade for as long as they'd like,
frankly, but we'll only provide back-patches for 5 years.

> > Every one of the exclusive backups that was taken wasn't *safe*, and a
> > crash during any of them (which we've certainly heard plenty about
> > through various support channels over the years...) would have resulted
> > in a failure to properly restart.
>
> Most installations don't routinely encounter this.  I know it happens, and it is definitely a problem, but the field
isnot strewn with corpses here.  The new interface is unquestionably an improvement, but let's not get our messaging
ampedup to the point that it hurts our credibility. 
>
> > No, I'm not saying that such backups are corrupted, *that* is an
> > overstatement, but it isn't actually what I'm saying, just a strawman
> > you've come up with to argue against.
>
> That may not be what you are saying, but statements like "it *is* impossible to do safe backups with the existing
API"will be heard that way.  Let's keep the messaging we give users accurate. 
>
> > this isn't any different in that regard and I don't have
> > sympathy for people who insist on saying that *this*, just *this* one
> > API of *all* the ones we have simply *can't* be changed *ever*.
>
> Sure, we can change anything.  It's whether this particular deprecation is a good idea.

You say above that the new interface is unquestionably an improvement
and here say that we shouldn't deprecate the old one in favor of it
(even though we actually already have... but that's beside the point I'm
trying to make here), so what you're advocating for is that we keep an
old and known broken interface that we know causes real issues even
after we've developed a new and unquestionably better one.

For my 2c, I think keeping the old interface around at all was a mistake
from the start, just the same as things like pg_user have been kept
around forever because of fear of breaking things, even as we rip out
the recovery.conf file or rename tons of things in the catalog across
major versions.

I really, honestly, believe what we need to *stop* doing is trying to
drag along support for old/deprecated interfaces after we've introduced
new ones, thus avoiding these arguments and the time spent on them, and
the time spent dealing with implementing and documenting new APIs around
old ones.  The only thing it seems to unquestionably do is make us argue
about when we are going to *finally* rip out the old/deprecated API (or
interface, or whatever you want to call it).

This is the new "should we call it PG 10.0 or PG 9.7 this time?"
argument, all over again...

> > I'm not following here, at all...  We're actually better, historically,
> > about not changing SQL-level constructs than C-level APIs
>
> That's my point.  Calling this an "API" makes the change sound more routine than it is.  It's an "API" that a *lot*
ofinstallations depend on. 

A lot of them depended on pg_wal being named pg_xlog too, but we seem to
have managed reasonably well through that, not to mention all the
catalog changes that caused issues for monitoring, etc.

> > I don't agree that simply documenting the issues with
> > it is a sufficient solution to make us keep it.
>
> Understood, but I think we need to document it no matter what.

Sure, go for it.

Thanks!

Stephen

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
Andres Freund
Date:
Hi,

On 2019-02-24 17:19:22 -0500, Stephen Frost wrote:
> You say above that the new interface is unquestionably an improvement

FWIW, I think we didn't design it even remotely as well as we ought to
have. It was both a mistake to not offer a version of non-exclusive
backups that works with a client connection (for integration with the
TSMs of this world), and it was a mistake not to offer a commandline
tool that does the non-exclusive pg/start backup.


> I really, honestly, believe what we need to *stop* doing is trying to
> drag along support for old/deprecated interfaces after we've introduced
> new ones, thus avoiding these arguments and the time spent on them, and
> the time spent dealing with implementing and documenting new APIs around
> old ones.  The only thing it seems to unquestionably do is make us argue
> about when we are going to *finally* rip out the old/deprecated API (or
> interface, or whatever you want to call it).

While I agree that we should remove non-exclusive backups, I VEHEMENTLY
oppose the unconditionality of the above statement. Yes, it's annoying
to keep interfaces around, but unnecessarily inflicting pain to everyone
also is annoying.

That's not to say we shouldn't be better at announcing and then
following through on the deprecation of old interfaces.

Greetings,

Andres Freund


Re: Remove Deprecated Exclusive Backup Mode

From
Christophe Pettus
Date:

> On Feb 24, 2019, at 14:19, Stephen Frost <sfrost@snowman.net> wrote:
> You say above that the new interface is unquestionably an improvement
> and here say that we shouldn't deprecate the old one in favor of it
> (even though we actually already have... but that's beside the point I'm
> trying to make here), so what you're advocating for is that we keep an
> old and known broken interface that we know causes real issues even
> after we've developed a new and unquestionably better one.

Yes, I am advocating exactly that.  The reason that I think we need to keep the old one (or, at least, not remove it as
soonas 12) is that creating an obstacle to upgrades is worse than retaining the old one, and it *will* be an obstacle
toupgrades (or to using the community edition at all). 

I do think we need a simple, pg_basebackup-level-complexity hot backup tool that allows a forked copy operation, per
Andres'suggestion. 

There's no single answer to every possible situation like this that pops up; it depends on how widely used the
interfacewas, how reasonable the expectation of permanence is, and the difficulty the users will encounter in changing
tothe new one, along with the complexity of maintaining the documentation and code for the old interface. 

> A lot of them depended on pg_wal being named pg_xlog too, but we seem to
> have managed reasonably well through that, not to mention all the
> catalog changes that caused issues for monitoring, etc.

Some of the incompatible catalog changes (in particular, in pg_stat_activity) I thought were gratuitous, but we did
them,and no point in relitigating that now.  I'd say that the internal layout of PGDATA is fairly weak promise compared
toan SQL-level construct, especially one as widely used as pg_start_backup(). 
--
-- Christophe Pettus
   xof@thebuild.com



Re: Remove Deprecated Exclusive Backup Mode

From
Andres Freund
Date:
Hi,

On 2019-02-24 14:35:04 -0800, Christophe Pettus wrote:
> > On Feb 24, 2019, at 14:19, Stephen Frost <sfrost@snowman.net> wrote:
> > You say above that the new interface is unquestionably an improvement
> > and here say that we shouldn't deprecate the old one in favor of it
> > (even though we actually already have... but that's beside the point I'm
> > trying to make here), so what you're advocating for is that we keep an
> > old and known broken interface that we know causes real issues even
> > after we've developed a new and unquestionably better one.
> 
> Yes, I am advocating exactly that.  The reason that I think we need to
> keep the old one (or, at least, not remove it as soon as 12) is that
> creating an obstacle to upgrades is worse than retaining the old one,
> and it *will* be an obstacle to upgrades (or to using the community
> edition at all).

It sounds to me like you treat this as if having the old method around
had no downsides. But I have seen *numereous* downtimes due to it, and
also corruption triggered by it (people following the hint to remove
backup_label).


> > A lot of them depended on pg_wal being named pg_xlog too, but we seem to
> > have managed reasonably well through that, not to mention all the
> > catalog changes that caused issues for monitoring, etc.
> 
> Some of the incompatible catalog changes (in particular, in
> pg_stat_activity) I thought were gratuitous, but we did them, and no
> point in relitigating that now.  I'd say that the internal layout of
> PGDATA is fairly weak promise compared to an SQL-level construct,
> especially one as widely used as pg_start_backup().

I don't buy that, because you normally specifically should exclude
pg_xlog/pg_wal from basebackups.

Greetings,

Andres Freund


Re: Remove Deprecated Exclusive Backup Mode

From
David Steele
Date:
On 2/24/19 11:36 PM, Stephen Frost wrote:
> Greetings,
> 
> * Laurenz Albe (laurenz.albe@cybertec.at) wrote:
>> Stephen Frost wrote:
>>> Yes, it *is* impossible to do safe backups with the existing API.  There
>>> is an unquestionable race condition where a system restart will cause
>>> your system to not come back up without you going in and removing the
>>> backup_label file- and the only way you make that race window small is
>>> to remove the backup_label file right after you run pg_start_backup and
>>> copy it, and then PUT IT BACK at the end before you call pg_stop_backup,
>>> which is insane, but otherwise the 'race window' is the ENTIRE length of
>>> the backup.
>>
>> I just have an idea:
>>
>> What about an option to keep WAL around for the duration of an exclusive backup?
>>
>> That way PostgreSQL can still restart after a crash.  It will take longer than
>> expected, but it will work.  But then, perhaps the long recovery time is only
>> marginally better than having to manually delete the backup_label file...
> 
> I'm afraid that we'd end up with many, many complaints about people
> running out of disk space on WAL when they are trying to take a backup..

This would also require replaying all that WAL during crash recovery 
which could mean a much longer startup time.

-- 
-David
david@pgmasters.net


Re: Remove Deprecated Exclusive Backup Mode

From
David Steele
Date:

On 2/25/19 12:35 AM, Christophe Pettus wrote:
> 
> 
>> On Feb 24, 2019, at 14:19, Stephen Frost <sfrost@snowman.net> wrote:
>> You say above that the new interface is unquestionably an improvement
>> and here say that we shouldn't deprecate the old one in favor of it
>> (even though we actually already have... but that's beside the point I'm
>> trying to make here), so what you're advocating for is that we keep an
>> old and known broken interface that we know causes real issues even
>> after we've developed a new and unquestionably better one.
> 
> Yes, I am advocating exactly that.  The reason that I think we need to keep the old one (or, at least, not remove it
assoon as 12)...
 

Exclusive backup will not be removed for PG12.  There wasn't support for 
it so I push it out to PG13.

There does appear to be support for removing it in PG13, though, and I'd 
like to see that done sooner than later.


> I do think we need a simple, pg_basebackup-level-complexity hot backup tool that allows a forked copy operation, per
Andres'suggestion.
 

This is a good idea -- but I'm still a bit mystified why the ability to 
run a backup in shell script is considered to be a hard requirement. 
Just about any language you can name performs non-exclusive backups with 
ease -- except shell.  Perhaps the problem is with the language?

Regards,
-- 
-David
david@pgmasters.net


Re: Remove Deprecated Exclusive Backup Mode

From
Adrien NAYRAT
Date:
On 2/22/19 7:10 PM, Robert Haas wrote:
> On Fri, Feb 22, 2019 at 12:35 PM Fujii Masao <masao.fujii@gmail.com> wrote:
>> -1 for the removal. I think that there are still many users of an exclusive
>> backup API, and it's not so easy to migrate their backup scripts so that
>> only non-exclusive one is used because of the reason I wrote in another thread.
>> https://postgr.es/m/CAHGQGwHUkEbkVexVfWNLjmq2rzOS_SHYMiECt+KBn-cBPq5Arg@mail.gmail.com
> 
> Yeah, those are interesting points.  Unfortunately, I think something
> as simple as your proposed...
> 
> psql -c "SELECT pg_start_backup()"
> rsync, cp, tar, storage backup, or something
> psql -c "SELECT pg_stop_backup()"
> 
> ...doesn't have much chance of being correct.  If you aren't checking
> whether pg_start_backup() throws an error, for example, you have got a
> very failure-prone set-up.  That being said, it's possible to fix that
> problem using some shell logic, whereas the problem of keeping a
> connection open for the duration of the backup from the shell is much
> harder.  I recently ran across a case where someone attempted it but
> did not get the logic entirely right, with rather painful
> consequences.
> 
> Now you might say that people should not write their own tools and
> just use professionally produced ones.  But I don't really agree with
> that, and whatever we think people SHOULD do, there are in fact a lot
> of people using their own tools.
> 


FWIW +1 for not remove exclusive backup.

Maintain a connection during the backup is a hard pre-requisite. In my 
previous jobs I already done custom scripts to perform backup by using 
pre/post backup hook to control backup software: With vmware to do PITR 
backup with VM snapshot or with another file backup tool which perform 
block deduplication. I do not see where is the problem if you check 
pg_start_backup()/pg_stop_backup() went well?


It will be annoying if after this removal, companies must change their 
backup strategy by using specific postgres tools (pgbackrest, barman).


Regards,


Re: Remove Deprecated Exclusive Backup Mode

From
ilmari@ilmari.org (Dagfinn Ilmari Mannsåker)
Date:
David Steele <david@pgmasters.net> writes:

> On 2/25/19 12:35 AM, Christophe Pettus wrote:
>>
>>
>>> On Feb 24, 2019, at 14:19, Stephen Frost <sfrost@snowman.net> wrote:
>>> You say above that the new interface is unquestionably an improvement
>>> and here say that we shouldn't deprecate the old one in favor of it
>>> (even though we actually already have... but that's beside the point I'm
>>> trying to make here), so what you're advocating for is that we keep an
>>> old and known broken interface that we know causes real issues even
>>> after we've developed a new and unquestionably better one.
>>
>> Yes, I am advocating exactly that.  The reason that I think we need
>> to keep the old one (or, at least, not remove it as soon as 12)...
>
> Exclusive backup will not be removed for PG12.  There wasn't support for
> it so I push it out to PG13.

How about making the deprecation more obvious in PG12, by making
pg_start_backup() in exclusive mode issue a WARNING that it will be
removed in PG13?

We had a similar problem in Perl, where we got pushback for removing
deprecated features because they had been deprecated for so long (in
some cases over 20 years) with no sign of actually being removed.  We
alleviated this by making all deprecation warnings include the version
in which the feature would be removed (at least two major versions after
the one where the versioned warning was introduced).

Having this sort of clarity about what deprecation actually constitutes
is also useful when deciding whether to deprecate or merely discourage a
feature.

My two minor currency units,

- ilmari
-- 
"A disappointingly low fraction of the human race is,
 at any given time, on fire." - Stig Sandbeck Mathisen


Re: Remove Deprecated Exclusive Backup Mode

From
Stephen Frost
Date:
Greetings,

* Adrien NAYRAT (adrien.nayrat@anayrat.info) wrote:
> On 2/22/19 7:10 PM, Robert Haas wrote:
> >On Fri, Feb 22, 2019 at 12:35 PM Fujii Masao <masao.fujii@gmail.com> wrote:
> >>-1 for the removal. I think that there are still many users of an exclusive
> >>backup API, and it's not so easy to migrate their backup scripts so that
> >>only non-exclusive one is used because of the reason I wrote in another thread.
> >>https://postgr.es/m/CAHGQGwHUkEbkVexVfWNLjmq2rzOS_SHYMiECt+KBn-cBPq5Arg@mail.gmail.com
> >
> >Yeah, those are interesting points.  Unfortunately, I think something
> >as simple as your proposed...
> >
> >psql -c "SELECT pg_start_backup()"
> >rsync, cp, tar, storage backup, or something
> >psql -c "SELECT pg_stop_backup()"
> >
> >...doesn't have much chance of being correct.  If you aren't checking
> >whether pg_start_backup() throws an error, for example, you have got a
> >very failure-prone set-up.  That being said, it's possible to fix that
> >problem using some shell logic, whereas the problem of keeping a
> >connection open for the duration of the backup from the shell is much
> >harder.  I recently ran across a case where someone attempted it but
> >did not get the logic entirely right, with rather painful
> >consequences.
> >
> >Now you might say that people should not write their own tools and
> >just use professionally produced ones.  But I don't really agree with
> >that, and whatever we think people SHOULD do, there are in fact a lot
> >of people using their own tools.
>
> FWIW +1 for not remove exclusive backup.
>
> Maintain a connection during the backup is a hard pre-requisite. In my
> previous jobs I already done custom scripts to perform backup by using
> pre/post backup hook to control backup software: With vmware to do PITR
> backup with VM snapshot or with another file backup tool which perform block
> deduplication. I do not see where is the problem if you check
> pg_start_backup()/pg_stop_backup() went well?

This, really, is a large part of the problem- people don't realize and
just simply don't want to believe, it seems at least, that there's any
problem with what they're doing because it works some, or even most, of
the time.

Making a running PG cluster *look* like it was restored from a backup is
really bad and leads to a lot of confusion and cases where PG won't
start up, but that's what the exclusive pg_start_backup/pg_stop_backup
method does and is more-or-less *required* of any solution that wants to
just be able to tar up the data directory as the backup method, which is
why it ends up just being a bad interface overall.

And then there's other issues- things like making sure that you actually
got all of the WAL, without which the backup is inconsistent being the
leading one but there's others.

> It will be annoying if after this removal, companies must change their
> backup strategy by using specific postgres tools (pgbackrest, barman).

You don't write your own database system using CSV files and shell
magic, do you?  I have to say that it continues to boggle my mind that
people insist that *this* part of the system has to be able to be
implementable using shell scripts.

Folks, these are your backups we're talking about, your last resort if
everything else goes up in flames, why do you want to risk that by
implementing your own one-off solution, particularly when there's known
serious issues using that interface, and you want to just use shell
scripts to do it...?

Thanks!

Stephen

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
Stephen Frost
Date:
Greetings,

* Dagfinn Ilmari Mannsåker (ilmari@ilmari.org) wrote:
> David Steele <david@pgmasters.net> writes:
> > On 2/25/19 12:35 AM, Christophe Pettus wrote:
> >>> On Feb 24, 2019, at 14:19, Stephen Frost <sfrost@snowman.net> wrote:
> >>> You say above that the new interface is unquestionably an improvement
> >>> and here say that we shouldn't deprecate the old one in favor of it
> >>> (even though we actually already have... but that's beside the point I'm
> >>> trying to make here), so what you're advocating for is that we keep an
> >>> old and known broken interface that we know causes real issues even
> >>> after we've developed a new and unquestionably better one.
> >>
> >> Yes, I am advocating exactly that.  The reason that I think we need
> >> to keep the old one (or, at least, not remove it as soon as 12)...
> >
> > Exclusive backup will not be removed for PG12.  There wasn't support for
> > it so I push it out to PG13.
>
> How about making the deprecation more obvious in PG12, by making
> pg_start_backup() in exclusive mode issue a WARNING that it will be
> removed in PG13?

So..  We've more-or-less come full circle back to where the thread had
left off last time- the plan is to update the documentation to make it
clearer that the exclusive mode is deprecated and that it's going to be
removed, and then we'll actually remove it in PG13.

We could also add such a warning but I'm not entirely convinced it's a
good idea.  There was some earlier discussion about it that I'd probably
want to go back and reread first.

Thanks!

Stephen

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
Stephen Frost
Date:
Greetings,

* Andres Freund (andres@anarazel.de) wrote:
> On 2019-02-24 17:19:22 -0500, Stephen Frost wrote:
> > You say above that the new interface is unquestionably an improvement
>
> FWIW, I think we didn't design it even remotely as well as we ought to
> have. It was both a mistake to not offer a version of non-exclusive
> backups that works with a client connection (for integration with the
> TSMs of this world), and it was a mistake not to offer a commandline
> tool that does the non-exclusive pg/start backup.

I'm not really following- did you mean to say "without a client
connection" above?

We could still add some kind of commandline tool to help with the
non-exclusive pg start/stop backup but I'm not really sure exactly what
that would look like and I honestly don't have terribly much interest in
spending resources on implementing it since it would lack a great many
features that we'd then end up wanting to add on... and then we end up
backing into building yet another backup tool.

> > I really, honestly, believe what we need to *stop* doing is trying to
> > drag along support for old/deprecated interfaces after we've introduced
> > new ones, thus avoiding these arguments and the time spent on them, and
> > the time spent dealing with implementing and documenting new APIs around
> > old ones.  The only thing it seems to unquestionably do is make us argue
> > about when we are going to *finally* rip out the old/deprecated API (or
> > interface, or whatever you want to call it).
>
> While I agree that we should remove non-exclusive backups, I VEHEMENTLY
> oppose the unconditionality of the above statement. Yes, it's annoying
> to keep interfaces around, but unnecessarily inflicting pain to everyone
> also is annoying.

Alright, then how about we provide a bit of help for everyone who's got
a system built around recovery.conf today, instead of just completely
ripping that out?

If we're happy to do that then I really can't agree with these arguments
that there's things we should try to maintain when it comes to
interfaces, and that's far from the only example of a pretty massive
change that just went into version X with zero notice to users about
what they're using today being deprecated.

> That's not to say we shouldn't be better at announcing and then
> following through on the deprecation of old interfaces.

We announce things have changed every year, and people have five years
from that point, during which we'll continue to support them and fix
bugs and deal with security issues, to make whatever changes they need
to for the new interface.

The argument seems to be that people want new features but they don't
want to have to do any work to get those new features.  Instead, they
expect the community to take on the burden of maintaining those old
interfaces for them, so that they can get those new features.  That
seems like a pretty raw deal for the community and not one that I
really think we should be supporting, and it isn't like we're actually
consistent with it at all- except that we don't break the back-branches
and we do make breaking changes in major versions.

When is something too big of a change to make in a given major version
and we have to, instead, provide backwards compatibility for it?  What
is that policy?  How many releases do we have to support it for?  How do
we communicate that to our users, so that they know what they can depend
on being there across major releases and what they can't?

Thanks!

Stephen

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
Laurenz Albe
Date:
Stephen Frost wrote:
> > It will be annoying if after this removal, companies must change their
> > backup strategy by using specific postgres tools (pgbackrest, barman).
> 
> You don't write your own database system using CSV files and shell
> magic, do you?  I have to say that it continues to boggle my mind that
> people insist that *this* part of the system has to be able to be
> implementable using shell scripts.
> 
> Folks, these are your backups we're talking about, your last resort if
> everything else goes up in flames, why do you want to risk that by
> implementing your own one-off solution, particularly when there's known
> serious issues using that interface, and you want to just use shell
> scripts to do it...?

If we didn't think that simplicity in backup has some value, why does
PostgreSQL provide pg_basebackup?

Not everybody wants to use tools like pgbackrest and barman because it
takes some effort to set them up properly.  It seems that you think that
people who want something simpler are irresponsible.

Incidentally, both barman and pgbackrest stream the backup to a backup server.

I think the most important use case for the low level backup API is when
your database is so large that you want to leverage storage techniques
like snapshots for backup, because you can't or don't want to stream all
those terabytes across the network.

I'm not playing devil's advocate here to annoy you.  I see the problems
with the exclusive backup, and I see how it can hurt people.
I just think that removing exclusive backup without some kind of help
like Andres sketched above will make people unhappy.

Yours,
Laurenz Albe



Re: Remove Deprecated Exclusive Backup Mode

From
Stephen Frost
Date:
Greetings,

* Laurenz Albe (laurenz.albe@cybertec.at) wrote:
> Stephen Frost wrote:
> > > It will be annoying if after this removal, companies must change their
> > > backup strategy by using specific postgres tools (pgbackrest, barman).
> >
> > You don't write your own database system using CSV files and shell
> > magic, do you?  I have to say that it continues to boggle my mind that
> > people insist that *this* part of the system has to be able to be
> > implementable using shell scripts.
> >
> > Folks, these are your backups we're talking about, your last resort if
> > everything else goes up in flames, why do you want to risk that by
> > implementing your own one-off solution, particularly when there's known
> > serious issues using that interface, and you want to just use shell
> > scripts to do it...?
>
> If we didn't think that simplicity in backup has some value, why does
> PostgreSQL provide pg_basebackup?

I'm all for people using pg_basebackup and encourage them to do so, but,
really, that tool *is* too simple by itself: you need to combine it with
something else that handles data retention, backup validation, etc.

This could be some independent backup solution though, since
pg_basebackup gives you a simple set of files for the backup that can
then be backed up using those other backup tools.  (The same is true of
pgbackrest, btw, and people certainly do use pgbackrest to back up to a
repository and then back that repo up using other tools, but in this
case pgbackrest does have some of those data retention and backup
validation capabilities itself, so it isn't necessary to do that.)

> Not everybody wants to use tools like pgbackrest and barman because it
> takes some effort to set them up properly.  It seems that you think that
> people who want something simpler are irresponsible.

I don't mind simpler (we actually try pretty hard to make pgbackrest
simple to use, in fact...  the simplest config only requires a 4-line
config file), provided that it actually does everything that you
need a backup tool to do.  I do think it's an unfortunate reality that
not enough people take the time to think about what they really need
from a backup solution and spend the time to either find or build a
proper solution.

> Incidentally, both barman and pgbackrest stream the backup to a backup server.

I'm pretty sure both can backup to a local repository too (I know
pgbackrest can), and not just stream to a backup server, so I'm not
really sure what you're getting at here.

> I think the most important use case for the low level backup API is when
> your database is so large that you want to leverage storage techniques
> like snapshots for backup, because you can't or don't want to stream all
> those terabytes across the network.

Snapshots place an awful lot of faith in your storage layer-
particularly when you aren't doing any kind of validation that the
snapshot is in good shape and that there hasn't been any in-place latent
corruption, ever.  If you think about it, a snapshot is actually very
similar to just always doing incremental backups with pgbackrest and
never doing a new full backup or even a check of those old files that
haven't changed to see if they're still valid.  It's not an approach I'd
recommend depending on exclusively.

There's ways to address that, of course, such as building a manifest of
checksums of the files in the data directory, such that you can then
validate that the snapshot is correct, and detect if any corruption ends
up happening (and checksum the manifest as well), though you need to
also track the LSN of the pg_start_backup, so you can know that changes
which happened after that LSN are in the WAL that's captured during the
checksum, and you need a place to put all of this for when you want to
go back and re-validate an old snapshot..  and likely other things that
I'm forgetting at the moment, but it's definitely something we've
thought about and have considered how we might add support to pgbackrest
for working with snapshots (and it wouldn't surprise me if that got
implemented before v13 even..).

If you are happy to place complete faith in snapshots and trust that
they're entirely atomic, you can always forgo dealing with any of this
and just snapshot PGDATA and let PG go through crash recovery when you
restore.

> I'm not playing devil's advocate here to annoy you.  I see the problems
> with the exclusive backup, and I see how it can hurt people.
> I just think that removing exclusive backup without some kind of help
> like Andres sketched above will make people unhappy.

Keeping it will definitely make me unhappy. :)

I'm not against something simple being added to help with the new API,
or with keeping the old API if someone can figure out how to solve for
the known issues with it, just to be clear.  I do worry that the
'simple' thing we add either won't help in a lot of cases or will end up
being much more complex and difficult to use than we're thinking, but we
can discuss that if someone actually implements it, I suppose.

Thanks!

Stephen

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
Christophe Pettus
Date:

> On Feb 25, 2019, at 05:14, Stephen Frost <sfrost@snowman.net> wrote:
>
> You don't write your own database system using CSV files and shell
> magic, do you?  I have to say that it continues to boggle my mind that
> people insist that *this* part of the system has to be able to be
> implementable using shell scripts.

It's not complicated why: Backup has to interact with a lot more components of the overall environment than a CSV
export,and those components vary *wildly* from one installation to another, and are often over-contrained by local
policy.

--
-- Christophe Pettus
   xof@thebuild.com



Re: Remove Deprecated Exclusive Backup Mode

From
Christophe Pettus
Date:

> On Feb 25, 2019, at 05:18, Stephen Frost <sfrost@snowman.net> wrote:
> So..  We've more-or-less come full circle back to where the thread had
> left off last time- the plan is to update the documentation to make it
> clearer that the exclusive mode is deprecated and that it's going to be
> removed, and then we'll actually remove it in PG13.

That's not my position, certainly; I still object to its removal.

--
-- Christophe Pettus
   xof@thebuild.com



Re: Remove Deprecated Exclusive Backup Mode

From
Christophe Pettus
Date:

> On Feb 24, 2019, at 22:49, David Steele <david@pgmasters.net> wrote:
> This is a good idea -- but I'm still a bit mystified why the ability to run a backup in shell script is considered to
bea hard requirement. Just about any language you can name performs non-exclusive backups with ease -- except shell.
Perhapsthe problem is with the language? 

Many installations have a high degree of discomfort implementing system management functionality in something other
thanbash.  It can be for any number of reasons: previous bad experiences (*cough* RHEL/Python *cough*), the feeling
thattoo many dependencies are dragged in, or just lack of familiarity with them. 

--
-- Christophe Pettus
   xof@thebuild.com



Re: Remove Deprecated Exclusive Backup Mode

From
Stephen Frost
Date:
Greetings,

* Christophe Pettus (xof@thebuild.com) wrote:
> > On Feb 25, 2019, at 05:18, Stephen Frost <sfrost@snowman.net> wrote:
> > So..  We've more-or-less come full circle back to where the thread had
> > left off last time- the plan is to update the documentation to make it
> > clearer that the exclusive mode is deprecated and that it's going to be
> > removed, and then we'll actually remove it in PG13.
>
> That's not my position, certainly; I still object to its removal.

It wasn't my intent to outline that as your position, my apologies if it
came across that way.

I do believe that's still the plan though, based on both this and the
prior discussion, unless something else happens to address the known
issues.

Thanks!

Stephen

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
Robert Haas
Date:
On Sun, Feb 24, 2019 at 3:00 PM Stephen Frost <sfrost@snowman.net> wrote:
> Ok, then please do so, and please be prepared to continue to maintain
> the documentation of both methods moving forward, because others have
> tried and have (rightfully, in my opinion) decided that it's frankly not
> worth the effort and ultimately just terribly confusing for users that
> we have these two different backup methods and even just updating the
> documentation for one or the other is downright painful (to the point
> that people litterally give up on it).  That really isn't a good place
> to be in.

This, to me, is downright rude.  You don't get to tell other people
what to do.  Nor do you get to tell other people "hey, I'm going to
make this change that you don't like unless you agree to do the work
which I specify."  If you want to make a change, you have to build
consensus for that change.  If you can't get people to agree with your
proposed change, what happens is that you don't get to make that
change.  Whether other people choose to do any work that you and they
might happen to agree is valuable is up to them.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Remove Deprecated Exclusive Backup Mode

From
Robert Haas
Date:
On Mon, Feb 25, 2019 at 1:49 AM David Steele <david@pgmasters.net> wrote:
> Exclusive backup will not be removed for PG12.  There wasn't support for
> it so I push it out to PG13.
>
> There does appear to be support for removing it in PG13, though, and I'd
> like to see that done sooner than later.

Given the additional votes that have appeared since the previous
discussion, I am no longer convinced that there is a consensus for
ripping it out in PG13.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Remove Deprecated Exclusive Backup Mode

From
Stephen Frost
Date:
Greetings,

* Robert Haas (robertmhaas@gmail.com) wrote:
> On Sun, Feb 24, 2019 at 3:00 PM Stephen Frost <sfrost@snowman.net> wrote:
> > Ok, then please do so, and please be prepared to continue to maintain
> > the documentation of both methods moving forward, because others have
> > tried and have (rightfully, in my opinion) decided that it's frankly not
> > worth the effort and ultimately just terribly confusing for users that
> > we have these two different backup methods and even just updating the
> > documentation for one or the other is downright painful (to the point
> > that people litterally give up on it).  That really isn't a good place
> > to be in.
>
> This, to me, is downright rude.  You don't get to tell other people
> what to do.  Nor do you get to tell other people "hey, I'm going to
> make this change that you don't like unless you agree to do the work
> which I specify."  If you want to make a change, you have to build
> consensus for that change.  If you can't get people to agree with your
> proposed change, what happens is that you don't get to make that
> change.  Whether other people choose to do any work that you and they
> might happen to agree is valuable is up to them.

It wasn't my intent to be rude, and my apologies to you and Christophe
for it coming across that way.

I do think we can ask that people who wish to have a capability included
in PG (or continue to be included when there are serious known issues
with it...) be prepared to either build and maintain it themselves or to
convince someone else to do so (or both, and have a committer agree to
it).  That's how we've long operated and it wasn't my intent to imply
otherwise, but I agree that I could have said it in a nicer way to avoid
it coming across as telling Christophe what to do.

I'm also on board with building a consensus for making a change, but a
consensus does not mean that everyone must agree or be supportive of the
change.  There's also a practical side to things which is that if the
consensus seems to be "we are going to get rid of X" and someone wants
to work on X, we should probably make it pretty clear to them that
there's a consensus to remove it to allow them the option to decide if
they wish to still work on X, or not.  Similairly, I think it's good to
let people know who want to work on Y when Y is really hard, so that
they have some idea what they're getting into.

Thanks!

Stephen

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
Robert Haas
Date:
On Mon, Feb 25, 2019 at 8:42 AM Stephen Frost <sfrost@snowman.net> wrote:
> Alright, then how about we provide a bit of help for everyone who's got
> a system built around recovery.conf today, instead of just completely
> ripping that out?
>
> If we're happy to do that then I really can't agree with these arguments
> that there's things we should try to maintain when it comes to
> interfaces, and that's far from the only example of a pretty massive
> change that just went into version X with zero notice to users about
> what they're using today being deprecated.

It's not the same thing.  First of all, the recovery.conf changes have
been under discussion for quite a few years now, whereas this change
has been under discussion for only a few months.  It is also worth
noting that those recovery.conf changes would've been made years ago
if people hadn't objected to breaking backward compatibility; people
have just as much of a right to object to these changes on those
grounds as they did to those changes.  Second, the recovery.conf
changes are intended to provide a tangible benefit - namely, getting
the recovery.conf settings into the GUC system, so that they can be
queried and changed using methods available for other GUCs.  But the
change proposed here breaks things for people who are using the
mechanism in question and it doesn't give them anything else instead.
Indeed, it seems quite obvious that you don't have the LEAST interest
in putting any effort into actually making things in this area better
for users; it appears that you just want to break what they're doing
now and shove the work of coping with those changes onto them.
Christophe is quite right to question whether that will cause users
not to upgrade.  I think it's overtly user-hostile.  You have referred
to the effort of maintaining this code, but you haven't pointed to any
tangible improvement that you'd like to make in this area that is
actually being blocked by the existence of exclusive backup mode.
It's all just griping about how terrible the whole thing is, and it
seems that there are actually quite a few people here who find it less
terrible than you think it is.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Remove Deprecated Exclusive Backup Mode

From
Robert Haas
Date:
On Fri, Feb 22, 2019 at 6:18 PM Stephen Frost <sfrost@snowman.net> wrote:
> What we need is a backup tool included in core that users feel
> comfortable using instead of trying to write their own.

I agree.  That's a great idea.  Let's talk about how to make that
happen.  Providing a tool that gives people MORE AND BETTER options
than what they have today is likely to make users more happy.
Removing an option that people are currently using, and which they
find better than other available options for reasons with which I
understand that you disagree, will make users more sad.  Happy is
better.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Remove Deprecated Exclusive Backup Mode

From
Stephen Frost
Date:
Greetings,

* Robert Haas (robertmhaas@gmail.com) wrote:
> On Mon, Feb 25, 2019 at 8:42 AM Stephen Frost <sfrost@snowman.net> wrote:
> > Alright, then how about we provide a bit of help for everyone who's got
> > a system built around recovery.conf today, instead of just completely
> > ripping that out?
> >
> > If we're happy to do that then I really can't agree with these arguments
> > that there's things we should try to maintain when it comes to
> > interfaces, and that's far from the only example of a pretty massive
> > change that just went into version X with zero notice to users about
> > what they're using today being deprecated.
>
> It's not the same thing.  First of all, the recovery.conf changes have
> been under discussion for quite a few years now, whereas this change
> has been under discussion for only a few months.

This argument doesn't work for me because it doesn't make *any*
difference to our users- the vast majority of whom don't follow
-hackers.  Our users will simply see that in v12 the system they've
built around recovery.conf won't work and they'll have to learn the new
way and adjust everything to deal with it.

If that argument did matter, we could go back and find the prior
discussions about the issues around the exclusive backup mode and about
removing it, or next year we could point to this thread about it, or the
year after, and say "well, we talked about it a lot, so now let's just
do it", but it all ends up looking the same to our users unless we
actually write into some kind of user-facing documentation or WARNINGs
or similar that something is going away.

> It is also worth
> noting that those recovery.conf changes would've been made years ago
> if people hadn't objected to breaking backward compatibility; people
> have just as much of a right to object to these changes on those
> grounds as they did to those changes.  Second, the recovery.conf
> changes are intended to provide a tangible benefit - namely, getting
> the recovery.conf settings into the GUC system, so that they can be
> queried and changed using methods available for other GUCs.  But the
> change proposed here breaks things for people who are using the
> mechanism in question and it doesn't give them anything else instead.

This isn't a fair argument either because we're having this *after* the
new API was implemented for backup- namely non-exclusive mode, which
*did* give people something better to use instead.

If we went back and added back into the v12 branch support for a
recovery.conf file, then we'd be having exactly this same argument next
year about if we should remove the recovery.conf file support or leave
it in- and you could make the same argument then that removing the
recovery.conf support doesn't give the user anything tangible to replace
it with- because that would have already been done.

> Indeed, it seems quite obvious that you don't have the LEAST interest
> in putting any effort into actually making things in this area better
> for users; it appears that you just want to break what they're doing
> now and shove the work of coping with those changes onto them.

What will make things better for users is actually moving to the new API
that's been available for years now, so that they can avoid the pitfalls
with the old API.

> Christophe is quite right to question whether that will cause users
> not to upgrade.  I think it's overtly user-hostile.  You have referred
> to the effort of maintaining this code, but you haven't pointed to any
> tangible improvement that you'd like to make in this area that is
> actually being blocked by the existence of exclusive backup mode.

I disagree quite a bit with this statement- the existing documentation
is absolutely horrid and needs to be completely ripped out and rewritten
and maintaining the exclusive backup mode in such a rewrite would
absolutely be a *lot* of additional work.

We actually took a shot at trying to improve the documentation while
continuing to cover both the exclusive and the non-exclusive mode and
it's hugely painful.

> It's all just griping about how terrible the whole thing is, and it
> seems that there are actually quite a few people here who find it less
> terrible than you think it is.

I used to be one of those people.  I know that it looks fine and it
certainly seems appealing but having gone through bad experiences with
it, and seen others stumble through those same experiences time and time
again, I've learned that it really is an issue, and one that I would
very much like to avoid causing future users to stumble over.

Thanks!

Stephen

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
Robert Haas
Date:
On Mon, Feb 25, 2019 at 11:09 AM Stephen Frost <sfrost@snowman.net> wrote:
> I do think we can ask that people who wish to have a capability included
> in PG (or continue to be included when there are serious known issues
> with it...) be prepared to either build and maintain it themselves or to
> convince someone else to do so (or both, and have a committer agree to
> it).  That's how we've long operated and it wasn't my intent to imply
> otherwise, but I agree that I could have said it in a nicer way to avoid
> it coming across as telling Christophe what to do.

I think it is certainly true that if you want a new capability added,
you have to either write it yourself or get someone to do it for you.
There is SOME precedent for the proposition that if you want an
obsolete capability not to be removed, you should be prepared to help
fix it.  But frankly I think the latter proposition is one we've taken
only occasionally, and only for things that were a whole lot cruftier
and more problematic than this is.  For example, if a library is no
longer available and we have a contrib module that depends on that
library, it's reasonable to say that we can't keep the contrib module
unless somebody rewrites it not to depend on that library any more.
But the current situation is completely different.  The code
maintenance burden resulting from keeping exclusive-mode backups is
mostly hypothetical; the code works as well as it ever did.  We rarely
take the position that we're going to rip out older code that doesn't
work as nicely as newer code does just because nobody's prepared to
e.g. better-document the old code.

For example, we still have FEBE protocol 2 support floating around.
If you're going to talk about badly-designed footguns that ought to be
excised with extreme prejudice, that would IMHO be a far better place
to start than this.  Unlike this code, which it's now obvious is used
by quite a number of people even just among those who read this list
regularly, that code is probably nearly unused and untested and, if we
broke it, we probably wouldn't know.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Remove Deprecated Exclusive Backup Mode

From
Stephen Frost
Date:
Greetings,

* Robert Haas (robertmhaas@gmail.com) wrote:
> On Fri, Feb 22, 2019 at 6:18 PM Stephen Frost <sfrost@snowman.net> wrote:
> > What we need is a backup tool included in core that users feel
> > comfortable using instead of trying to write their own.
>
> I agree.  That's a great idea.  Let's talk about how to make that
> happen.  Providing a tool that gives people MORE AND BETTER options
> than what they have today is likely to make users more happy.

I've been working for years on making this happen, and am currently
putting quite a large amount of resources into what I hope will be the
final step in the process to get a solution that will have a great many
more and better options than what's available today.  Maybe it'll end up
happening, which I think would be great, maybe it won't, which would be
sad, but at least we'll have tried.

> Removing an option that people are currently using, and which they
> find better than other available options for reasons with which I
> understand that you disagree, will make users more sad.  Happy is
> better.

I don't want to see more users stumbling over the issues with the
exclusive backup interface.  A better interface exists, and has existed
since 9.6.  The exclusive backup method was deprecated in 2016.  One of
the really bad things about the exclusive backup method is that it
*looks* like it works well and that there aren't any issues with it, but
when users hit the issues, they get very sad.  We might have 1000s of
happy users who never run into those issues and therefore don't want to
see us change anything, but what about the 10s or 100s of users who do
hit those issues, what do we tell them?  Seems like we're saying "well,
sorry, we knew those issues existed and it's unfortunate you hit them,
but there's this better thing that you *should* have been using and then
you wouldn't have hit those issues."  That doesn't seem likely to make
people happy with us.

Thanks!

Stephen

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
Robert Haas
Date:
On Mon, Feb 25, 2019 at 11:23 AM Stephen Frost <sfrost@snowman.net> wrote:
> If that argument did matter, we could go back and find the prior
> discussions about the issues around the exclusive backup mode and about
> removing it, or next year we could point to this thread about it, or the
> year after, and say "well, we talked about it a lot, so now let's just
> do it", but it all ends up looking the same to our users unless we
> actually write into some kind of user-facing documentation or WARNINGs
> or similar that something is going away.

That's true.  But nobody's objecting to strengthening the warnings in
the documentation.  People are objecting to removing the thing itself.

> This isn't a fair argument either because we're having this *after* the
> new API was implemented for backup- namely non-exclusive mode, which
> *did* give people something better to use instead.

There are several people who are saying that doesn't meet all their
needs, giving reasons why it's problematic, and suggesting things that
could be done to make it less problematic.  It's not OK to say "we can
remove exclusive backup mode because now we have non-exclusive backup
mode" unless other people actually agree that all the use cases are
covered, and it appears that they don't.

> I disagree quite a bit with this statement- the existing documentation
> is absolutely horrid and needs to be completely ripped out and rewritten
> and maintaining the exclusive backup mode in such a rewrite would
> absolutely be a *lot* of additional work.
>
> We actually took a shot at trying to improve the documentation while
> continuing to cover both the exclusive and the non-exclusive mode and
> it's hugely painful.

Well, perhaps that's pain you need to incur.  The alternative seems to
be to rip out something that people don't want ripped out.

> I used to be one of those people.  I know that it looks fine and it
> certainly seems appealing but having gone through bad experiences with
> it, and seen others stumble through those same experiences time and time
> again, I've learned that it really is an issue, and one that I would
> very much like to avoid causing future users to stumble over.

Sure, that sounds great.  But the way to do that is to continue
improving the system until exclusive-mode backups really are not a
useful thing any more, not to remove it while there are still a lot of
people relying on it who can offer tangible explanations for their
choice to do so.

It feels to me like you are portraying the increasing number of people
objecting to this change as naive or foolish or at least not as
enlightened as you are, and I really object to that.  I happen to
think that people like Christophe Pettus and Fujii Masao and Laurenz
Albe are smart people whose opinions ought to be taken as just as
valid as your own.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Remove Deprecated Exclusive Backup Mode

From
Robert Haas
Date:
On Mon, Feb 25, 2019 at 11:33 AM Stephen Frost <sfrost@snowman.net> wrote:
> > Removing an option that people are currently using, and which they
> > find better than other available options for reasons with which I
> > understand that you disagree, will make users more sad.  Happy is
> > better.
>
> I don't want to see more users stumbling over the issues with the
> exclusive backup interface.  A better interface exists, and has existed
> since 9.6.  The exclusive backup method was deprecated in 2016.  One of
> the really bad things about the exclusive backup method is that it
> *looks* like it works well and that there aren't any issues with it, but
> when users hit the issues, they get very sad.  We might have 1000s of
> happy users who never run into those issues and therefore don't want to
> see us change anything, but what about the 10s or 100s of users who do
> hit those issues, what do we tell them?  Seems like we're saying "well,
> sorry, we knew those issues existed and it's unfortunate you hit them,
> but there's this better thing that you *should* have been using and then
> you wouldn't have hit those issues."  That doesn't seem likely to make
> people happy with us.

Sure, nobody wants users to keep hitting the same problems over and
over again, but the above is still reductionist.  If we take the view
that anything that can cause data corruption in any scenario, no
matter how remote, is more than everything else, then we should just
stop shipping minor releases and halt all other development work until
we deal with https://commitfest.postgresql.org/22/528/ that has been
open for 14 CommitFests and until we've fully dealt with every aspect
of fsync-gate.  I don't see you treating either of those issues with
the same sense of urgency with which you're treating this one, so
evidently you feel entitled to decide which
potentially-data-corrupting issues you want to attack first.

And I agree that you should have exactly that right.  Along similar
lines, I'd like our users to have the right to decide when they want
to upgrade from using exclusive backup mode to non-exclusive backup
mode, instead of being forced into making a change at a time decided
by our fiat.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Remove Deprecated Exclusive Backup Mode

From
Stephen Frost
Date:
Greetings,

* Robert Haas (robertmhaas@gmail.com) wrote:
> On Mon, Feb 25, 2019 at 11:23 AM Stephen Frost <sfrost@snowman.net> wrote:
> > If that argument did matter, we could go back and find the prior
> > discussions about the issues around the exclusive backup mode and about
> > removing it, or next year we could point to this thread about it, or the
> > year after, and say "well, we talked about it a lot, so now let's just
> > do it", but it all ends up looking the same to our users unless we
> > actually write into some kind of user-facing documentation or WARNINGs
> > or similar that something is going away.
>
> That's true.  But nobody's objecting to strengthening the warnings in
> the documentation.  People are objecting to removing the thing itself.

Then how long will we carry it forward?  How much warning do we need to
provide?

> > This isn't a fair argument either because we're having this *after* the
> > new API was implemented for backup- namely non-exclusive mode, which
> > *did* give people something better to use instead.
>
> There are several people who are saying that doesn't meet all their
> needs, giving reasons why it's problematic, and suggesting things that
> could be done to make it less problematic.  It's not OK to say "we can
> remove exclusive backup mode because now we have non-exclusive backup
> mode" unless other people actually agree that all the use cases are
> covered, and it appears that they don't.

I haven't heard anyone say it doesn't meet their needs, just that it's
not as easy to use, which is a fair critcism, but not the same thing.
I'm all for making the non-exclusive mode less problematic if we're able
to.  If that's something that would help us move forward with getting
rid of the exclusive backup mode than that's an area that I'd be willing
to put resources.

> > I disagree quite a bit with this statement- the existing documentation
> > is absolutely horrid and needs to be completely ripped out and rewritten
> > and maintaining the exclusive backup mode in such a rewrite would
> > absolutely be a *lot* of additional work.
> >
> > We actually took a shot at trying to improve the documentation while
> > continuing to cover both the exclusive and the non-exclusive mode and
> > it's hugely painful.
>
> Well, perhaps that's pain you need to incur.  The alternative seems to
> be to rip out something that people don't want ripped out.

... now I feel like I'm being told what to do.

The point is that keeping it actually *is* work, it isn't free, not even
in our tree.  Making our users stumble over the issues with it also
isn't free, that's another cost of keeping it.

> > I used to be one of those people.  I know that it looks fine and it
> > certainly seems appealing but having gone through bad experiences with
> > it, and seen others stumble through those same experiences time and time
> > again, I've learned that it really is an issue, and one that I would
> > very much like to avoid causing future users to stumble over.
>
> Sure, that sounds great.  But the way to do that is to continue
> improving the system until exclusive-mode backups really are not a
> useful thing any more, not to remove it while there are still a lot of
> people relying on it who can offer tangible explanations for their
> choice to do so.
>
> It feels to me like you are portraying the increasing number of people
> objecting to this change as naive or foolish or at least not as
> enlightened as you are, and I really object to that.  I happen to
> think that people like Christophe Pettus and Fujii Masao and Laurenz
> Albe are smart people whose opinions ought to be taken as just as
> valid as your own.

I honestly do doubt that they have had the same experiences that I have
had, but that doesn't mean that they're not smart or that I don't value
their opinion- I agree that they're all smart people and I certainly do
value their opinions.  That doesn't mean that I can't disagree with them
or can't explain why I disagree.

Thanks!

Stephen

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
Stephen Frost
Date:
Greetings,

* Robert Haas (robertmhaas@gmail.com) wrote:
> On Mon, Feb 25, 2019 at 11:33 AM Stephen Frost <sfrost@snowman.net> wrote:
> > > Removing an option that people are currently using, and which they
> > > find better than other available options for reasons with which I
> > > understand that you disagree, will make users more sad.  Happy is
> > > better.
> >
> > I don't want to see more users stumbling over the issues with the
> > exclusive backup interface.  A better interface exists, and has existed
> > since 9.6.  The exclusive backup method was deprecated in 2016.  One of
> > the really bad things about the exclusive backup method is that it
> > *looks* like it works well and that there aren't any issues with it, but
> > when users hit the issues, they get very sad.  We might have 1000s of
> > happy users who never run into those issues and therefore don't want to
> > see us change anything, but what about the 10s or 100s of users who do
> > hit those issues, what do we tell them?  Seems like we're saying "well,
> > sorry, we knew those issues existed and it's unfortunate you hit them,
> > but there's this better thing that you *should* have been using and then
> > you wouldn't have hit those issues."  That doesn't seem likely to make
> > people happy with us.
>
> Sure, nobody wants users to keep hitting the same problems over and
> over again, but the above is still reductionist.  If we take the view
> that anything that can cause data corruption in any scenario, no
> matter how remote, is more than everything else, then we should just
> stop shipping minor releases and halt all other development work until
> we deal with https://commitfest.postgresql.org/22/528/ that has been
> open for 14 CommitFests and until we've fully dealt with every aspect
> of fsync-gate.  I don't see you treating either of those issues with
> the same sense of urgency with which you're treating this one, so
> evidently you feel entitled to decide which
> potentially-data-corrupting issues you want to attack first.

I'm not advocating for removing it today or tomorrow, or in some point
release.  We're already to the stage where we're talking about
something that wouldn't hit until late 2020.  Also, frankly, while there
might be something I could do to help, the whole madness with fsync gate
seems to be getting dealt with by some very smart people who know a
whole lot more about the internals of Linux and FreeBSD and I'd rather
not get in their way.  On the other hand, dealing with backups and
issues around backups has been something that I at least hope I've
gotten decent at.

> And I agree that you should have exactly that right.  Along similar
> lines, I'd like our users to have the right to decide when they want
> to upgrade from using exclusive backup mode to non-exclusive backup
> mode, instead of being forced into making a change at a time decided
> by our fiat.

Our users *have* that option, and have for over two years now, and
it'll have been 4 years by the time v13 comes out.  I agree that the
non-exclusive backup mode isn't as simple as the exclusive backup mode-
to use your example, what we *used* to be doing with fsync() was sure a
lot simpler than what we're going to be doing in the future, but
sometimes people have to make changes, even to more complicated APIs,
to make sure that they're doing things the right way to ensure that
their data doesn't get eaten.

Thanks!

Stephen

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
Stephen Frost
Date:
Greetings,

* Robert Haas (robertmhaas@gmail.com) wrote:
> On Mon, Feb 25, 2019 at 11:09 AM Stephen Frost <sfrost@snowman.net> wrote:
> > I do think we can ask that people who wish to have a capability included
> > in PG (or continue to be included when there are serious known issues
> > with it...) be prepared to either build and maintain it themselves or to
> > convince someone else to do so (or both, and have a committer agree to
> > it).  That's how we've long operated and it wasn't my intent to imply
> > otherwise, but I agree that I could have said it in a nicer way to avoid
> > it coming across as telling Christophe what to do.
>
> I think it is certainly true that if you want a new capability added,
> you have to either write it yourself or get someone to do it for you.
> There is SOME precedent for the proposition that if you want an
> obsolete capability not to be removed, you should be prepared to help
> fix it.  But frankly I think the latter proposition is one we've taken
> only occasionally, and only for things that were a whole lot cruftier
> and more problematic than this is.  For example, if a library is no
> longer available and we have a contrib module that depends on that
> library, it's reasonable to say that we can't keep the contrib module
> unless somebody rewrites it not to depend on that library any more.
> But the current situation is completely different.  The code
> maintenance burden resulting from keeping exclusive-mode backups is
> mostly hypothetical; the code works as well as it ever did.  We rarely
> take the position that we're going to rip out older code that doesn't
> work as nicely as newer code does just because nobody's prepared to
> e.g. better-document the old code.

I'd argue that if something which has been deprecated for years is
seriously getting in the way of making progress that we should be
willing to accept a proposal to rip it out after giving people
sufficient time to adjust, and that if someone really wants to keep that
deprecated API then they need to be willing to spend the time to address
the reasons why it was deprecated.

> For example, we still have FEBE protocol 2 support floating around.
> If you're going to talk about badly-designed footguns that ought to be
> excised with extreme prejudice, that would IMHO be a far better place
> to start than this.  Unlike this code, which it's now obvious is used
> by quite a number of people even just among those who read this list
> regularly, that code is probably nearly unused and untested and, if we
> broke it, we probably wouldn't know.

You can probably guess my feelings with regard to the FEBE v2 support.

At least, in that case, people aren't really using it though.  What I
really dislike is that we've got a *lot* of people using something that
we *know* has some pretty serious potential data-eating problems.
Saying that we should keep it around because a lot of people are using
it isn't the way to persuade me that we should keep it.

Thanks!

Stephen

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
Christophe Pettus
Date:

> On Feb 25, 2019, at 08:55, Stephen Frost <sfrost@snowman.net> wrote:
>
> I honestly do doubt that they have had the same experiences that I have
> had

Well, I guarantee you that no two people on this list have had identical experiences. :)  I certainly have been bitten
bythe problems with the current system.  But the resistance to major version upgrades is *huge*, and I'm strongly
biasedagainst anything that will make that harder.  I'm not sure I'm communicating how big a problem telling many large
installations,"If you move to v12/13/etc., you will have to change your backup system" is going to be. 

--
-- Christophe Pettus
   xof@thebuild.com



Re: Remove Deprecated Exclusive Backup Mode

From
Fujii Masao
Date:
On Mon, Feb 25, 2019 at 10:49 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
>
> Stephen Frost wrote:
> > > It will be annoying if after this removal, companies must change their
> > > backup strategy by using specific postgres tools (pgbackrest, barman).
> >
> > You don't write your own database system using CSV files and shell
> > magic, do you?  I have to say that it continues to boggle my mind that
> > people insist that *this* part of the system has to be able to be
> > implementable using shell scripts.
> >
> > Folks, these are your backups we're talking about, your last resort if
> > everything else goes up in flames, why do you want to risk that by
> > implementing your own one-off solution, particularly when there's known
> > serious issues using that interface, and you want to just use shell
> > scripts to do it...?
>
> If we didn't think that simplicity in backup has some value, why does
> PostgreSQL provide pg_basebackup?
>
> Not everybody wants to use tools like pgbackrest and barman because it
> takes some effort to set them up properly.  It seems that you think that
> people who want something simpler are irresponsible.
>
> Incidentally, both barman and pgbackrest stream the backup to a backup server.
>
> I think the most important use case for the low level backup API is when
> your database is so large that you want to leverage storage techniques
> like snapshots for backup, because you can't or don't want to stream all
> those terabytes across the network.
>
> I'm not playing devil's advocate here to annoy you.  I see the problems
> with the exclusive backup, and I see how it can hurt people.
> I just think that removing exclusive backup without some kind of help
> like Andres sketched above will make people unhappy.

+1

Another idea is to improve an exclusive backup method so that it will never
cause such issue. What about changing an exclusive backup mode of
pg_start_backup() so that it creates something like backup_label.pending file
instead of backup_label? Then if the database cluster has backup_label.pending
file but not recovery.signal (this is the case where the database is recovered
just after the server crashes while an exclusive backup is in progress),
in this idea, the recovery using that database cluster always ignores
(or removes) backup_label.pending file and start replaying WAL from
the REDO location that pg_control file indicates. So this idea enables us to
work around the issue that an exclusive backup could cause.

On the other hand, the downside of this idea is that the users need to change
the recovery procedure. When they want to do PITR using the backup having
backup_label.pending, they need to not only create recovery.signal but also
rename backup_label.pending to backup_label. Rename of backup_label file
is brand-new step for their recovery procedure, and changing the recovery
procedure might be painful for some users. But IMO it's less painful than
removing an exclusive backup API at all.

Thought?

BTW, if recovery.signal is created but backup_label.pending is not renamed
(this is the case where the operator forgets to rename the file even though
she or he create recovery signal file, i.e., mis-configuration), I think that
the recovery should emit PANIC immediately with the HINT like
"HINT: rename backup_label.pening to backup_label if you want to do PITR".

Regards,

-- 
Fujii Masao


Re: Remove Deprecated Exclusive Backup Mode

From
David Steele
Date:
On 2/25/19 7:50 PM, Fujii Masao wrote:
> On Mon, Feb 25, 2019 at 10:49 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
>>
>> I'm not playing devil's advocate here to annoy you.  I see the problems
>> with the exclusive backup, and I see how it can hurt people.
>> I just think that removing exclusive backup without some kind of help
>> like Andres sketched above will make people unhappy.
> 
> +1
> 
> Another idea is to improve an exclusive backup method so that it will never
> cause such issue. What about changing an exclusive backup mode of
> pg_start_backup() so that it creates something like backup_label.pending file
> instead of backup_label? Then if the database cluster has backup_label.pending
> file but not recovery.signal (this is the case where the database is recovered
> just after the server crashes while an exclusive backup is in progress),
> in this idea, the recovery using that database cluster always ignores
> (or removes) backup_label.pending file and start replaying WAL from
> the REDO location that pg_control file indicates. So this idea enables us to
> work around the issue that an exclusive backup could cause.

It's an interesting idea.

> On the other hand, the downside of this idea is that the users need to change
> the recovery procedure. When they want to do PITR using the backup having
> backup_label.pending, they need to not only create recovery.signal but also
> rename backup_label.pending to backup_label. Rename of backup_label file
> is brand-new step for their recovery procedure, and changing the recovery
> procedure might be painful for some users. But IMO it's less painful than
> removing an exclusive backup API at all.

Well, given that we have invalidated all prior recovery procedures in 
PG12 I'm not sure how big a deal that is.  Of course, it's too late make 
a change like this for PG12.

> Thought?

Here's the really obvious bad thing: if users do not update their 
procedures and we ignore backup_label.pending on startup then they will 
end up with a corrupt database because it will not replay from the 
correct checkpoint.  If we error on the presence of backup_label.pending 
then we are right back to where we started.

I know there are backup solutions that rely on copying all required WAL 
to pg_xlog/pg_wal before starting recovery.  Those solutions would 
silently break in this case and end up in corruption.  If we require 
recovery.signal then we still have the current problem of the cluster 
not starting after a crash.

> BTW, if recovery.signal is created but backup_label.pending is not renamed
> (this is the case where the operator forgets to rename the file even though
> she or he create recovery signal file, i.e., mis-configuration), I think that
> the recovery should emit PANIC immediately with the HINT like
> "HINT: rename backup_label.pening to backup_label if you want to do PITR".

This causes its own problems, as stated above.

Regards,
-- 
-David
david@pgmasters.net


Re: Remove Deprecated Exclusive Backup Mode

From
Bruce Momjian
Date:
On Mon, Feb 25, 2019 at 11:33:33AM -0500, Stephen Frost wrote:
> I don't want to see more users stumbling over the issues with the
> exclusive backup interface.  A better interface exists, and has existed
> since 9.6.

Do you really think we would be having this discussion if the
non-exclusive backup method was unequivocally better?  It is better for
some use-cases, and impossible for others.

Also, you can't say it will have no impact for five years on people who
do not upgrade.  The impact will be that they will have no new Postgres
features for five years.

I am not taking a stance on this issue, but making unclear statements
isn't helping the discussion.

-- 
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +


Re: Remove Deprecated Exclusive Backup Mode

From
David Steele
Date:
Hi Christophe,

On 2/25/19 7:24 PM, Christophe Pettus wrote:
> 
> 
>> On Feb 25, 2019, at 08:55, Stephen Frost <sfrost@snowman.net> wrote:
>>
>> I honestly do doubt that they have had the same experiences that I have
>> had
> 
> Well, I guarantee you that no two people on this list have had identical experiences. :)  I certainly have been
bittenby the problems with the current system.  But the resistance to major version upgrades is *huge*, and I'm
stronglybiased against anything that will make that harder.  I'm not sure I'm communicating how big a problem telling
manylarge installations, "If you move to v12/13/etc., you will have to change your backup system" is going to be.
 

I honestly think you are underestimating how bad this can be.

The prevailing wisdom is that it's unfortunate that these backup_labels 
get left around but they can be removed with scripting, so no big deal. 
After that the cluster will start.

But -- if you are too aggressive about removing the backup_label and 
accidentally do it before a real restore from backup, then you have a 
corrupt cluster.  Totally silent, but definitely corrupt.  You'll 
probably only see it when you start getting consistency errors from the 
indexes, if ever.  Page checksums won't catch it either unless you are 
*lucky* enough to have a torn page.

Erroneous scripting of this kind can also affect backups that were made 
with the non-exclusive method since the backups look the same.

fsync() is the major corruption issue we are facing right now but that 
doesn't mean there aren't other sources of corruption we should be 
thinking about.  I've thought about this one a lot and it scares me.

I've worked on ways to make it better, but all of them break something 
and involve compromises that are nearly as severe as removing exclusive 
backups entirely.

Regards,
-- 
-David
david@pgmasters.net


Re: Remove Deprecated Exclusive Backup Mode

From
Stephen Frost
Date:
Greetings,

* Christophe Pettus (xof@thebuild.com) wrote:
> > On Feb 25, 2019, at 08:55, Stephen Frost <sfrost@snowman.net> wrote:
> >
> > I honestly do doubt that they have had the same experiences that I have
> > had
>
> Well, I guarantee you that no two people on this list have had identical experiences. :)

Indeed! :)

> I certainly have been bitten by the problems with the current system.  But the resistance to major version upgrades
is*huge*, and I'm strongly biased against anything that will make that harder.  I'm not sure I'm communicating how big
aproblem telling many large installations, "If you move to v12/13/etc., you will have to change your backup system" is
goingto be. 

Aren't they going to need to make a change for v12 now anyway?

Hopefully they're regularly testing their backups by doing a restore of
them, and dropping a recovery.conf into the directory of a v12 system
after restore will do exactly nothing and they'll get errors complaining
about how they need to provide a restore_command.

That's actually what prompted bringing this painful topic up again-
there's a large change being done to how backup/restore with PG is done
(the recovery.conf file is going away), and people who have written any
kind of automation (or even just documented procedures...) around that
will have to update their systems.  At least from my perspective, making
them have to do such an update once, instead of once now and another
time in the future when we remove exclusive backup (or figure out a way
to do it that's safe and update the instructions for how to do it
right...), is the better option.

* Bruce Momjian (bruce@momjian.us) wrote:
> On Mon, Feb 25, 2019 at 11:33:33AM -0500, Stephen Frost wrote:
> > I don't want to see more users stumbling over the issues with the
> > exclusive backup interface.  A better interface exists, and has existed
> > since 9.6.
>
> Do you really think we would be having this discussion if the
> non-exclusive backup method was unequivocally better?  It is better for
> some use-cases, and impossible for others.

Based on Christophe's comment above, anything which required users to
make a change on upgrade to their backup system would be cause to have
this discussion, which likely includes most possible solutions to the
issues with exclusive backup too, unfortunately..

> Also, you can't say it will have no impact for five years on people who
> do not upgrade.  The impact will be that they will have no new Postgres
> features for five years.

I don't think I made the claim that there wouldn't be any impact for
five years, I said they would continue to have support for five years.

Also, this is exactly what we tell them for any other breaking change
(such as removal of recovery.conf).

> I am not taking a stance on this issue, but making unclear statements
> isn't helping the discussion.

It's not my intent to make unclear statements, so I certainly appreicate
you, and anyone else, pointing out when I do.  I'm happy to clarify.

Thanks!

Stephen

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
Stephen Frost
Date:
Greetings,

* David Steele (david@pgmasters.net) wrote:
> On 2/25/19 7:50 PM, Fujii Masao wrote:
> >Thought?
>
> Here's the really obvious bad thing: if users do not update their procedures
> and we ignore backup_label.pending on startup then they will end up with a
> corrupt database because it will not replay from the correct checkpoint.  If
> we error on the presence of backup_label.pending then we are right back to
> where we started.

Right, we definitely can't make a change which would cause non-updated
systems using the same APIs to suddenly end up with corruption.  If we
require a change be made to the scripts that run the process then we
must make it something the user is essentially opt'ing into- where
they've told us explicitly that they're ready to use the new interface,
which is how the non-exclusive backup mode was added.

Thanks!

Stephen

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
Bruce Momjian
Date:
On Mon, Feb 25, 2019 at 02:24:18PM -0500, Stephen Frost wrote:
> * Bruce Momjian (bruce@momjian.us) wrote:
> > On Mon, Feb 25, 2019 at 11:33:33AM -0500, Stephen Frost wrote:
> > > I don't want to see more users stumbling over the issues with the
> > > exclusive backup interface.  A better interface exists, and has existed
> > > since 9.6.
> > 
> > Do you really think we would be having this discussion if the
> > non-exclusive backup method was unequivocally better?  It is better for
> > some use-cases, and impossible for others.
> 
> Based on Christophe's comment above, anything which required users to
> make a change on upgrade to their backup system would be cause to have
> this discussion, which likely includes most possible solutions to the
> issues with exclusive backup too, unfortunately..

I am not addressing what Christophe said, but rather what you said.  We
clearly are fine in requiring people to update their software for major
releases.  I think if there was a way for old backup methods to work at
all, this would be a lot simpler converation.

> > Also, you can't say it will have no impact for five years on people who
> > do not upgrade.  The impact will be that they will have no new Postgres
> > features for five years.
> 
> I don't think I made the claim that there wouldn't be any impact for
> five years, I said they would continue to have support for five years.
> 
> Also, this is exactly what we tell them for any other breaking change
> (such as removal of recovery.conf).
> 
> > I am not taking a stance on this issue, but making unclear statements
> > isn't helping the discussion.
> 
> It's not my intent to make unclear statements, so I certainly appreicate
> you, and anyone else, pointing out when I do.  I'm happy to clarify.

Good.

-- 
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +


Re: Remove Deprecated Exclusive Backup Mode

From
Andres Freund
Date:
Hi,

On 2019-02-25 08:14:16 -0500, Stephen Frost wrote:
> > It will be annoying if after this removal, companies must change their
> > backup strategy by using specific postgres tools (pgbackrest, barman).
> 
> You don't write your own database system using CSV files and shell
> magic, do you?  I have to say that it continues to boggle my mind that
> people insist that *this* part of the system has to be able to be
> implementable using shell scripts.
> 
> Folks, these are your backups we're talking about, your last resort if
> everything else goes up in flames, why do you want to risk that by
> implementing your own one-off solution, particularly when there's known
> serious issues using that interface, and you want to just use shell
> scripts to do it...?

FWIW, if you weren't selling backrest quite so hard everywhere backups
are mentioned, I'd find this thread a lot more convicing.

Greetings,

Andres Freund


Re: Remove Deprecated Exclusive Backup Mode

From
Andres Freund
Date:
On 2019-02-25 08:42:02 -0500, Stephen Frost wrote:
> Greetings,
> 
> * Andres Freund (andres@anarazel.de) wrote:
> > On 2019-02-24 17:19:22 -0500, Stephen Frost wrote:
> > > You say above that the new interface is unquestionably an improvement
> > 
> > FWIW, I think we didn't design it even remotely as well as we ought to
> > have. It was both a mistake to not offer a version of non-exclusive
> > backups that works with a client connection (for integration with the
> > TSMs of this world), and it was a mistake not to offer a commandline
> > tool that does the non-exclusive pg/start backup.
> 
> I'm not really following- did you mean to say "without a client
> connection" above?

Yes.


> We could still add some kind of commandline tool to help with the
> non-exclusive pg start/stop backup

That doesn't help, as long as the connection requirement is there. As
explained elsewhere in this thread, a lot of larger external backup
infrastructure just gives you a start/stop hook, and that makes using a
persistent connection hard and fragile.


> but I'm not really sure exactly what that would look like and I
> honestly don't have terribly much interest in spending resources on
> implementing it since it would lack a great many features that we'd
> then end up wanting to add on... and then we end up backing into
> building yet another backup tool.

Meh. You are proposing to remove a feature (even if a dangerous
one). Requiring some minimal compat work to make htat more palpaple
isn't crazy.  Nor does using backrest or whatnot do *ANYTHING* about the
users of large company wide backup tools.


> > > I really, honestly, believe what we need to *stop* doing is trying to
> > > drag along support for old/deprecated interfaces after we've introduced
> > > new ones, thus avoiding these arguments and the time spent on them, and
> > > the time spent dealing with implementing and documenting new APIs around
> > > old ones.  The only thing it seems to unquestionably do is make us argue
> > > about when we are going to *finally* rip out the old/deprecated API (or
> > > interface, or whatever you want to call it).
> > 
> > While I agree that we should remove non-exclusive backups, I VEHEMENTLY
> > oppose the unconditionality of the above statement. Yes, it's annoying
> > to keep interfaces around, but unnecessarily inflicting pain to everyone
> > also is annoying.
> 
> Alright, then how about we provide a bit of help for everyone who's got
> a system built around recovery.conf today, instead of just completely
> ripping that out?

Oh, comeon. Obviously we're sometimes going to have to make breaking
changes. But you're essentially arguing that we shouldn't provide compat
where we can come up with a reasonable way to provide backward
compatibility. And I think that's crazy and will hurt postgres.


> > That's not to say we shouldn't be better at announcing and then
> > following through on the deprecation of old interfaces.
> 
> We announce things have changed every year, and people have five years
> from that point, during which we'll continue to support them and fix
> bugs and deal with security issues, to make whatever changes they need
> to for the new interface.
> 
> The argument seems to be that people want new features but they don't
> want to have to do any work to get those new features.

I mean, duh? We can't make that happen all the time, but I don't
understand why it's surprising to have that as *one* goal that's in
conflict with other goals?  Your analysis also neglects that a lot of
software will have to work across multiple supported versions of
postgres (at the very least in prep for the migration, but also often
longer) - by having a few versions were both interfaces work, that can
be made *MUCH* less painful.


> When is something too big of a change to make in a given major version
> and we have to, instead, provide backwards compatibility for it?  What
> is that policy?  How many releases do we have to support it for?  How do
> we communicate that to our users, so that they know what they can depend
> on being there across major releases and what they can't?

I literally said that we should have more policy around this.


I'm going to stop here, discussing with you in this thread seems
unproductive.


Greetings,

Andres Freund


Re: Remove Deprecated Exclusive Backup Mode

From
Andres Freund
Date:
Hi,

On 2019-02-25 20:43:01 +0200, David Steele wrote:
> fsync() is the major corruption issue we are facing right now but that
> doesn't mean there aren't other sources of corruption we should be thinking
> about.  I've thought about this one a lot and it scares me.

FWIW, I think this kind of issue practially bites a *LOT* more people
than the fsync() issue.


> I've worked on ways to make it better, but all of them break something and
> involve compromises that are nearly as severe as removing exclusive backups
> entirely.

There've been plenty proposals to make non-exclusive backups less
painful in this thread, so ...?

Greetings,

Andres Freund


Re: Remove Deprecated Exclusive Backup Mode

From
Christophe Pettus
Date:

> On Feb 25, 2019, at 11:24, Stephen Frost <sfrost@snowman.net> wrote:
> Aren't they going to need to make a change for v12 now anyway?
>
> Hopefully they're regularly testing their backups by doing a restore of
> them, and dropping a recovery.conf into the directory of a v12 system
> after restore will do exactly nothing and they'll get errors complaining
> about how they need to provide a restore_command.

It varies with the installation, but backups are 100% - epsilon automated, while restores are very frequently done via
runbook. Changing a runbook is institutionally less contentious, and the change is more or less "drop the file into
mumble/conf.drather than mumble", which is less of a break. 

--
-- Christophe Pettus
   xof@thebuild.com



Re: Remove Deprecated Exclusive Backup Mode

From
Andres Freund
Date:
Hi,

On 2019-02-25 09:24:32 -0800, Christophe Pettus wrote:
> But the resistance to major version upgrades is *huge*, and I'm
> strongly biased against anything that will make that harder.  I'm not
> sure I'm communicating how big a problem telling many large
> installations, "If you move to v12/13/etc., you will have to change
> your backup system" is going to be.

I think you might be right about this specific issue. But to me it
sounds like you also don't appreciate that development resources are
really constrained too, and providing endless backward compatibility for
everything is going to use both resources directly, and indirectly by
making the whole system more complex.  I've been on your side of this
fight a couple times (and largely lost), but I think it's important to
appreciate that it's all a balancing, and we all have valid reasons to
keep the balance between development pace / ease of use / ease of
upgrade the way we argue for them. Hard upgrading is going to reduce
adoption, but so is lack of feature development and too many creaky &
redundant & easy to misuse interfaces.

Greetings,

Andres Freund


Re: Remove Deprecated Exclusive Backup Mode

From
David Steele
Date:
Hi Andres,

On 2/25/19 10:57 PM, Andres Freund wrote:
> Hi,
> 
> On 2019-02-25 08:14:16 -0500, Stephen Frost wrote:
>>> It will be annoying if after this removal, companies must change their
>>> backup strategy by using specific postgres tools (pgbackrest, barman).
>>
>> You don't write your own database system using CSV files and shell
>> magic, do you?  I have to say that it continues to boggle my mind that
>> people insist that *this* part of the system has to be able to be
>> implementable using shell scripts.
>>
>> Folks, these are your backups we're talking about, your last resort if
>> everything else goes up in flames, why do you want to risk that by
>> implementing your own one-off solution, particularly when there's known
>> serious issues using that interface, and you want to just use shell
>> scripts to do it...?
> 
> FWIW, if you weren't selling backrest quite so hard everywhere backups
> are mentioned, I'd find this thread a lot more convicing.

pgBackRest has not used exclusive backups since the new API was 
introduced in 9.6 so this is not an issue for our users.

Over time we have contributed back to Postgres in areas we thought could 
be improved based on our work on the pgBackRest project: 6ad8ac60, 
9fe3c644, 017e4f25, 78874531, 449338cc, 98267ee8, 8694cc96, 920a5e50, 
c37b3d08, 5fc1670b, b981df4c.  This does not include the various backup 
related patches that we have reviewed.

If promoting pgBackRest were our primary concern then it would be in our 
interest to allow Postgres exclusive backups to stay broken and 
pg_basebackup to be as primitive as possible.

Our objections to exclusive backups stem from a desire to do the right 
thing, as we see it, and because we have seen first hand all the ways 
that backups can go wrong.

-- 
-David
david@pgmasters.net


Re: Remove Deprecated Exclusive Backup Mode

From
Magnus Hagander
Date:
On Mon, Feb 25, 2019 at 10:14 PM Andres Freund <andres@anarazel.de> wrote:
On 2019-02-25 08:42:02 -0500, Stephen Frost wrote:
> Greetings,
>
> * Andres Freund (andres@anarazel.de) wrote:
> > On 2019-02-24 17:19:22 -0500, Stephen Frost wrote:
> > > You say above that the new interface is unquestionably an improvement
> >
> > FWIW, I think we didn't design it even remotely as well as we ought to
> > have. It was both a mistake to not offer a version of non-exclusive
> > backups that works with a client connection (for integration with the
> > TSMs of this world), and it was a mistake not to offer a commandline
> > tool that does the non-exclusive pg/start backup.
>
> I'm not really following- did you mean to say "without a client
> connection" above?

Yes.


> We could still add some kind of commandline tool to help with the
> non-exclusive pg start/stop backup

That doesn't help, as long as the connection requirement is there. As
explained elsewhere in this thread, a lot of larger external backup
infrastructure just gives you a start/stop hook, and that makes using a
persistent connection hard and fragile.

In my experience they don't just give you that. They also give you a module that calls the postgres APIs to do backups.

Some of them are using the old API, some are using the new one. The only thing that would be required there would be to upgrade the module to one that supports the 9.6 APIs. (And you have to upgrade those modules whenever you upgrade oracle or mssql or whatever else you have, so that's not something those organisations are unused to)

I agree with the need for as simpler way to drive it in the simple case or in the say filesystem snapshot case, but I'm not sure the pre/post case is all that common in this context? Some certainly use it today even if they don't have to, but that can probably be easily changed to such a "simplified cmd tool" that you have suggested elsewhere as well.

--

Re: Remove Deprecated Exclusive Backup Mode

From
David Steele
Date:
On 2/25/19 11:20 PM, Christophe Pettus wrote:
> 
> 
>> On Feb 25, 2019, at 11:24, Stephen Frost <sfrost@snowman.net> wrote:
>> Aren't they going to need to make a change for v12 now anyway?
>>
>> Hopefully they're regularly testing their backups by doing a restore of
>> them, and dropping a recovery.conf into the directory of a v12 system
>> after restore will do exactly nothing and they'll get errors complaining
>> about how they need to provide a restore_command.
> 
> It varies with the installation, but backups are 100% - epsilon automated, while restores are very frequently done
viarunbook.  Changing a runbook is institutionally less contentious, and the change is more or less "drop the file into
mumble/conf.drather than mumble", which is less of a break.
 

This is true but is becoming far less true over time.  Automation of 
recovery is now very common -- as it should be.  Backups should be 
tested regularly and recovery should not be a scary thing that is done 
at 2am from an outdated runbook.

The recovery.conf change will have serious impact when it arrives in PG12.

-- 
-David
david@pgmasters.net


Re: Remove Deprecated Exclusive Backup Mode

From
David Steele
Date:
On 2/25/19 11:38 PM, Andres Freund wrote:
> Hi,
> 
> On 2019-02-25 09:24:32 -0800, Christophe Pettus wrote:
>> But the resistance to major version upgrades is *huge*, and I'm
>> strongly biased against anything that will make that harder.  I'm not
>> sure I'm communicating how big a problem telling many large
>> installations, "If you move to v12/13/etc., you will have to change
>> your backup system" is going to be.
> 
> I think you might be right about this specific issue. But to me it
> sounds like you also don't appreciate that development resources are
> really constrained too, and providing endless backward compatibility for
> everything is going to use both resources directly, and indirectly by
> making the whole system more complex.  I've been on your side of this
> fight a couple times (and largely lost), but I think it's important to
> appreciate that it's all a balancing, and we all have valid reasons to
> keep the balance between development pace / ease of use / ease of
> upgrade the way we argue for them. Hard upgrading is going to reduce
> adoption, but so is lack of feature development and too many creaky &
> redundant & easy to misuse interfaces.

+1.

And those (like me) who are motivated to work in this area are put off 
by the byzantine code and how easy it is to break things -- in large 
part because of the *complete* absence of tests for exclusive mode.

I did a round of doc updates for non/exclusive backups a while back and 
it was excruciating to measure and document the subtle differences in a 
way that a user might possibly understand.  I'm honestly not motivated 
to come back to it until we remove the deprecated interfaces.

-- 
-David
david@pgmasters.net


Re: Remove Deprecated Exclusive Backup Mode

From
Christophe Pettus
Date:

> On Feb 25, 2019, at 13:38, Andres Freund <andres@anarazel.de> wrote:
>
> I think you might be right about this specific issue. But to me it
> sounds like you also don't appreciate that development resources are
> really constrained too, and providing endless backward compatibility for
> everything is going to use both resources directly, and indirectly by
> making the whole system more complex.

One of the things I've tried not to do in this discussion is turn it into a policy debate about backwards compatibility
ingeneral, rather than this very specific feature.  Of course, there's a cost to keeping around features, and I fully
appreciatethat.  In this discussion, the code complexity didn't seem to be the fundamental driver behind removing the
feature;the relative safety of it was, along with maintaining the documentation. 

I jumped in because there seemed to be an argument going on that all of the cool kids will have moved off the old
interfaceand there was essentially no cost to removing it in v12 or v13, and that didn't correspond to my experience. 

--
-- Christophe Pettus
   xof@thebuild.com



Re: Remove Deprecated Exclusive Backup Mode

From
Stephen Frost
Date:
Greetings,

* Andres Freund (andres@anarazel.de) wrote:
> On 2019-02-25 08:14:16 -0500, Stephen Frost wrote:
> > > It will be annoying if after this removal, companies must change their
> > > backup strategy by using specific postgres tools (pgbackrest, barman).
> >
> > You don't write your own database system using CSV files and shell
> > magic, do you?  I have to say that it continues to boggle my mind that
> > people insist that *this* part of the system has to be able to be
> > implementable using shell scripts.
> >
> > Folks, these are your backups we're talking about, your last resort if
> > everything else goes up in flames, why do you want to risk that by
> > implementing your own one-off solution, particularly when there's known
> > serious issues using that interface, and you want to just use shell
> > scripts to do it...?
>
> FWIW, if you weren't selling backrest quite so hard everywhere backups
> are mentioned, I'd find this thread a lot more convicing.

I push pgbackrest, just like I push PG, because I view it as the best
open source tool out there for the job, well architected, designed from
the ground-up to do what a backup tool needs to, and it has a strong
community around it with people contributing back, all openly and under
a permissive license which allows anyone to use it, hack on it, and do
what they want with it- basically, I see it as the PG of backup tools
for PG.  What PG is to relational databases, pgbackrest is to PG backup
tools, imv.

As David mentioned already, this really doesn't change things for
pgbackrest one way or the other- we don't use the old API and haven't
since the non-exclusive API was available.  In addition, pgbackrest
hasn't got a solution for the "we just want to take a snapshot" or "we
just want to use start/stop hooks".  If I was really trying to push
people to pgbackrest because of the flaws in the exclusive backup mode,
I'd surely develop answers to those issues first and make them part of
pgbackrest in short order, and then start yelling from the tops of
buildings about how broken the exclusive backup API is and about how
I've got an easy solution for everyone to move to, maybe do a blog post
on planet.  I'm not doing that though, instead I'm discussing the
issues here, with a tiny fraction of our community, and pushing for us
to agree to get rid of a dangerous API in favor of an API that anyone
can use w/o pgbackrest, if they want to.

Thanks!

Stephen

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
David Steele
Date:
On 2/25/19 11:59 PM, Christophe Pettus wrote:
> 
> 
>> On Feb 25, 2019, at 13:38, Andres Freund <andres@anarazel.de> wrote:
>>
>> I think you might be right about this specific issue. But to me it
>> sounds like you also don't appreciate that development resources are
>> really constrained too, and providing endless backward compatibility for
>> everything is going to use both resources directly, and indirectly by
>> making the whole system more complex.
> 
> One of the things I've tried not to do in this discussion is turn it into a policy debate about backwards
compatibilityin general, rather than this very specific feature.  Of course, there's a cost to keeping around features,
andI fully appreciate that.  In this discussion, the code complexity didn't seem to be the fundamental driver behind
removingthe feature; the relative safety of it was, along with maintaining the documentation.
 

Code complexity is very much an issue.  Since the two methods are merged 
with a ton of conditionals, working on one without affecting the other 
is a real chore, especially since exclusive backups have *zero* tests. 
Personally I'm not interested in writing those tests so I try not to 
touch the code either.

> I jumped in because there seemed to be an argument going on that all of the cool kids will have moved off the old
interfaceand there was essentially no cost to removing it in v12 or v13, and that didn't correspond to my experience.
 

I don't think anyone believes there will be zero cost, but the issues 
with the exclusive method seem to outweigh the benefits, at least in my 
experience.

-- 
-David
david@pgmasters.net


Re: Remove Deprecated Exclusive Backup Mode

From
Stephen Frost
Date:
Greetings,

* Andres Freund (andres@anarazel.de) wrote:
> On 2019-02-25 08:42:02 -0500, Stephen Frost wrote:
> > * Andres Freund (andres@anarazel.de) wrote:
> > > On 2019-02-24 17:19:22 -0500, Stephen Frost wrote:
> > > > You say above that the new interface is unquestionably an improvement
> > >
> > > FWIW, I think we didn't design it even remotely as well as we ought to
> > > have. It was both a mistake to not offer a version of non-exclusive
> > > backups that works with a client connection (for integration with the
> > > TSMs of this world), and it was a mistake not to offer a commandline
> > > tool that does the non-exclusive pg/start backup.
> >
> > I'm not really following- did you mean to say "without a client
> > connection" above?
>
> Yes.

Ah, alright, that makes more sense then.

You might bring that up with the individual who wrote it.

> > We could still add some kind of commandline tool to help with the
> > non-exclusive pg start/stop backup
>
> That doesn't help, as long as the connection requirement is there. As
> explained elsewhere in this thread, a lot of larger external backup
> infrastructure just gives you a start/stop hook, and that makes using a
> persistent connection hard and fragile.

I tend to agree w/ Magnus in that I'm not sure that this is actually
as common these days as claimed.  That said, I'm not against someone
trying to implement something that doesn't require a connection, though
I understand from discussion with David and his comments elsewhere on
this thread that it isn't that simple a thing to do- and people would
have to migrate to whatever it is anyway.

> > but I'm not really sure exactly what that would look like and I
> > honestly don't have terribly much interest in spending resources on
> > implementing it since it would lack a great many features that we'd
> > then end up wanting to add on... and then we end up backing into
> > building yet another backup tool.
>
> Meh. You are proposing to remove a feature (even if a dangerous
> one). Requiring some minimal compat work to make htat more palpaple
> isn't crazy.  Nor does using backrest or whatnot do *ANYTHING* about the
> users of large company wide backup tools.

Yes, I'm supporting the proposal to remove a dangerous feature.  I
really don't think much more needs to be said than that.  If the feature
wasn't dangerous then I wouldn't be argueing for its removal.  I do feel
bad that pgbackrest hasn't got a solution already to this, but we've had
other priorities.  I do think we'll get there, eventually, but not any
time soon.  Being told that I have to implement another feature so that
we can remove the dangerous one is something I don't agree with.

> > > > I really, honestly, believe what we need to *stop* doing is trying to
> > > > drag along support for old/deprecated interfaces after we've introduced
> > > > new ones, thus avoiding these arguments and the time spent on them, and
> > > > the time spent dealing with implementing and documenting new APIs around
> > > > old ones.  The only thing it seems to unquestionably do is make us argue
> > > > about when we are going to *finally* rip out the old/deprecated API (or
> > > > interface, or whatever you want to call it).
> > >
> > > While I agree that we should remove non-exclusive backups, I VEHEMENTLY
> > > oppose the unconditionality of the above statement. Yes, it's annoying
> > > to keep interfaces around, but unnecessarily inflicting pain to everyone
> > > also is annoying.
> >
> > Alright, then how about we provide a bit of help for everyone who's got
> > a system built around recovery.conf today, instead of just completely
> > ripping that out?
>
> Oh, comeon. Obviously we're sometimes going to have to make breaking
> changes. But you're essentially arguing that we shouldn't provide compat
> where we can come up with a reasonable way to provide backward
> compatibility. And I think that's crazy and will hurt postgres.

This was discussed, *extensively*, previously on this list with a whole
lot of proposals trying to come up with easy ways to "fix" exclusive
backups and I don't really think that this thread has proposed anything
novel in that regard.  Basically, I just don't think it's that easy.
Perhaps I'm wrong, in which case, great, I'd be happy to review a simple
patch that solves it.

> > > That's not to say we shouldn't be better at announcing and then
> > > following through on the deprecation of old interfaces.
> >
> > We announce things have changed every year, and people have five years
> > from that point, during which we'll continue to support them and fix
> > bugs and deal with security issues, to make whatever changes they need
> > to for the new interface.
> >
> > The argument seems to be that people want new features but they don't
> > want to have to do any work to get those new features.
>
> I mean, duh? We can't make that happen all the time, but I don't
> understand why it's surprising to have that as *one* goal that's in
> conflict with other goals?  Your analysis also neglects that a lot of
> software will have to work across multiple supported versions of
> postgres (at the very least in prep for the migration, but also often
> longer) - by having a few versions were both interfaces work, that can
> be made *MUCH* less painful.

Having to support multiple APIs like this really is painful.

Also, I'm pretty well versed in software that has to support multiple PG
major versions across time, using specifically these APIs in fact.
Having both interfaces work at the same time didn't make that any less
painful (pgbackrest supported non-exclusive backup in the same version
as 9.6 support was added, and supports backing up PG back to 8.3).
Perhaps it would help some, but I tend to doubt it based both on that
experience and in the experience that there tends to be two times people
make changes- opportunistically when they realize something has been
deprecated, and when they're forced to because the old interface went
away, and the level of effort and time spent tends to be about the same,
all that's different is when it's done.

> > When is something too big of a change to make in a given major version
> > and we have to, instead, provide backwards compatibility for it?  What
> > is that policy?  How many releases do we have to support it for?  How do
> > we communicate that to our users, so that they know what they can depend
> > on being there across major releases and what they can't?
>
> I literally said that we should have more policy around this.

Great.  I was kinda hoping you might propose something by way of
answering those questions so that maybe we could actually get to a point
of having such a policy.

Thanks!

Stephen

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
Robert Haas
Date:
On Mon, Feb 25, 2019 at 4:38 PM David Steele <david@pgmasters.net> wrote:
> > FWIW, if you weren't selling backrest quite so hard everywhere backups
> > are mentioned, I'd find this thread a lot more convicing.
>
> pgBackRest has not used exclusive backups since the new API was
> introduced in 9.6 so this is not an issue for our users.
>
> Over time we have contributed back to Postgres in areas we thought could
> be improved based on our work on the pgBackRest project: 6ad8ac60,
> 9fe3c644, 017e4f25, 78874531, 449338cc, 98267ee8, 8694cc96, 920a5e50,
> c37b3d08, 5fc1670b, b981df4c.  This does not include the various backup
> related patches that we have reviewed.
>
> If promoting pgBackRest were our primary concern then it would be in our
> interest to allow Postgres exclusive backups to stay broken and
> pg_basebackup to be as primitive as possible.

Hmm, so what you're saying is that you'd like to disable an API that
some non-backrest users are relying upon but which no backrest users
are relying upon.  And you don't understand why some non-backrest
users are opposed to that plan.  Is that a correct summary of your
position?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Remove Deprecated Exclusive Backup Mode

From
Stephen Frost
Date:
Robert,

* Robert Haas (robertmhaas@gmail.com) wrote:
> Hmm, so what you're saying is that you'd like to disable an API that
> some non-backrest users are relying upon but which no backrest users
> are relying upon.  And you don't understand why some non-backrest
> users are opposed to that plan.  Is that a correct summary of your
> position?

This topic has come up more than once and on multiple threads (entirely
independent of pgbackrest and any of the other individuals on this
thread) and I think it needs to be a discussion held at the PGCon
Developer meeting.

I'll propose it.

If you'd like to propose something to fix the, as well discussed,
dangerous exclusive backup method, or further litigate at what point a
dangerous and misleading interface should be removed, I'm sure you'll
find we'd be happy to talk about that.

Thanks,

Stephen

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
Mark Kirkwood
Date:
On 26/02/19 4:53 PM, Robert Haas wrote:
> On Mon, Feb 25, 2019 at 4:38 PM David Steele <david@pgmasters.net> wrote:
>>> FWIW, if you weren't selling backrest quite so hard everywhere backups
>>> are mentioned, I'd find this thread a lot more convicing.
>> pgBackRest has not used exclusive backups since the new API was
>> introduced in 9.6 so this is not an issue for our users.
>>
>> Over time we have contributed back to Postgres in areas we thought could
>> be improved based on our work on the pgBackRest project: 6ad8ac60,
>> 9fe3c644, 017e4f25, 78874531, 449338cc, 98267ee8, 8694cc96, 920a5e50,
>> c37b3d08, 5fc1670b, b981df4c.  This does not include the various backup
>> related patches that we have reviewed.
>>
>> If promoting pgBackRest were our primary concern then it would be in our
>> interest to allow Postgres exclusive backups to stay broken and
>> pg_basebackup to be as primitive as possible.
> Hmm, so what you're saying is that you'd like to disable an API that
> some non-backrest users are relying upon but which no backrest users
> are relying upon.  And you don't understand why some non-backrest
> users are opposed to that plan.  Is that a correct summary of your
> position?
>

+1 to Robert's Management Summary.

ISTM that the onus should be on the patch submitter to provide additions 
to pg_basebackup that make it as painless as possible for those people 
*not* using pgBackRest to continue making backups. Breaking this is just 
not right. Submitting patches that mean that people *must* use 
pgBackRest is also not right IMHO.

Finally, Open Source is about is working together to make the a common 
project (in this case Poistgres) better - not forcing us to use 
something else you have written (even if it is good).

regards

Mark



Re: Remove Deprecated Exclusive Backup Mode

From
Stephen Frost
Date:
Greetings Mark,

* Mark Kirkwood (mark.kirkwood@catalyst.net.nz) wrote:
> ISTM that the onus should be on the patch submitter to provide additions to
> pg_basebackup that make it as painless as possible for those people *not*
> using pgBackRest to continue making backups. Breaking this is just not
> right. Submitting patches that mean that people *must* use pgBackRest is
> also not right IMHO.

I'm sorry that there's some confusion here- to be clear, no one is
required to use pgBackRest.  pg_basebackup works quite well and wouldn't
be impacted by the changes proposed no this thread.  The arguments
against removing the exclusive backup feature don't have anything to do
with pg_basebackup.

Thanks!

Stephen

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
Michael Paquier
Date:
On Mon, Feb 25, 2019 at 08:17:27PM +0200, David Steele wrote:
> Here's the really obvious bad thing: if users do not update their procedures
> and we ignore backup_label.pending on startup then they will end up with a
> corrupt database because it will not replay from the correct checkpoint.  If
> we error on the presence of backup_label.pending then we are right back to
> where we started.

Not really.  If we error on backup_label.pending, we can make the
difference between a backend which has crashed in the middle of an
exclusive backup without replaying anything and a backend which is
started based on a base backup, so an operator can take some action to
see what's wrong with the server.  If you issue an error, users can
also see that their custom backup script is wrong because they forgot
to rename the flag after taking a backup of the data folder(s).

The real pain point about the non-exclusive APIs is really that we
need to keep the connection opened, which can be done easily using
drivers like psycopg2 or such, still that's far from straight-forward
for the average enterprise user, and it can be hard for some companies
to complicate a software stack with more external dependencies.  So
yes, while I can see the long-term benefits of non-exclusive backups,
they are much more complicated to use than exclusive backups.
--
Michael

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
Mark Kirkwood
Date:
On 26/02/19 5:41 PM, Stephen Frost wrote:
> Greetings Mark,
>
> * Mark Kirkwood (mark.kirkwood@catalyst.net.nz) wrote:
>> ISTM that the onus should be on the patch submitter to provide additions to
>> pg_basebackup that make it as painless as possible for those people *not*
>> using pgBackRest to continue making backups. Breaking this is just not
>> right. Submitting patches that mean that people *must* use pgBackRest is
>> also not right IMHO.
> I'm sorry that there's some confusion here- to be clear, no one is
> required to use pgBackRest.  pg_basebackup works quite well and wouldn't
> be impacted by the changes proposed no this thread.  The arguments
> against removing the exclusive backup feature don't have anything to do
> with pg_basebackup.
>
Ah yes (checks pg_basbackup code), you are correct! Reading this thread 
I thought I saw a comment to the effect that pg_basebackup was being 
broken, hence the less than impressed post.

Your relentless bashing of people doing their own backups and heavy 
marketing of pgBackRest - unfortunately - made it easy for me to believe 
that this was a possibility that you might see as ok. So - apologies for 
the misunderstanding, however less marketing of your own product would 
avoid me jumping to the wrong conclusion.

regards

Mark



Re: Remove Deprecated Exclusive Backup Mode

From
David Steele
Date:
On 2/26/19 6:51 AM, Michael Paquier wrote:
> On Mon, Feb 25, 2019 at 08:17:27PM +0200, David Steele wrote:
>> Here's the really obvious bad thing: if users do not update their procedures
>> and we ignore backup_label.pending on startup then they will end up with a
>> corrupt database because it will not replay from the correct checkpoint.  If
>> we error on the presence of backup_label.pending then we are right back to
>> where we started.
> 
> Not really.  If we error on backup_label.pending, we can make the
> difference between a backend which has crashed in the middle of an
> exclusive backup without replaying anything and a backend which is
> started based on a base backup, so an operator can take some action to
> see what's wrong with the server.  If you issue an error, users can
> also see that their custom backup script is wrong because they forgot
> to rename the flag after taking a backup of the data folder(s).

The operator still has a decision to make, manually, just as they do 
now.  The wrong decision may mean a corrupt database.

Here's the scenario:

1) They do a restore, forget to rename backup_label.pending.
2) Postgres won't start, which is the same action we take now.
3) The user is not sure what to do, rename or delete?  They delete, and 
the cluster is corrupted.

Worse, they have scripted the deletion of backup_label so that the 
cluster will restart on crash.  This is the recommendation from our 
documentation after all.  If that script runs after a restore instead of 
a crash, then the cluster will be corrupt -- silently.

-- 
-David
david@pgmasters.net


Re: Remove Deprecated Exclusive Backup Mode

From
Laurenz Albe
Date:
Fujii Masao wrote:
> Another idea is to improve an exclusive backup method so that it will never
> cause such issue. What about changing an exclusive backup mode of
> pg_start_backup() so that it creates something like backup_label.pending file
> instead of backup_label? Then if the database cluster has backup_label.pending
> file but not recovery.signal (this is the case where the database is recovered
> just after the server crashes while an exclusive backup is in progress),
> in this idea, the recovery using that database cluster always ignores
> (or removes) backup_label.pending file and start replaying WAL from
> the REDO location that pg_control file indicates. So this idea enables us to
> work around the issue that an exclusive backup could cause.

Then if you restore a backup, but forget to add the recovery.signal file,
PostgreSQL will happily recover from a wrong checkpoint and you end up with
a corrupted database.

I think the fundamental problem with all these approaches is that there is
no safe way to distinguish a server crashed in backup mode from a restored
backup.  This is what makes the problem so hard.

The existing exclusive backup is in my opinion the safest variant: it refuses
to create a corrupted cluster without manual intervention and gives you a dire
warning to consider if you are doing the right thing.

Yours,
Laurenz Albe



Re: Remove Deprecated Exclusive Backup Mode

From
Stephen Frost
Date:
Greetings,

* Mark Kirkwood (mark.kirkwood@catalyst.net.nz) wrote:
> On 26/02/19 5:41 PM, Stephen Frost wrote:
> >* Mark Kirkwood (mark.kirkwood@catalyst.net.nz) wrote:
> >>ISTM that the onus should be on the patch submitter to provide additions to
> >>pg_basebackup that make it as painless as possible for those people *not*
> >>using pgBackRest to continue making backups. Breaking this is just not
> >>right. Submitting patches that mean that people *must* use pgBackRest is
> >>also not right IMHO.
> >I'm sorry that there's some confusion here- to be clear, no one is
> >required to use pgBackRest.  pg_basebackup works quite well and wouldn't
> >be impacted by the changes proposed no this thread.  The arguments
> >against removing the exclusive backup feature don't have anything to do
> >with pg_basebackup.
>
> Ah yes (checks pg_basbackup code), you are correct! Reading this thread I
> thought I saw a comment to the effect that pg_basebackup was being broken,
> hence the less than impressed post.

No, I don't see breaking pg_basebackup as ok, but I do have concerns
about how it doesn't provide for any validation of the backup unless you
use tar+gzip.  Unlike with exclusive mode, at least you don't run the
risk with pg_basebackup that a crash will mean that PG won't come back
up without intervention.  I can understand how the comments which you
were responding to might have lead to that confusion.

> Your relentless bashing of people doing their own backups and heavy
> marketing of pgBackRest - unfortunately - made it easy for me to believe
> that this was a possibility that you might see as ok. So - apologies for the
> misunderstanding, however less marketing of your own product would avoid me
> jumping to the wrong conclusion.

As for my personal recommendations, if you go back far enough, you can
see where I used to discuss how to use the exclusive API for doing
backups with people, because maybe they didn't like the tools available
at the time (I didn't, after all), but after seeing how many issues come
from doing that and realizing how bad it is that things like our
documented archive_command doesn't ensure that WAL is actually written
out to disk, I realized why dedicated tools had been written and started
to recommend that people pick one of the dedicated tools instead of
writing their own (and I used to recommend the other solution like
barman right alongside pgBackRest, and I still do make the generic
recommendation at times that people not try to build their own backup
tool) but then after seeing things like CIFS mounts happily throwing
kernel errors while returning success on (obviously, actually failed)
fsync() calls, resulting in corrupted backups that would silently be
corrupt if restored from, I realized that having an actual manifest of
all files backed up and their checksums wasn't a "nice to have" kind of
feature for a backup tool but something critical to a backup solution
(thankfully, David knew that right from the start and so it was always
part of pgBackRest) and so I have a pretty hard time recommending
solutions that don't have that these days.

As for back to the main topic of this thread, do I hope that removing
the exclusive backup mode will cause people to look at and consider
existing, purpose-build, backup solutions for PG, rather than trying to
update their one-off backup shell scripts?  Yes, of course I do.  Am I
trying to sneak something in that's going to force them to use
pgBackRest?  No, of course not.  Note that pg_basebackup, barman and
pgBackRest have had support for the non-exclusive API since it was
introduced (or, for barman, even before then if one installed the
pgespresso extension, and pg_basebackup since it was written because it
uses the replication protocol), and so does WAL-G (though I'm not sure
exactly when it got it, or maybe it always had it), and pg_probackup
(which appears to have gotten it in May 2017, less than a year after 9.6
was released), and even the shell-based pitery (which got it at some
point before pg_probackup, it looks like, based on the commits).

Considering there's at least *5* other backup tools which already
support the non-exclusive API, all of which have done so since the
non-exclusive mode was introduced or shortly after, I really have to
question the angle of attack being used here which is claiming that
this thread is motivated by a goal of forcing people to pgBackRest.

The main group who would be impacted by the exclusive mode going away
are those who aren't using an existing solution and who either didn't
get the memo about exclusive mode being deprecated, or decided just to
ignore it, and those are the installations which really do concern me
the most because they're the highest risk for people ending up losing
their data.

Thanks!

Stephen

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
Stephen Frost
Date:
Greetings,

* Laurenz Albe (laurenz.albe@cybertec.at) wrote:
> I think the fundamental problem with all these approaches is that there is
> no safe way to distinguish a server crashed in backup mode from a restored
> backup.  This is what makes the problem so hard.

Right- if you want to just call start/stop and take a snapshot in the
middle and then be able to restore that directly and start up the
database, then there *can't* be any way to distinguish between the two,
which is, I'm pretty sure, where this whole discussion ended up back
during the 9.6 development cycle and why it's still an issue.  If there
was an easy way to fix this, I feel like we would have already.

> The existing exclusive backup is in my opinion the safest variant: it refuses
> to create a corrupted cluster without manual intervention and gives you a dire
> warning to consider if you are doing the right thing.

... it's the least dangerous if you limit yourself to that method, but
that doesn't make it safe. :(

In the end, you basically *have* to have a way of extracting out the
data needed for the backup (start/stop WAL and such) that doesn't make
the running cluster look like it's a backup being restored, and you
*have* to make that information available to the database cluster when
it's restored somehow, and notify PG that it's doing backup recovery and
*not* crash recovery, to eliminate this risk, and that's pretty hard to
manage if all you want to do is snapshot the filesystem.

Of course, you have to have a solution for WAL too and the thought has
crossed my mind that maybe there's something we could do when it comes
to stash all the info needed in the WAL archive, but I'm still not sure
how we'd solve for knowing if we're doing backup recovery or crash
recovery in that case without some kind of marker or something external
telling us that's what we're doing.  As you proposed previously, but
with a bit of a twist, maybe we could just always do backup recovery if
we find a .backup (or whatever) file in the WAL that, when compared to
pg_control, shows that we were in the process of doing a backup...  That
would require that everyone always have a restore_command set, which
wasn't possible before because that went into recovery.conf, but it's
possible to just always have that set now, and that would eliminate the
risk of us running the system out of disk space by keeping all the WAL
that's generated during the backup local.

Obviously, a lot of this is pretty hand-wavy, and you still have the
unfortunate situation that if you're actually recoverying from a crash
that just happened to happen while you were taking a backup then you
could be replaying a heck of a lot more WAL than you needed to, and you
have to have a working restore_command on the primary, and you'd have to
figure out a way for PG to check for these files .backup or whatever
files on startup that doesn't take forever or require stepping through
every WAL segment or something, but maybe those concerns could be
addressed.

Thanks!

Stephen

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
Robert Haas
Date:
On Tue, Feb 26, 2019 at 1:49 AM David Steele <david@pgmasters.net> wrote:
> The operator still has a decision to make, manually, just as they do
> now.  The wrong decision may mean a corrupt database.
>
> Here's the scenario:
>
> 1) They do a restore, forget to rename backup_label.pending.
> 2) Postgres won't start, which is the same action we take now.
> 3) The user is not sure what to do, rename or delete?  They delete, and
> the cluster is corrupted.
>
> Worse, they have scripted the deletion of backup_label so that the
> cluster will restart on crash.  This is the recommendation from our
> documentation after all.  If that script runs after a restore instead of
> a crash, then the cluster will be corrupt -- silently.

Sure, that's true.  On the other hand, it's not like someone can't
manage to use the non-exclusive mode and fail to drop the backup_label
and tablespace_map files into the right places.  This procedure is
tedious and error-prone no matter which way you do it, which is one
reason I don't believe that the exclusive backup method is nearly as
bad as you're making it out to be.

Mind you, I'm not really defending the proposal to add a new backup
mode along the likes Fujii Masao is proposing.  I don't think the
situation will be made less error-prone by having three ways to do it
instead of two.  But I think we probably would've been happier if we'd
designed it the way he's now proposing in the first place, instead of
the way we actually did.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Remove Deprecated Exclusive Backup Mode

From
Fujii Masao
Date:
On Tue, Feb 26, 2019 at 3:17 AM David Steele <david@pgmasters.net> wrote:
>
> On 2/25/19 7:50 PM, Fujii Masao wrote:
> > On Mon, Feb 25, 2019 at 10:49 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> >>
> >> I'm not playing devil's advocate here to annoy you.  I see the problems
> >> with the exclusive backup, and I see how it can hurt people.
> >> I just think that removing exclusive backup without some kind of help
> >> like Andres sketched above will make people unhappy.
> >
> > +1
> >
> > Another idea is to improve an exclusive backup method so that it will never
> > cause such issue. What about changing an exclusive backup mode of
> > pg_start_backup() so that it creates something like backup_label.pending file
> > instead of backup_label? Then if the database cluster has backup_label.pending
> > file but not recovery.signal (this is the case where the database is recovered
> > just after the server crashes while an exclusive backup is in progress),
> > in this idea, the recovery using that database cluster always ignores
> > (or removes) backup_label.pending file and start replaying WAL from
> > the REDO location that pg_control file indicates. So this idea enables us to
> > work around the issue that an exclusive backup could cause.
>
> It's an interesting idea.
>
> > On the other hand, the downside of this idea is that the users need to change
> > the recovery procedure. When they want to do PITR using the backup having
> > backup_label.pending, they need to not only create recovery.signal but also
> > rename backup_label.pending to backup_label. Rename of backup_label file
> > is brand-new step for their recovery procedure, and changing the recovery
> > procedure might be painful for some users. But IMO it's less painful than
> > removing an exclusive backup API at all.
>
> Well, given that we have invalidated all prior recovery procedures in
> PG12 I'm not sure how big a deal that is.  Of course, it's too late make
> a change like this for PG12.
>
> > Thought?
>
> Here's the really obvious bad thing: if users do not update their
> procedures and we ignore backup_label.pending on startup then they will
> end up with a corrupt database because it will not replay from the
> correct checkpoint.  If we error on the presence of backup_label.pending
> then we are right back to where we started.

No. In this case, since backup_label.pending and recovery.signal exist,
as I described in my previous post, the server stops the recovery with
PANIC error before corrupting the database. Then the operator can
rename backup_label.pending to backup_label and restart the recovery
safely.

So, let me clarify the situations;

(1) If backup_label and recovery.signal exist, the recovery starts safely.
       This is the normal case of recovery from the base backup.

(2)If backup_label.pending and recovery.signal exist, as described above,
       PANIC error happens at the start of recovery. This case can happen
       if the operator forgets to rename backup_label.pending, i.e.,
       operation mistake. So, after PANIC, the operator needs to fix her or
       his mistake (i.e., rename backup_label.pending) and restart
       the recovery.

(3) If backup_label.pending exists but recovery.signal doesn't, the server
       ignores (or removes) backup_label.pending and do the recovery
       starting the pg_control's REDO location. This case can happen if
       the server crashes while an exclusive backup is in progress.
       So crash-in-the-middle-of-backup doesn't prevent the recovery from
       starting in this idea.

Regards,

-- 
Fujii Masao


Re: Remove Deprecated Exclusive Backup Mode

From
Robert Haas
Date:
On Tue, Feb 26, 2019 at 12:20 PM Fujii Masao <masao.fujii@gmail.com> wrote:
> So, let me clarify the situations;
>
> (1) If backup_label and recovery.signal exist, the recovery starts safely.
>        This is the normal case of recovery from the base backup.
>
> (2)If backup_label.pending and recovery.signal exist, as described above,
>        PANIC error happens at the start of recovery. This case can happen
>        if the operator forgets to rename backup_label.pending, i.e.,
>        operation mistake. So, after PANIC, the operator needs to fix her or
>        his mistake (i.e., rename backup_label.pending) and restart
>        the recovery.
>
> (3) If backup_label.pending exists but recovery.signal doesn't, the server
>        ignores (or removes) backup_label.pending and do the recovery
>        starting the pg_control's REDO location. This case can happen if
>        the server crashes while an exclusive backup is in progress.
>        So crash-in-the-middle-of-backup doesn't prevent the recovery from
>        starting in this idea.

The if-conditions for 1 and 2 appear to be the same, which is confusing.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Remove Deprecated Exclusive Backup Mode

From
Christophe Pettus
Date:

> On Feb 26, 2019, at 09:26, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Tue, Feb 26, 2019 at 12:20 PM Fujii Masao <masao.fujii@gmail.com> wrote:
>> So, let me clarify the situations;
>>
>> (1) If backup_label and recovery.signal exist, the recovery starts safely.
>>       This is the normal case of recovery from the base backup.
>>
>> (2)If backup_label.pending and recovery.signal exist, as described above,
>>       PANIC error happens at the start of recovery. This case can happen
>>       if the operator forgets to rename backup_label.pending, i.e.,
>>       operation mistake. So, after PANIC, the operator needs to fix her or
>>       his mistake (i.e., rename backup_label.pending) and restart
>>       the recovery.
>>
>> (3) If backup_label.pending exists but recovery.signal doesn't, the server
>>       ignores (or removes) backup_label.pending and do the recovery
>>       starting the pg_control's REDO location. This case can happen if
>>       the server crashes while an exclusive backup is in progress.
>>       So crash-in-the-middle-of-backup doesn't prevent the recovery from
>>       starting in this idea.
>
> The if-conditions for 1 and 2 appear to be the same, which is confusing.

I believe #1 is when backup_label (no .pending) exists, #2 is when backup_label.pending (with .pending) exists.

At the absolute minimum, this discussion has convinced me that we need to create a wiki page to accurately describe the
failurescenarios for both exclusive and non-exclusive backups, and the recovery actions for them.  If it exists
already,my search attempts weren't successful.  If it doesn't, I'm happy to start one. 
--
-- Christophe Pettus
   xof@thebuild.com



Re: Remove Deprecated Exclusive Backup Mode

From
Fujii Masao
Date:
On Wed, Feb 27, 2019 at 2:27 AM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Tue, Feb 26, 2019 at 12:20 PM Fujii Masao <masao.fujii@gmail.com> wrote:
> > So, let me clarify the situations;
> >
> > (1) If backup_label and recovery.signal exist, the recovery starts safely.
> >        This is the normal case of recovery from the base backup.
> >
> > (2)If backup_label.pending and recovery.signal exist, as described above,
> >        PANIC error happens at the start of recovery. This case can happen
> >        if the operator forgets to rename backup_label.pending, i.e.,
> >        operation mistake. So, after PANIC, the operator needs to fix her or
> >        his mistake (i.e., rename backup_label.pending) and restart
> >        the recovery.
> >
> > (3) If backup_label.pending exists but recovery.signal doesn't, the server
> >        ignores (or removes) backup_label.pending and do the recovery
> >        starting the pg_control's REDO location. This case can happen if
> >        the server crashes while an exclusive backup is in progress.
> >        So crash-in-the-middle-of-backup doesn't prevent the recovery from
> >        starting in this idea.
>
> The if-conditions for 1 and 2 appear to be the same, which is confusing.

Using other name like backup_in_progress instead of backup_label.pending
may help a bit for your concern? That is,

(1) backup_label and recovery.signal
(2) backup_in_progress and recovery.signal

Regards,

-- 
Fujii Masao


Re: Remove Deprecated Exclusive Backup Mode

From
Robert Haas
Date:
On Tue, Feb 26, 2019 at 12:31 PM Christophe Pettus <xof@thebuild.com> wrote:
> I believe #1 is when backup_label (no .pending) exists, #2 is when backup_label.pending (with .pending) exists.

Oh, whoops.  I get it now.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Remove Deprecated Exclusive Backup Mode

From
Fujii Masao
Date:
On Tue, Feb 26, 2019 at 3:49 PM David Steele <david@pgmasters.net> wrote:
>
> On 2/26/19 6:51 AM, Michael Paquier wrote:
> > On Mon, Feb 25, 2019 at 08:17:27PM +0200, David Steele wrote:
> >> Here's the really obvious bad thing: if users do not update their procedures
> >> and we ignore backup_label.pending on startup then they will end up with a
> >> corrupt database because it will not replay from the correct checkpoint.  If
> >> we error on the presence of backup_label.pending then we are right back to
> >> where we started.
> >
> > Not really.  If we error on backup_label.pending, we can make the
> > difference between a backend which has crashed in the middle of an
> > exclusive backup without replaying anything and a backend which is
> > started based on a base backup, so an operator can take some action to
> > see what's wrong with the server.  If you issue an error, users can
> > also see that their custom backup script is wrong because they forgot
> > to rename the flag after taking a backup of the data folder(s).
>
> The operator still has a decision to make, manually, just as they do
> now.  The wrong decision may mean a corrupt database.

Even in non-exclusive backup mode, the wrong decision may mean
a corrupt database. For example, what if the user may forget to save
the backup_label in the backup taken by using non-exclusive backup
method? So I'm not sure if this is the matter only for an exclusive backup.

Regards,

-- 
Fujii Masao


Re: Remove Deprecated Exclusive Backup Mode

From
Magnus Hagander
Date:
On Tue, Feb 26, 2019 at 6:31 PM Christophe Pettus <xof@thebuild.com> wrote:


> On Feb 26, 2019, at 09:26, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Tue, Feb 26, 2019 at 12:20 PM Fujii Masao <masao.fujii@gmail.com> wrote:
>> So, let me clarify the situations;
>>
>> (1) If backup_label and recovery.signal exist, the recovery starts safely.
>>       This is the normal case of recovery from the base backup.
>>
>> (2)If backup_label.pending and recovery.signal exist, as described above,
>>       PANIC error happens at the start of recovery. This case can happen
>>       if the operator forgets to rename backup_label.pending, i.e.,
>>       operation mistake. So, after PANIC, the operator needs to fix her or
>>       his mistake (i.e., rename backup_label.pending) and restart
>>       the recovery.
>>
>> (3) If backup_label.pending exists but recovery.signal doesn't, the server
>>       ignores (or removes) backup_label.pending and do the recovery
>>       starting the pg_control's REDO location. This case can happen if
>>       the server crashes while an exclusive backup is in progress.
>>       So crash-in-the-middle-of-backup doesn't prevent the recovery from
>>       starting in this idea.
>
> The if-conditions for 1 and 2 appear to be the same, which is confusing.

I believe #1 is when backup_label (no .pending) exists, #2 is when backup_label.pending (with .pending) exists.

At the absolute minimum, this discussion has convinced me that we need to create a wiki page to accurately describe the failure scenarios for both exclusive and non-exclusive backups, and the recovery actions for them.  If it exists already, my search attempts weren't successful.  If it doesn't, I'm happy to start one.

That should not be a wiki page, really, that should be part of the main documentation. 

It can be drafted on the wiki of course, but we *really* should get something like that into the docs. Because that's what people are going to read. They'd only go look for the information on the wiki if they knew there was a problem and needed the details, and the big issue is that people don't know there are issues in the first place.

--

Re: Remove Deprecated Exclusive Backup Mode

From
Andres Freund
Date:
Hi,

On 2019-02-26 20:38:17 +0100, Magnus Hagander wrote:
> That should not be a wiki page, really, that should be part of the main
> documentation.

+1

> It can be drafted on the wiki of course, but we *really* should get
> something like that into the docs. Because that's what people are going to
> read. They'd only go look for the information on the wiki if they knew
> there was a problem and needed the details, and the big issue is that
> people don't know there are issues in the first place.

Also, there's so much wrong stuff on the wiki that people that know to
look there just don't believe what they read.

Greetings,

Andres Freund


Re: Remove Deprecated Exclusive Backup Mode

From
Peter Geoghegan
Date:
On Mon, Feb 25, 2019 at 10:49 PM David Steele <david@pgmasters.net> wrote:
> Worse, they have scripted the deletion of backup_label so that the
> cluster will restart on crash.  This is the recommendation from our
> documentation after all.  If that script runs after a restore instead of
> a crash, then the cluster will be corrupt -- silently.

I've always thought that the biggest problem is the documentation.
While the "Continuous Archiving and Point-in-Time Recovery (PITR)"
section of the user docs isn't quite as bad as it used to be, it's
still fundamentally structured "bottom up". That's bad.

I think that it would improve matters enormously if that section of
the documentation, and possibly the entire chapter was restructured to
be "top down". This should include prominent links to specific third
party projects, including Barman, pgBackRest, and others. This whole
area has become much more contentious than it needs to be, and I fear
that we cannot see the forest for the trees.

That said, I don't think that we should be in a hurry to fully remove
the exclusive backup mode. Home-spun backup solutions are a bad idea,
but removing exclusive mode strikes me as corrective over-reaction,
and a technical solution to an organizational problem. There are
definitely people that use the interface in a sensible way, despite
its shortcomings.

-- 
Peter Geoghegan


Re: Remove Deprecated Exclusive Backup Mode

From
Chapman Flack
Date:
On 2/26/19 2:46 PM, Andres Freund wrote:

> Also, there's so much wrong stuff on the wiki that people that know to
> look there just don't believe what they read.

Should there be a wiki errata page on the wiki?

I'm fairly serious ... for those times when you're reading the Foo
page on the wiki, and realize it still describes the way Foo worked
before 0abcde, and you don't feel you have time to write a polished
update ... go to the Errata page and add a link to Foo, the date, your
name, and "noticed this still describes how Foo worked before 0abcde".

Or is there a MediaWiki template already for that? You could add the
template right at the top of the page, and it would automagically be
added to the category "pages with errata".

That category could be a starting point for anybody who has a bit of
time to improve some wiki pages, and is looking for likely candidates.

Regards,
-Chap


Re: Remove Deprecated Exclusive Backup Mode

From
Laurenz Albe
Date:
Fujii Masao wrote:
> So, let me clarify the situations;
> 
> (3) If backup_label.pending exists but recovery.signal doesn't, the server
>        ignores (or removes) backup_label.pending and do the recovery
>        starting the pg_control's REDO location. This case can happen if
>        the server crashes while an exclusive backup is in progress.
>        So crash-in-the-middle-of-backup doesn't prevent the recovery from
>        starting in this idea

That's where I see the problem with your idea.

It is fairly easy for someone to restore a backup and then just start
the server without first creating "recovery.signal", and that would
lead to data corruption.

Indeed, if I didn't know much about databases and were faced with
the task of restoring a PostgreSQL database from backup in a hurry,
that is probably what I would do.

Having thought some about the whole problem area, I believe that there
is no safe way to disambiguate a backup from a server crashed in backup
mode.

The non-exclusive backup mode actually suffers from a similar problem ---
if you don't add "backup_label" to the backup afterwards, it will lead
to database corruption.

The only fool-proof tool we have in core is pg_basebackup.

Any really good backup solution that wishes to integrate with a third-
party backup tool will have to get that tool to include "backup_label"
with the backup itself.

For that, it would need to know the contents of "backup_label" *before*
pg_stop_backup().  Is there a good reason why it is pg_stop_backup()
and not pg_start_backup() that provides that information?

Yours,
Laurenz Albe



Re: Remove Deprecated Exclusive Backup Mode

From
Christophe Pettus
Date:

> On Feb 26, 2019, at 11:38, Magnus Hagander <magnus@hagander.net> wrote:
> That should not be a wiki page, really, that should be part of the main documentation.

I was just suggesting using a wiki page to draft it before we drop it into the main documentation.  I'm open to other
suggestions,of course! 

--
-- Christophe Pettus
   xof@thebuild.com



Re: Remove Deprecated Exclusive Backup Mode

From
Fujii Masao
Date:
On Wed, Feb 27, 2019 at 4:35 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
>
> Fujii Masao wrote:
> > So, let me clarify the situations;
> >
> > (3) If backup_label.pending exists but recovery.signal doesn't, the server
> >        ignores (or removes) backup_label.pending and do the recovery
> >        starting the pg_control's REDO location. This case can happen if
> >        the server crashes while an exclusive backup is in progress.
> >        So crash-in-the-middle-of-backup doesn't prevent the recovery from
> >        starting in this idea
>
> That's where I see the problem with your idea.
>
> It is fairly easy for someone to restore a backup and then just start
> the server without first creating "recovery.signal", and that would
> lead to data corruption.

Isn't this case problematic even when a backup was taken by pg_basebackup?
Because of lack of recovery.signal, no archived WAL files are replayed and
the database may not reach to the latest point.

Regards,

-- 
Fujii Masao


Re: Remove Deprecated Exclusive Backup Mode

From
Stephen Frost
Date:
Greetings,

* Fujii Masao (masao.fujii@gmail.com) wrote:
> On Wed, Feb 27, 2019 at 4:35 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> >
> > Fujii Masao wrote:
> > > So, let me clarify the situations;
> > >
> > > (3) If backup_label.pending exists but recovery.signal doesn't, the server
> > >        ignores (or removes) backup_label.pending and do the recovery
> > >        starting the pg_control's REDO location. This case can happen if
> > >        the server crashes while an exclusive backup is in progress.
> > >        So crash-in-the-middle-of-backup doesn't prevent the recovery from
> > >        starting in this idea
> >
> > That's where I see the problem with your idea.
> >
> > It is fairly easy for someone to restore a backup and then just start
> > the server without first creating "recovery.signal", and that would
> > lead to data corruption.
>
> Isn't this case problematic even when a backup was taken by pg_basebackup?
> Because of lack of recovery.signal, no archived WAL files are replayed and
> the database may not reach to the latest point.

There's a number of problems, imv, that I think have just been
ignored/missed when it comes to the recovery.conf removal and tools like
pg_basebackup, see this msg also:

https://postgr.es/m/20181127153405.GX3415@tamriel.snowman.net

At this point, I think we need to put these issues on the v12 open items
page so that we don't forget about them and get these things fixed
before v12 comes out.

Thanks!

Stephen

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
David Steele
Date:
On 2/28/19 4:44 PM, Fujii Masao wrote:
> On Wed, Feb 27, 2019 at 4:35 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
>>
>> Fujii Masao wrote:
>>> So, let me clarify the situations;
>>>
>>> (3) If backup_label.pending exists but recovery.signal doesn't, the server
>>>         ignores (or removes) backup_label.pending and do the recovery
>>>         starting the pg_control's REDO location. This case can happen if
>>>         the server crashes while an exclusive backup is in progress.
>>>         So crash-in-the-middle-of-backup doesn't prevent the recovery from
>>>         starting in this idea
>>
>> That's where I see the problem with your idea.
>>
>> It is fairly easy for someone to restore a backup and then just start
>> the server without first creating "recovery.signal", and that would
>> lead to data corruption.
> 
> Isn't this case problematic even when a backup was taken by pg_basebackup?
> Because of lack of recovery.signal, no archived WAL files are replayed and
> the database may not reach to the latest point.

It is problematic because we have a message encouraging users to delete 
backup_label when in fact they should be creating recovery.signal.

If the required WAL is present the cluster will not be corrupted.  It 
just may not play forward as far as you would prefer.

-- 
-David
david@pgmasters.net


Re: Remove Deprecated Exclusive Backup Mode

From
David Steele
Date:
On 2/27/19 8:22 PM, Christophe Pettus wrote:
> 
> 
>> On Feb 26, 2019, at 11:38, Magnus Hagander <magnus@hagander.net> wrote:
>> That should not be a wiki page, really, that should be part of the main documentation.
> 
> I was just suggesting using a wiki page to draft it before we drop it into the main documentation.  I'm open to other
suggestions,of course!
 

It seems to me that the best way to discuss this is via a patch to the 
main documentation.  I've written something to get us started:

https://commitfest.postgresql.org/22/2042/

-- 
-David
david@pgmasters.net


Re: Remove Deprecated Exclusive Backup Mode

From
Martín Marqués
Date:
El 28/2/19 a las 15:13, David Steele escribió:
> 
> It seems to me that the best way to discuss this is via a patch to the
> main documentation.  I've written something to get us started:
> 
> https://commitfest.postgresql.org/22/2042/

+   <para>
+     The exclusive backup method is deprecated and should be avoided in
favor
+     of the non-exclusive backup method or
+     <application>pg_basebackup</application>.
+   </para>


Isn't pg_basebackup a non-exclusive backup method?

-- 
Martín Marqués                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


Re: Remove Deprecated Exclusive Backup Mode

From
Michael Paquier
Date:
On Thu, Feb 28, 2019 at 11:07:47PM -0300, Martín Marqués wrote:
> El 28/2/19 a las 15:13, David Steele escribió:
> +   <para>
> +     The exclusive backup method is deprecated and should be avoided in
> favor
> +     of the non-exclusive backup method or
> +     <application>pg_basebackup</application>.
> +   </para>
>
> Isn't pg_basebackup a non-exclusive backup method?

It seems to me that it is exactly what this sentence means.  Perhaps
it should be splitted into two sentences for better clarity?
--
Michael

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
"David G. Johnston"
Date:
On Thu, Feb 28, 2019 at 7:51 PM Michael Paquier <michael@paquier.xyz> wrote:
> On Thu, Feb 28, 2019 at 11:07:47PM -0300, Martín Marqués wrote:
> > El 28/2/19 a las 15:13, David Steele escribió:
> > +   <para>
> > +     The exclusive backup method is deprecated and should be avoided in
> > favor
> > +     of the non-exclusive backup method or
> > +     <application>pg_basebackup</application>.
> > +   </para>
> >
> > Isn't pg_basebackup a non-exclusive backup method?
>
> It seems to me that it is exactly what this sentence means.  Perhaps
> it should be splitted into two sentences for better clarity?

Comments on the documentation patch should be placed on the thread
linked to the documentation patch as opposed to here.

David J.


Re: Remove Deprecated Exclusive Backup Mode

From
Martín Marqués
Date:
El jue., 28 de feb. de 2019 a la(s) 23:50, Michael Paquier
(michael@paquier.xyz) escribió:
>
> On Thu, Feb 28, 2019 at 11:07:47PM -0300, Martín Marqués wrote:
> > El 28/2/19 a las 15:13, David Steele escribió:
> > +   <para>
> > +     The exclusive backup method is deprecated and should be avoided in
> > favor
> > +     of the non-exclusive backup method or
> > +     <application>pg_basebackup</application>.
> > +   </para>
> >
> > Isn't pg_basebackup a non-exclusive backup method?
>
> It seems to me that it is exactly what this sentence means.  Perhaps
> it should be splitted into two sentences for better clarity?

When I read that phrase I see 2 options: non-exclusive backups, and
pg_basebackup. It's as if pg_basebackup wasn't included in the former
group. It might bring some confusion.

--
Martín Marqués                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


Re: Remove Deprecated Exclusive Backup Mode

From
David Steele
Date:
On 3/1/19 11:36 AM, Martín Marqués wrote:
> El jue., 28 de feb. de 2019 a la(s) 23:50, Michael Paquier
> (michael@paquier.xyz) escribió:
>>
>> On Thu, Feb 28, 2019 at 11:07:47PM -0300, Martín Marqués wrote:
>>> El 28/2/19 a las 15:13, David Steele escribió:
>>> +   <para>
>>> +     The exclusive backup method is deprecated and should be avoided in
>>> favor
>>> +     of the non-exclusive backup method or
>>> +     <application>pg_basebackup</application>.
>>> +   </para>
>>>
>>> Isn't pg_basebackup a non-exclusive backup method?
>>
>> It seems to me that it is exactly what this sentence means.  Perhaps
>> it should be splitted into two sentences for better clarity?
> 
> When I read that phrase I see 2 options: non-exclusive backups, and
> pg_basebackup. It's as if pg_basebackup wasn't included in the former
> group. It might bring some confusion.

This has been updated on the other thread and I CC'd you.

-- 
-David
david@pgmasters.net


Re: Remove Deprecated Exclusive Backup Mode

From
Bruce Momjian
Date:
On Thu, Feb 28, 2019 at 05:08:24PM +0200, David Steele wrote:
> On 2/28/19 4:44 PM, Fujii Masao wrote:
> > On Wed, Feb 27, 2019 at 4:35 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> > > 
> > > Fujii Masao wrote:
> > > > So, let me clarify the situations;
> > > > 
> > > > (3) If backup_label.pending exists but recovery.signal doesn't, the server
> > > >         ignores (or removes) backup_label.pending and do the recovery
> > > >         starting the pg_control's REDO location. This case can happen if
> > > >         the server crashes while an exclusive backup is in progress.
> > > >         So crash-in-the-middle-of-backup doesn't prevent the recovery from
> > > >         starting in this idea
> > > 
> > > That's where I see the problem with your idea.
> > > 
> > > It is fairly easy for someone to restore a backup and then just start
> > > the server without first creating "recovery.signal", and that would
> > > lead to data corruption.
> > 
> > Isn't this case problematic even when a backup was taken by pg_basebackup?
> > Because of lack of recovery.signal, no archived WAL files are replayed and
> > the database may not reach to the latest point.
> 
> It is problematic because we have a message encouraging users to delete
> backup_label when in fact they should be creating recovery.signal.

Should we issue a client NOTICE/WARNING message as part of
pg_start_backup to indicate what to do if the server crashes before
pg_stop_backup()?

-- 
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +


Re: Remove Deprecated Exclusive Backup Mode

From
David Steele
Date:
On 3/5/19 12:36 AM, Bruce Momjian wrote:
> On Thu, Feb 28, 2019 at 05:08:24PM +0200, David Steele wrote:
>> On 2/28/19 4:44 PM, Fujii Masao wrote:
>>> On Wed, Feb 27, 2019 at 4:35 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
>>>>
>>>> Fujii Masao wrote:
>>>>> So, let me clarify the situations;
>>>>>
>>>>> (3) If backup_label.pending exists but recovery.signal doesn't, the server
>>>>>          ignores (or removes) backup_label.pending and do the recovery
>>>>>          starting the pg_control's REDO location. This case can happen if
>>>>>          the server crashes while an exclusive backup is in progress.
>>>>>          So crash-in-the-middle-of-backup doesn't prevent the recovery from
>>>>>          starting in this idea
>>>>
>>>> That's where I see the problem with your idea.
>>>>
>>>> It is fairly easy for someone to restore a backup and then just start
>>>> the server without first creating "recovery.signal", and that would
>>>> lead to data corruption.
>>>
>>> Isn't this case problematic even when a backup was taken by pg_basebackup?
>>> Because of lack of recovery.signal, no archived WAL files are replayed and
>>> the database may not reach to the latest point.
>>
>> It is problematic because we have a message encouraging users to delete
>> backup_label when in fact they should be creating recovery.signal.
> 
> Should we issue a client NOTICE/WARNING message as part of
> pg_start_backup to indicate what to do if the server crashes before
> pg_stop_backup()?

I'm not in favor of this since it will just encourage users to change 
their log level and possibly miss important notices/warnings.

Regards,
-- 
-David
david@pgmasters.net


Re: Remove Deprecated Exclusive Backup Mode

From
David Steele
Date:
On 12/11/18 5:24 PM, David Steele wrote:
> 
> Attached is the patch.
> 
> I was a bit surprised by how much code went away.  There was a lot of
> code dedicated to making sure that backup_label got renamed on shutdown,
> that there was not an exclusive backup running, etc.
> 
> There were no tests for exclusive backup so the test changes were minimal.
> 
> I did have to replace one "hot backup" in
> 010_logical_decoding_timelines.pl.  I'm not sure why the backup was
> being done this way, or why the standby needs a copy of pg_replslot
> (which is not copied by pg_basebackup).  It might be worth looking at
> but it seemed out of scope for this patch.

Rebased patch is attached.

I also fixed the issue that Robert pointed out with regard to changing 
the function signatures. It's a bit weird that there are optional 
parameters when the last parameter is really not optional (i.e. the 
default will lead to an error). Any suggestions are appreciated.

Reading back through the thread, the major objection to the original 
patch was that it was submitted too soon. Hopefully the elapsed time 
addresses that concern.

Another common objection is that the new API is hard to use from bash 
scripts. This is arguably still true, but Laurenz has demonstrated that 
bash is capable of this feat:

https://github.com/cybertec-postgresql/safe-backup

Lastly, there is some concern about getting the backup label too late 
when doing snapshots or using traditional backup software. It would 
certainly be possible to return the backup_label and tablespace_map from 
pg_start_backup() and let the user make the choice about what to do with 
them. Any of these methods will require some scripting so I don't see 
why the files couldn't be written as, for example, backup_label.snap and 
tablespace_map.snap and then renamed by the restore script. The patch 
does not currently make this change, but it could be added pretty easily 
if that overcomes this objection.

Regards,
-- 
-David
david@pgmasters.net

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
Stephen Frost
Date:
Greetings,

* David Steele (david@pgmasters.net) wrote:
> Lastly, there is some concern about getting the backup label too late when
> doing snapshots or using traditional backup software. It would certainly be
> possible to return the backup_label and tablespace_map from
> pg_start_backup() and let the user make the choice about what to do with
> them. Any of these methods will require some scripting so I don't see why
> the files couldn't be written as, for example, backup_label.snap and
> tablespace_map.snap and then renamed by the restore script. The patch does
> not currently make this change, but it could be added pretty easily if that
> overcomes this objection.

Of course, if someone just restored that snapshot without actually
moving the files into place they'd get a corrupted database without any
indication of anything being wrong...

And if we checked for those files on startup and complained about them
being there (called .snap) then we're back into the "crash during backup
causes PG to fail to start" situation.

All of which is, as I recall, is why we have the API we do today.

As such, doing that doesn't strike me as an improvement over using the
new API and making it abundently clear that it's not so simple as people
might like to think it is...

Thanks,

Stephen

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
David Steele
Date:
On 6/30/20 6:13 PM, Stephen Frost wrote:
> Greetings,
> 
> * David Steele (david@pgmasters.net) wrote:
>> Lastly, there is some concern about getting the backup label too late when
>> doing snapshots or using traditional backup software. It would certainly be
>> possible to return the backup_label and tablespace_map from
>> pg_start_backup() and let the user make the choice about what to do with
>> them. Any of these methods will require some scripting so I don't see why
>> the files couldn't be written as, for example, backup_label.snap and
>> tablespace_map.snap and then renamed by the restore script. The patch does
>> not currently make this change, but it could be added pretty easily if that
>> overcomes this objection.
> 
> Of course, if someone just restored that snapshot without actually
> moving the files into place they'd get a corrupted database without any
> indication of anything being wrong...
> 
> And if we checked for those files on startup and complained about them
> being there (called .snap) then we're back into the "crash during backup
> causes PG to fail to start" situation.
> 
> All of which is, as I recall, is why we have the API we do today.

I'm certainly not proposing that we ignore .snap files (or whatever). 
There are a many ways a restore can be done incorrectly and not be 
detected. The restore script would be responsible for knowing the rules 
and renaming the files or erroring out.

> As such, doing that doesn't strike me as an improvement over using the
> new API and making it abundently clear that it's not so simple as people
> might like to think it is...

Sure, backup is complicated, but I think there's an argument for 
providing the backup_label and tablespace_map files at the beginning of 
the backup since they are available at that point. And maybe put some 
scary language about storing them in PGDATA.

Regards,
-- 
-David
david@pgmasters.net



Re: Remove Deprecated Exclusive Backup Mode

From
Stephen Frost
Date:
Greetings,

* David Steele (david@pgmasters.net) wrote:
> On 6/30/20 6:13 PM, Stephen Frost wrote:
> >* David Steele (david@pgmasters.net) wrote:
> >>Lastly, there is some concern about getting the backup label too late when
> >>doing snapshots or using traditional backup software. It would certainly be
> >>possible to return the backup_label and tablespace_map from
> >>pg_start_backup() and let the user make the choice about what to do with
> >>them. Any of these methods will require some scripting so I don't see why
> >>the files couldn't be written as, for example, backup_label.snap and
> >>tablespace_map.snap and then renamed by the restore script. The patch does
> >>not currently make this change, but it could be added pretty easily if that
> >>overcomes this objection.
> >
> >Of course, if someone just restored that snapshot without actually
> >moving the files into place they'd get a corrupted database without any
> >indication of anything being wrong...
> >
> >And if we checked for those files on startup and complained about them
> >being there (called .snap) then we're back into the "crash during backup
> >causes PG to fail to start" situation.
> >
> >All of which is, as I recall, is why we have the API we do today.
>
> I'm certainly not proposing that we ignore .snap files (or whatever). There
> are a many ways a restore can be done incorrectly and not be detected. The
> restore script would be responsible for knowing the rules and renaming the
> files or erroring out.

If we don't ignore them then what are we going to do when they exist..?

> >As such, doing that doesn't strike me as an improvement over using the
> >new API and making it abundently clear that it's not so simple as people
> >might like to think it is...
>
> Sure, backup is complicated, but I think there's an argument for providing
> the backup_label and tablespace_map files at the beginning of the backup
> since they are available at that point. And maybe put some scary language
> about storing them in PGDATA.

but still require that the connection be kept open until the
pg_stop_backup() is called?  I suppose we could do that but it seems
unlikely to change much for anyone.

De-deprecating the exclusive backup mode and returning that info at the
start and telling users "put this anywhere EXCEPT this specific place,
and make sure you DO put these files in that exact place when you
restore" would perhaps be closer to what folks who really, really, want
to do snapshot-based backups would want... if that ends up being a net
positive for users or not is questionable imv.  One could imagine a set
of scripts that would save that info to a directory above PGDATA but on
the appropriate volume to be included in the snapshot, and then on
restore would put them into the right place to allow PG to restore from
the backup properly, removing the complication around having to save
that data elsewhere or snapshot another filesystem after the backup is
run or such.

Thanks,

Stephen

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
Bruce Momjian
Date:
On Tue, Jun 30, 2020 at 07:10:46PM -0400, Stephen Frost wrote:
> De-deprecating the exclusive backup mode and returning that info at the
> start and telling users "put this anywhere EXCEPT this specific place,
> and make sure you DO put these files in that exact place when you
> restore" would perhaps be closer to what folks who really, really, want
> to do snapshot-based backups would want... if that ends up being a net

Agreed.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EnterpriseDB                             https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee




Re: Remove Deprecated Exclusive Backup Mode

From
Stephen Frost
Date:
Greetings,

* Bruce Momjian (bruce@momjian.us) wrote:
> On Tue, Jun 30, 2020 at 07:10:46PM -0400, Stephen Frost wrote:
> > De-deprecating the exclusive backup mode and returning that info at the
> > start and telling users "put this anywhere EXCEPT this specific place,
> > and make sure you DO put these files in that exact place when you
> > restore" would perhaps be closer to what folks who really, really, want
> > to do snapshot-based backups would want... if that ends up being a net
>
> Agreed.

Alright, in that case I think we need a victim^Wvolunteer to step up to
rewrite the documentation for backup for these two different methods,
with very clear instructions on how to do this properly (which really
should also include removing the particularly bad 'cp' command usage).

Maybe I'm wrong, but as I recall David already tried to do that once and
I doubt he's going to be interested in another round of that.

Otherwise, it seems best to stop trying to paint this as something
that's simple to do and rip the bandaid off and remove this long
deprecated method, so we can have some sane documentation for the
software writers who are writing backup solutions for PG.

Thanks,

Stephen

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
Bruce Momjian
Date:
On Tue, Jun 30, 2020 at 07:32:47PM -0400, Stephen Frost wrote:
> Greetings,
> 
> * Bruce Momjian (bruce@momjian.us) wrote:
> > On Tue, Jun 30, 2020 at 07:10:46PM -0400, Stephen Frost wrote:
> > > De-deprecating the exclusive backup mode and returning that info at the
> > > start and telling users "put this anywhere EXCEPT this specific place,
> > > and make sure you DO put these files in that exact place when you
> > > restore" would perhaps be closer to what folks who really, really, want
> > > to do snapshot-based backups would want... if that ends up being a net
> > 
> > Agreed.
> 
> Alright, in that case I think we need a victim^Wvolunteer to step up to
> rewrite the documentation for backup for these two different methods,
> with very clear instructions on how to do this properly (which really
> should also include removing the particularly bad 'cp' command usage).
> 
> Maybe I'm wrong, but as I recall David already tried to do that once and
> I doubt he's going to be interested in another round of that.
> 
> Otherwise, it seems best to stop trying to paint this as something
> that's simple to do and rip the bandaid off and remove this long
> deprecated method, so we can have some sane documentation for the
> software writers who are writing backup solutions for PG.

I can work on it but I don't think I know all the details.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EnterpriseDB                             https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee




Re: Remove Deprecated Exclusive Backup Mode

From
Laurenz Albe
Date:
On Tue, 2020-06-30 at 17:38 -0400, David Steele wrote:
> Rebased patch is attached.

I remain -1 on removing the exclusive backup API.

People who don't mind manual intervention when PostgreSQL doesn't
start after a crash in backup mode could safely use it.

We have been over this, and there is not much point in re-opening
the discussion.  I just had to speak up.

> Lastly, there is some concern about getting the backup label too late 
> when doing snapshots or using traditional backup software. It would 
> certainly be possible to return the backup_label and tablespace_map from 
> pg_start_backup() and let the user make the choice about what to do with 
> them. Any of these methods will require some scripting so I don't see 
> why the files couldn't be written as, for example, backup_label.snap and 
> tablespace_map.snap and then renamed by the restore script. The patch 
> does not currently make this change, but it could be added pretty easily 
> if that overcomes this objection.

That would be interesting improvements that would make the non-exclusive
backup API easier to use.

Yours,
Laurenz Albe




Re: Remove Deprecated Exclusive Backup Mode

From
Stephen Frost
Date:
Greetings,

* Laurenz Albe (laurenz.albe@cybertec.at) wrote:
> On Tue, 2020-06-30 at 17:38 -0400, David Steele wrote:
> > Rebased patch is attached.
>
> I remain -1 on removing the exclusive backup API.

It's still deprecated and I'm certainly against removing that status
until/unless someone actually fixes it (including documentation), and if
we're not going to do that then we should remove it.

> People who don't mind manual intervention when PostgreSQL doesn't
> start after a crash in backup mode could safely use it.

I disagree that that's really appropriate.

> We have been over this, and there is not much point in re-opening
> the discussion.  I just had to speak up.

It's not ok to just leave it as deprecated forever.

> > Lastly, there is some concern about getting the backup label too late
> > when doing snapshots or using traditional backup software. It would
> > certainly be possible to return the backup_label and tablespace_map from
> > pg_start_backup() and let the user make the choice about what to do with
> > them. Any of these methods will require some scripting so I don't see
> > why the files couldn't be written as, for example, backup_label.snap and
> > tablespace_map.snap and then renamed by the restore script. The patch
> > does not currently make this change, but it could be added pretty easily
> > if that overcomes this objection.
>
> That would be interesting improvements that would make the non-exclusive
> backup API easier to use.

This would presumably be for the exclusive API as a way to make it not
completely broken, maybe.

If we wanted to try and make this work in a non-exclusive manner then
we'd need to do something like have the user save some info out at
pg_start_backup time that they then provide at pg_stop_backup time, so
we can match up the specific backup and write the appropriate WAL
message.

Thanks,

Stephen

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
Laurenz Albe
Date:
On Wed, 2020-07-01 at 08:47 -0400, Stephen Frost wrote:
> > I remain -1 on removing the exclusive backup API.
> 
> It's still deprecated and I'm certainly against removing that status
> until/unless someone actually fixes it (including documentation), and if
> we're not going to do that then we should remove it.

Well, it doesn't need fixing, since it is working just fine (for certain
values of "just fine").

I agree with the need to document the problems better.

Yours,
Laurenz Albe




Re: Remove Deprecated Exclusive Backup Mode

From
Stephen Frost
Date:
Greetings,

* Laurenz Albe (laurenz.albe@cybertec.at) wrote:
> On Wed, 2020-07-01 at 08:47 -0400, Stephen Frost wrote:
> > > I remain -1 on removing the exclusive backup API.
> >
> > It's still deprecated and I'm certainly against removing that status
> > until/unless someone actually fixes it (including documentation), and if
> > we're not going to do that then we should remove it.
>
> Well, it doesn't need fixing, since it is working just fine (for certain
> values of "just fine").

imv it's a disservice to our users to claim it's working when there's a
myriad of problems with it that we only allude to in the documentation,
some of which simply have no way of being addressed by a user.

Perhaps if we remove the landmines we can say it works, but imv it
doesn't today.  I suspect we'll simply not get to a point where we agree
on this, but if you'd like to propose a patch that fixes the issues and
the documentation, that'd be a great direction to take this.  If that's
not going to happen, then I'm in favor of removing it.  I really dislike
these half-measures and keeping deprecated things that complicate our
code and our documentation forever.

Thanks,

Stephen

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
Magnus Hagander
Date:


On Wed, Jul 1, 2020 at 2:47 PM Stephen Frost <sfrost@snowman.net> wrote:

This would presumably be for the exclusive API as a way to make it not
completely broken, maybe.

If we wanted to try and make this work in a non-exclusive manner then
we'd need to do something like have the user save some info out at
pg_start_backup time that they then provide at pg_stop_backup time, so
we can match up the specific backup and write the appropriate WAL
message.

We don't even need to make it an exclusive mode -- we can allow *nonexclusive* backups but remove the requirement to run start and stop backup in the same connection, which I believe is the problem that people have with the exclusive mode. So how about something like this:

1. Make pg_start_backup() return the backup label file as well. We can add an extra column to the output without breaking backwards compatibility. And since we have all the information for the file at pg_start_backup() time, the user can then write that into the backup. We clearly document that this should *not* be written as "backup_label" in the data directory. We can even define what it should be instead.

2. Make pg_start_backup() also return a "cookie" value with a unique identifier if asked to run in "disconnectable mode". Store this identifier in shared state somewhere in the backend.

3. Make a version of pg_stop_backup that takes this cookie as a parameter, and that way supports stopping a "disconnectable backup".

4. Scream loudly in the logs if we shut down postgres with any open "disconnectable backups" in progress. (to let people know they screwed up)

5. Perhaps provide a row in pg_stat_progress_basebackup, or in it's own view, to show these "disconnectable mode backups"?
5b. In fact, maybe provide something like that for the current non-exclusive ones as well, in case people have hung sessions.
 

The weak spot in this is still the "don't write it as backup label", but we can document that. And that would allow us to do non-exclusive base backups without requiring maintaining the connection. And we can completely get rid of the exclusive ones.

--

Re: Remove Deprecated Exclusive Backup Mode

From
Stephen Frost
Date:
Greetings,

* Magnus Hagander (magnus@hagander.net) wrote:
> On Wed, Jul 1, 2020 at 2:47 PM Stephen Frost <sfrost@snowman.net> wrote:
> > This would presumably be for the exclusive API as a way to make it not
> > completely broken, maybe.
> >
> > If we wanted to try and make this work in a non-exclusive manner then
> > we'd need to do something like have the user save some info out at
> > pg_start_backup time that they then provide at pg_stop_backup time, so
> > we can match up the specific backup and write the appropriate WAL
> > message.
>
> We don't even need to make it an exclusive mode -- we can allow
> *nonexclusive* backups but remove the requirement to run start and stop
> backup in the same connection, which I believe is the problem that people
> have with the exclusive mode. So how about something like this:
>
> 1. Make pg_start_backup() return the backup label file as well. We can add
> an extra column to the output without breaking backwards compatibility. And
> since we have all the information for the file at pg_start_backup() time,
> the user can then write that into the backup. We clearly document that this
> should *not* be written as "backup_label" in the data directory. We can
> even define what it should be instead.
>
> 2. Make pg_start_backup() also return a "cookie" value with a
> unique identifier if asked to run in "disconnectable mode". Store this
> identifier in shared state somewhere in the backend.

Seems like we could possibly just make this be the WAL position the
backup started at, since we use that as the finishing location...?  I
get that users could screw up passing the value back in and get a
corrupted backup, but it'd avoid us having to stick anything in shared
memory, wouldn't it?

> 3. Make a version of pg_stop_backup that takes this cookie as a parameter,
> and that way supports stopping a "disconnectable backup".
>
> 4. Scream loudly in the logs if we shut down postgres with any open
> "disconnectable backups" in progress. (to let people know they screwed up)

I'm less sure how useful that'd be, but ok.

> 5. Perhaps provide a row in pg_stat_progress_basebackup, or in it's own
> view, to show these "disconnectable mode backups"?
> 5b. In fact, maybe provide something like that for the current
> non-exclusive ones as well, in case people have hung sessions.

Ideally we'd provide a way for external backup tools to update said
progress with information of their own like the % complete, if they wish
to...

> The weak spot in this is still the "don't write it as backup label", but we
> can document that. And that would allow us to do non-exclusive base backups
> without requiring maintaining the connection. And we can completely get rid
> of the exclusive ones.

If this helps us get rid of exclusive backup mode then that's certainly
helpful, though inventing yet another method of doing backup makes me
cringe to think of the documentation complexity since it doesn't seem
like we'd really be reducing that.  Now, if we also removed the existing
non-exclusive method and then had a *single* backup method with clear
documentation as to how to use it properly, that'd be a marked
improvement overall.

Thanks,

Stephen

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
Magnus Hagander
Date:


On Wed, Jul 1, 2020 at 6:29 PM Stephen Frost <sfrost@snowman.net> wrote:
Greetings,

* Magnus Hagander (magnus@hagander.net) wrote:
> On Wed, Jul 1, 2020 at 2:47 PM Stephen Frost <sfrost@snowman.net> wrote:
> > This would presumably be for the exclusive API as a way to make it not
> > completely broken, maybe.
> >
> > If we wanted to try and make this work in a non-exclusive manner then
> > we'd need to do something like have the user save some info out at
> > pg_start_backup time that they then provide at pg_stop_backup time, so
> > we can match up the specific backup and write the appropriate WAL
> > message.
>
> We don't even need to make it an exclusive mode -- we can allow
> *nonexclusive* backups but remove the requirement to run start and stop
> backup in the same connection, which I believe is the problem that people
> have with the exclusive mode. So how about something like this:
>
> 1. Make pg_start_backup() return the backup label file as well. We can add
> an extra column to the output without breaking backwards compatibility. And
> since we have all the information for the file at pg_start_backup() time,
> the user can then write that into the backup. We clearly document that this
> should *not* be written as "backup_label" in the data directory. We can
> even define what it should be instead.
>
> 2. Make pg_start_backup() also return a "cookie" value with a
> unique identifier if asked to run in "disconnectable mode". Store this
> identifier in shared state somewhere in the backend.

Seems like we could possibly just make this be the WAL position the
backup started at, since we use that as the finishing location...?  I
get that users could screw up passing the value back in and get a
corrupted backup, but it'd avoid us having to stick anything in shared
memory, wouldn't it?

Or, how about we require them to provide the backup label contents in its entirety? Which I believe does contain that WAL portion?  The downside of that is that it would be multiline which might screw with shellscripts.

We would still need to stick *something* in there so we can keep track of when it's done. You should only be able to stop each backup once for example.  Otherwise, you'll end up with XLogCtl->Insert.nonExclusiveBackups being wrong if someone calls stop twice with the same cookie/wal location.

 
> 3. Make a version of pg_stop_backup that takes this cookie as a parameter,
> and that way supports stopping a "disconnectable backup".
>
> 4. Scream loudly in the logs if we shut down postgres with any open
> "disconnectable backups" in progress. (to let people know they screwed up)

I'm less sure how useful that'd be, but ok.

> 5. Perhaps provide a row in pg_stat_progress_basebackup, or in it's own
> view, to show these "disconnectable mode backups"?
> 5b. In fact, maybe provide something like that for the current
> non-exclusive ones as well, in case people have hung sessions.

Ideally we'd provide a way for external backup tools to update said
progress with information of their own like the % complete, if they wish
to...

That's definitely moving the goalposts on this several miles :) Not saying that wouldn't be nice, but let's keep those separate :P


> The weak spot in this is still the "don't write it as backup label", but we
> can document that. And that would allow us to do non-exclusive base backups
> without requiring maintaining the connection. And we can completely get rid
> of the exclusive ones.

If this helps us get rid of exclusive backup mode then that's certainly
helpful, though inventing yet another method of doing backup makes me
cringe to think of the documentation complexity since it doesn't seem
like we'd really be reducing that.  Now, if we also removed the existing
non-exclusive method and then had a *single* backup method with clear
documentation as to how to use it properly, that'd be a marked
improvement overall.

We could do that, but the passing of cookies or whatever is completely unnecessary with the current method so it would complicate that one...


--

Re: Remove Deprecated Exclusive Backup Mode

From
Stephen Frost
Date:
Greetings,

* Magnus Hagander (magnus@hagander.net) wrote:
> On Wed, Jul 1, 2020 at 6:29 PM Stephen Frost <sfrost@snowman.net> wrote:
> > * Magnus Hagander (magnus@hagander.net) wrote:
> > > On Wed, Jul 1, 2020 at 2:47 PM Stephen Frost <sfrost@snowman.net> wrote:
> > > > This would presumably be for the exclusive API as a way to make it not
> > > > completely broken, maybe.
> > > >
> > > > If we wanted to try and make this work in a non-exclusive manner then
> > > > we'd need to do something like have the user save some info out at
> > > > pg_start_backup time that they then provide at pg_stop_backup time, so
> > > > we can match up the specific backup and write the appropriate WAL
> > > > message.
> > >
> > > We don't even need to make it an exclusive mode -- we can allow
> > > *nonexclusive* backups but remove the requirement to run start and stop
> > > backup in the same connection, which I believe is the problem that people
> > > have with the exclusive mode. So how about something like this:
> > >
> > > 1. Make pg_start_backup() return the backup label file as well. We can
> > add
> > > an extra column to the output without breaking backwards compatibility.
> > And
> > > since we have all the information for the file at pg_start_backup() time,
> > > the user can then write that into the backup. We clearly document that
> > this
> > > should *not* be written as "backup_label" in the data directory. We can
> > > even define what it should be instead.
> > >
> > > 2. Make pg_start_backup() also return a "cookie" value with a
> > > unique identifier if asked to run in "disconnectable mode". Store this
> > > identifier in shared state somewhere in the backend.
> >
> > Seems like we could possibly just make this be the WAL position the
> > backup started at, since we use that as the finishing location...?  I
> > get that users could screw up passing the value back in and get a
> > corrupted backup, but it'd avoid us having to stick anything in shared
> > memory, wouldn't it?
>
> Or, how about we require them to provide the backup label contents in its
> entirety? Which I believe does contain that WAL portion?  The downside of
> that is that it would be multiline which might screw with shellscripts.

This is getting a bit beyond the pale imv if we're talking about trying
to cater to software that can't manage to handle a multi-line result or
multi-line argument to a function.

> We would still need to stick *something* in there so we can keep track of
> when it's done. You should only be able to stop each backup once for
> example.  Otherwise, you'll end up with XLogCtl->Insert.nonExclusiveBackups
> being wrong if someone calls stop twice with the same cookie/wal location.

Hrmpf, yeah, I suppose we can't really avoid having to track it somehow,
in which case, yeah, we might as well just have some unique ID for each
that we then look up in shared memory (though that unique ID could be
the starting WAL if we wanted, but I don't suppose it particularly
matters if it's that or something else).  We'd need to have some
parameter like max_concurrent_backups or something tho, which I was
hoping to avoid.

> > > 5. Perhaps provide a row in pg_stat_progress_basebackup, or in it's own
> > > view, to show these "disconnectable mode backups"?
> > > 5b. In fact, maybe provide something like that for the current
> > > non-exclusive ones as well, in case people have hung sessions.
> >
> > Ideally we'd provide a way for external backup tools to update said
> > progress with information of their own like the % complete, if they wish
> > to...
>
> That's definitely moving the goalposts on this several miles :) Not saying
> that wouldn't be nice, but let's keep those separate :P

Well, you brought up progress monitoring. :)

> > The weak spot in this is still the "don't write it as backup label", but
> > we
> > > can document that. And that would allow us to do non-exclusive base
> > backups
> > > without requiring maintaining the connection. And we can completely get
> > rid
> > > of the exclusive ones.
> >
> > If this helps us get rid of exclusive backup mode then that's certainly
> > helpful, though inventing yet another method of doing backup makes me
> > cringe to think of the documentation complexity since it doesn't seem
> > like we'd really be reducing that.  Now, if we also removed the existing
> > non-exclusive method and then had a *single* backup method with clear
> > documentation as to how to use it properly, that'd be a marked
> > improvement overall.
>
> We could do that, but the passing of cookies or whatever is completely
> unnecessary with the current method so it would complicate that one...

No, it wouldn't complicate "that one" since "that one" would be gone.
Yes, it'd be a slightly more complicated process than what they handle
currently but I have confidence that everything that's, today, arranging
to keep a PG connection open would be able to handle instead (or, in
addition to, if they wanted to continue having that connection anyway)
passing the cookie through from start->stop backup, and I do think it
would massively simplify the documentation which would make it well
worth it.

Thanks,

Stephen

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
Robert Haas
Date:
On Wed, Jul 1, 2020 at 11:48 AM Magnus Hagander <magnus@hagander.net> wrote:
> We don't even need to make it an exclusive mode -- we can allow *nonexclusive* backups but remove the requirement to
runstart and stop backup in the same connection, which I believe is the problem that people have with the exclusive
mode.

Well, one of the big advantages of the non-exclusive mode is that the
backup automatically is terminated if the connection is closed, which
removes the possibility of leaving a backup in progress for a long
time inadvertently. I think that back-tracking on that would probably
be a step in the wrong direction.

I agree with what I understand to be Bruce's position -- non-exclusive
backup mode is better than exclusive mode backup mode, but exclusive
backup mode is not broken and doesn't need to be removed. I think it's
very sensible to continue to support exclusive backup mode but to
encourage users to use non-exclusive backup mode instead. I find
Stephen's position that exclusive backup mode cannot be used safely to
be unsupported by any compelling evidence, and reviewing the thread, I
find that there are a LOT of people who seem to agree.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Remove Deprecated Exclusive Backup Mode

From
Bruce Momjian
Date:
On Wed, Jul  1, 2020 at 01:23:21PM -0400, Stephen Frost wrote:
> > Or, how about we require them to provide the backup label contents in its
> > entirety? Which I believe does contain that WAL portion?  The downside of
> > that is that it would be multiline which might screw with shellscripts.
> 
> This is getting a bit beyond the pale imv if we're talking about trying
> to cater to software that can't manage to handle a multi-line result or
> multi-line argument to a function.

Seems to work for me in bash:

    $ X='a
    > b
    > c'

    $ echo "$X"
    a
    b
    c

    $ echo $X
    a b c

You need the double quotes around the variable name.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EnterpriseDB                             https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee




Re: Remove Deprecated Exclusive Backup Mode

From
Magnus Hagander
Date:
On Wed, Jul 1, 2020 at 9:43 PM Robert Haas <robertmhaas@gmail.com> wrote:
On Wed, Jul 1, 2020 at 11:48 AM Magnus Hagander <magnus@hagander.net> wrote:
> We don't even need to make it an exclusive mode -- we can allow *nonexclusive* backups but remove the requirement to run start and stop backup in the same connection, which I believe is the problem that people have with the exclusive mode.

Well, one of the big advantages of the non-exclusive mode is that the
backup automatically is terminated if the connection is closed, which
removes the possibility of leaving a backup in progress for a long
time inadvertently. I think that back-tracking on that would probably
be a step in the wrong direction.

I am in no way suggesting we should do this to replace the nonexclusive backups the way they are now. They would still retain this property, and should be encouraged to be the default way.

I am only suggesting  this as an alternative to the exclusive ones, which hopefully can get rid of most of the drawbacks of the exclusive mode ones. Yes, it would still lack the ability to  automatically terminate  (except it *would* do such a termination if the server restarted -- instead of like the current exclusive mode which just breaks the server then).


I agree with what I understand to be Bruce's position -- non-exclusive
backup mode is better than exclusive mode backup mode, but exclusive
backup mode is not broken and doesn't need to be removed. I think it's
very sensible to continue to support exclusive backup mode but to
encourage users to use non-exclusive backup mode instead. I find
Stephen's position that exclusive backup mode cannot be used safely to
be unsupported by any compelling evidence, and reviewing the thread, I
find that there are a LOT of people who seem to agree.


I think that is what I'm suggesting -- that is, I'm suggesting a way to solve the actual problems that people have with non-exclusive mode, to the point of making it possible ot run those without the drawbacks.

As far as I've seen, the one thing that people have problems with in the exclusive mode backups are precisely the fact that they have to keep a persistent conneciton open, and thus it cannot work together with backup software that is limited to only supporting running a pre- and a post script. 

Something like I have suggested here is to solve *that* problem. I don't think anybody actually explicitly wants "exclusive backups" -- they want a backup solution that plugs into their world of pre/post scripts. And if we can make that one work in a safer way than the current exclusive backups, ohw is that not an improvement?

//Magnus

Re: Remove Deprecated Exclusive Backup Mode

From
Stephen Frost
Date:
Greetings,

* Robert Haas (robertmhaas@gmail.com) wrote:
> On Wed, Jul 1, 2020 at 11:48 AM Magnus Hagander <magnus@hagander.net> wrote:
> > We don't even need to make it an exclusive mode -- we can allow *nonexclusive* backups but remove the requirement
torun start and stop backup in the same connection, which I believe is the problem that people have with the exclusive
mode.
>
> Well, one of the big advantages of the non-exclusive mode is that the
> backup automatically is terminated if the connection is closed, which
> removes the possibility of leaving a backup in progress for a long
> time inadvertently. I think that back-tracking on that would probably
> be a step in the wrong direction.

What's the actual downside from having a forgotten backup being left
around for an extended period of time..?

> I agree with what I understand to be Bruce's position -- non-exclusive
> backup mode is better than exclusive mode backup mode, but exclusive
> backup mode is not broken and doesn't need to be removed. I think it's
> very sensible to continue to support exclusive backup mode but to
> encourage users to use non-exclusive backup mode instead. I find
> Stephen's position that exclusive backup mode cannot be used safely to
> be unsupported by any compelling evidence, and reviewing the thread, I
> find that there are a LOT of people who seem to agree.

I didn't realize there was a question regarding what happens if the DB
server ends up crashing during an exclusive backup.  That hardly seems
to be safe if you care about having your database start up after a
crash, which seems like an important thing that we generally consider of
value considering the amount of effort we put into it.

I agree that there seems to be a number of people who want to ignore
that the system breaks down in that situation in order to keep exclusive
backup around, but, unlike the discussion being had between Magnus and
I, that doesn't actually move us forward at all in terms of providing a
good solution or being able to write the documentation in a sane way for
people who aren't PG hackers to be able to follow.

Thanks,

Stephen

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
Robert Haas
Date:
On Wed, Jul 1, 2020 at 3:50 PM Magnus Hagander <magnus@hagander.net> wrote:
> As far as I've seen, the one thing that people have problems with in the exclusive mode backups are precisely the
factthat they have to keep a persistent conneciton open, and thus it cannot work together with backup software that is
limitedto only supporting running a pre- and a post script. 
>
> Something like I have suggested here is to solve *that* problem. I don't think anybody actually explicitly wants
"exclusivebackups" -- they want a backup solution that plugs into their world of pre/post scripts. And if we can make
thatone work in a safer way than the current exclusive backups, ohw is that not an improvement? 

Yeah, I guess that's a pretty fair point. I have to confess to having
somewhat limited enthusiasm for adding a third mode here, but it might
be worth it.

It seems pretty well inevitable to me that people are going to forget
to end them. I am not sure exactly what the consequences of that will
be, but if for example there's a limited number of shared memory slots
to store information about these backups, then if you leak any, you'll
eventually run out of slots and your backups will start failing. I
feel like that's a going to happen to about 75% of the people who try
to use this new backup mode at some point in time, but maybe I'm a
pessimist.[1]

If we could jigger things so that you don't need to stop the backup at
all, you only start it, and whether you ever finish copying everything
is something about which the system need not know or care, that would
be a lot nicer. I'm not sure I see how to do that, though.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

[1] OK, that's not really a "maybe".



Re: Remove Deprecated Exclusive Backup Mode

From
Magnus Hagander
Date:
On Wed, Jul 1, 2020 at 9:59 PM Robert Haas <robertmhaas@gmail.com> wrote:
On Wed, Jul 1, 2020 at 3:50 PM Magnus Hagander <magnus@hagander.net> wrote:
> As far as I've seen, the one thing that people have problems with in the exclusive mode backups are precisely the fact that they have to keep a persistent conneciton open, and thus it cannot work together with backup software that is limited to only supporting running a pre- and a post script.
>
> Something like I have suggested here is to solve *that* problem. I don't think anybody actually explicitly wants "exclusive backups" -- they want a backup solution that plugs into their world of pre/post scripts. And if we can make that one work in a safer way than the current exclusive backups, ohw is that not an improvement?

Yeah, I guess that's a pretty fair point. I have to confess to having
somewhat limited enthusiasm for adding a third mode here, but it might
be worth it.

The intention is definitely not to have 3 modes. If we build this mode the intention is that it is strictly better than exclusive mode, and thus exclusive mode can finally be removed. (Whereas the nonexclusive mode is better in many ways, but not all)


It seems pretty well inevitable to me that people are going to forget
to end them. I am not sure exactly what the consequences of that will
be, but if for example there's a limited number of shared memory slots
to store information about these backups, then if you leak any, you'll
eventually run out of slots and your backups will start failing. I
feel like that's a going to happen to about 75% of the people who try
to use this new backup mode at some point in time, but maybe I'm a
pessimist.[1]

Agreed. yet this is still strictly better or equal than the current exclusive backups, which fail after forgetting that *once*.


If we could jigger things so that you don't need to stop the backup at
all, you only start it, and whether you ever finish copying everything
is something about which the system need not know or care, that would
be a lot nicer. I'm not sure I see how to do that, though.


Yeah, I'm not sure about that one either.

//Magnus
 

Re: Remove Deprecated Exclusive Backup Mode

From
David Steele
Date:
On 7/1/20 3:58 PM, Robert Haas wrote:
> On Wed, Jul 1, 2020 at 3:50 PM Magnus Hagander <magnus@hagander.net> wrote:
>> As far as I've seen, the one thing that people have problems with in the exclusive mode backups are precisely the
factthat they have to keep a persistent conneciton open, and thus it cannot work together with backup software that is
limitedto only supporting running a pre- and a post script.
 
>>
>> Something like I have suggested here is to solve *that* problem. I don't think anybody actually explicitly wants
"exclusivebackups" -- they want a backup solution that plugs into their world of pre/post scripts. And if we can make
thatone work in a safer way than the current exclusive backups, ohw is that not an improvement?
 
> 
> Yeah, I guess that's a pretty fair point. I have to confess to having
> somewhat limited enthusiasm for adding a third mode here, but it might
> be worth it.
> 
> It seems pretty well inevitable to me that people are going to forget
> to end them. I am not sure exactly what the consequences of that will
> be, but if for example there's a limited number of shared memory slots
> to store information about these backups, then if you leak any, you'll
> eventually run out of slots and your backups will start failing. I
> feel like that's a going to happen to about 75% of the people who try
> to use this new backup mode at some point in time, but maybe I'm a
> pessimist.[1]

I did consider going down this road and I came to more or less the same 
conclusion. Backups would accumulate until we just ran out of space 
and/or started throwing errors at some limit. Hardly very attractive.

Also, full pages writes will be left on indefinitely.

> If we could jigger things so that you don't need to stop the backup at
> all, you only start it, and whether you ever finish copying everything
> is something about which the system need not know or care, that would
> be a lot nicer. I'm not sure I see how to do that, though.

Well, the only thing pg_stop_backup() *really* needs to know is the 
starting WAL position. pg_start_backup() gets that info so if it passes 
it back to pg_stop_backup() that could be enough. Or as was proposed 
above, it just passes the backup_label back to pg_stop_backup() for 
parsing. To write the .backup file to WAL (which I personally think is 
pretty useless) you'd need backup_label.

The issue I have not been able to work around is that full page writes 
need to be on for backups and they won't get turned back off if you 
don't end all the backups that get started.

Here's a thought. What if we just stored the oldest starting LSN and a 
count of how many backups have been requested. When the backup ends it 
checks that backup count is > 0 and starting LSN is <= its starting LSN. 
If not, it throws an error. When backups go to 0 FPWs are turned off if 
they were off before the first backup.

That way, we could have a single function to cancel all backups in 
progress. Even if a new one started, one ending that was started prior 
to the cancel would know that it could not end successfully and error.

For the vast majority of people who have full page writes on, it 
wouldn't really matter how many backups were running so we could 
probably just skip all of that.

If we wanted to really make it more foolproof we could add a checksum to 
the backup label to make sure the user doesn't damage it before passing 
it back.

Regards,
-- 
-David
david@pgmasters.net



Re: Remove Deprecated Exclusive Backup Mode

From
Bruce Momjian
Date:
On Wed, Jul  1, 2020 at 10:24:15PM +0200, Magnus Hagander wrote:
> On Wed, Jul 1, 2020 at 9:59 PM Robert Haas <robertmhaas@gmail.com> wrote:
>     Yeah, I guess that's a pretty fair point. I have to confess to having
>     somewhat limited enthusiasm for adding a third mode here, but it might
>     be worth it.
> The intention is definitely not to have 3 modes. If we build this mode the
> intention is that it is strictly better than exclusive mode, and thus exclusive
> mode can finally be removed. (Whereas the nonexclusive mode is better in many
> ways, but not all)

Agreed.  I don't think three modes would help anyone.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EnterpriseDB                             https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee




Re: Remove Deprecated Exclusive Backup Mode

From
Bruce Momjian
Date:
On Wed, Jul  1, 2020 at 04:27:58PM -0400, David Steele wrote:
> On 7/1/20 3:58 PM, Robert Haas wrote:
> > If we could jigger things so that you don't need to stop the backup at
> > all, you only start it, and whether you ever finish copying everything
> > is something about which the system need not know or care, that would
> > be a lot nicer. I'm not sure I see how to do that, though.
> 
> Well, the only thing pg_stop_backup() *really* needs to know is the starting
> WAL position. pg_start_backup() gets that info so if it passes it back to
> pg_stop_backup() that could be enough. Or as was proposed above, it just

Doesn't pg_start_backup already return this?

    SELECT pg_start_backup('test');
     pg_start_backup
    -----------------
-->     0/2000028

> Here's a thought. What if we just stored the oldest starting LSN and a count
> of how many backups have been requested. When the backup ends it checks that
> backup count is > 0 and starting LSN is <= its starting LSN. If not, it
> throws an error. When backups go to 0 FPWs are turned off if they were off
> before the first backup.

Can't we just error out of an exclusive-style backup is requested to
start and a previous exclusive-style backup has not been stopped?

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EnterpriseDB                             https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee




Re: Remove Deprecated Exclusive Backup Mode

From
Robert Haas
Date:
On Wed, Jul 1, 2020 at 4:27 PM David Steele <david@pgmasters.net> wrote:
> Well, the only thing pg_stop_backup() *really* needs to know is the
> starting WAL position. pg_start_backup() gets that info so if it passes
> it back to pg_stop_backup() that could be enough. Or as was proposed
> above, it just passes the backup_label back to pg_stop_backup() for
> parsing. To write the .backup file to WAL (which I personally think is
> pretty useless) you'd need backup_label.
>
> The issue I have not been able to work around is that full page writes
> need to be on for backups and they won't get turned back off if you
> don't end all the backups that get started.
>
> Here's a thought. What if we just stored the oldest starting LSN and a
> count of how many backups have been requested. When the backup ends it
> checks that backup count is > 0 and starting LSN is <= its starting LSN.
> If not, it throws an error. When backups go to 0 FPWs are turned off if
> they were off before the first backup.
>
> That way, we could have a single function to cancel all backups in
> progress. Even if a new one started, one ending that was started prior
> to the cancel would know that it could not end successfully and error.
>
> For the vast majority of people who have full page writes on, it
> wouldn't really matter how many backups were running so we could
> probably just skip all of that.

Yeah, I like this line of thinking. Not sure about the details.

> If we wanted to really make it more foolproof we could add a checksum to
> the backup label to make sure the user doesn't damage it before passing
> it back.

I don't see a whole lot of point in that, honestly. It doesn't make it
impossible to damage the backup label; it just makes it harder. And
that doesn't seem worth much.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Remove Deprecated Exclusive Backup Mode

From
David Steele
Date:
On 7/1/20 4:27 PM, David Steele wrote:
> 
> Here's a thought. What if we just stored the oldest starting LSN and a 
> count of how many backups have been requested. When the backup ends it 
> checks that backup count is > 0 and starting LSN is <= its starting LSN. 
> If not, it throws an error. When backups go to 0 FPWs are turned off if 
> they were off before the first backup.

To be clear, the min starting LSN only needs to be reset when 
incrementing total running backups from 0 to 1. What we are recording 
here is the LSN since FPWs were enabled. As long as it is <= the start 
LSN of the backup when it ends then all is well (I think).

Regards,
-- 
-David
david@pgmasters.net



Re: Remove Deprecated Exclusive Backup Mode

From
Robert Haas
Date:
On Wed, Jul 1, 2020 at 4:29 PM Bruce Momjian <bruce@momjian.us> wrote:
> Agreed.  I don't think three modes would help anyone.

Well, I think that if and when we remove the existing exclusive mode,
we're going to break a bunch of people's backup scripts. I think it's
appropriate to do that eventually, but I'm not in a big rush. Long
deprecation periods are a feature, not a bug.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Remove Deprecated Exclusive Backup Mode

From
Magnus Hagander
Date:
On Wed, Jul 1, 2020 at 10:28 PM David Steele <david@pgmasters.net> wrote:
On 7/1/20 3:58 PM, Robert Haas wrote:
> On Wed, Jul 1, 2020 at 3:50 PM Magnus Hagander <magnus@hagander.net> wrote:
>> As far as I've seen, the one thing that people have problems with in the exclusive mode backups are precisely the fact that they have to keep a persistent conneciton open, and thus it cannot work together with backup software that is limited to only supporting running a pre- and a post script.
>>
>> Something like I have suggested here is to solve *that* problem. I don't think anybody actually explicitly wants "exclusive backups" -- they want a backup solution that plugs into their world of pre/post scripts. And if we can make that one work in a safer way than the current exclusive backups, ohw is that not an improvement?
>
> Yeah, I guess that's a pretty fair point. I have to confess to having
> somewhat limited enthusiasm for adding a third mode here, but it might
> be worth it.
>
> It seems pretty well inevitable to me that people are going to forget
> to end them. I am not sure exactly what the consequences of that will
> be, but if for example there's a limited number of shared memory slots
> to store information about these backups, then if you leak any, you'll
> eventually run out of slots and your backups will start failing. I
> feel like that's a going to happen to about 75% of the people who try
> to use this new backup mode at some point in time, but maybe I'm a
> pessimist.[1]

I did consider going down this road and I came to more or less the same
conclusion. Backups would accumulate until we just ran out of space
and/or started throwing errors at some limit. Hardly very attractive.

Also, full pages writes will be left on indefinitely.

> If we could jigger things so that you don't need to stop the backup at
> all, you only start it, and whether you ever finish copying everything
> is something about which the system need not know or care, that would
> be a lot nicer. I'm not sure I see how to do that, though.

Well, the only thing pg_stop_backup() *really* needs to know is the
starting WAL position. pg_start_backup() gets that info so if it passes
it back to pg_stop_backup() that could be enough. Or as was proposed
above, it just passes the backup_label back to pg_stop_backup() for
parsing. To write the .backup file to WAL (which I personally think is
pretty useless) you'd need backup_label.

The issue I have not been able to work around is that full page writes
need to be on for backups and they won't get turned back off if you
don't end all the backups that get started.

Here's a thought. What if we just stored the oldest starting LSN and a
count of how many backups have been requested. When the backup ends it
checks that backup count is > 0 and starting LSN is <= its starting LSN.
If not, it throws an error. When backups go to 0 FPWs are turned off if
they were off before the first backup.

I guess the weak spot of that one is if some script does stop without doing start first, it will break somebody else's backup. (And yes, I've seen scripts make this mistake many times -- it equally breaks the exclusive backups in the current system...)

And don't we need the combination of the start/stop location for the history file?

--

Re: Remove Deprecated Exclusive Backup Mode

From
Stephen Frost
Date:
Greetings,

* Robert Haas (robertmhaas@gmail.com) wrote:
> On Wed, Jul 1, 2020 at 4:29 PM Bruce Momjian <bruce@momjian.us> wrote:
> > Agreed.  I don't think three modes would help anyone.
>
> Well, I think that if and when we remove the existing exclusive mode,
> we're going to break a bunch of people's backup scripts. I think it's
> appropriate to do that eventually, but I'm not in a big rush. Long
> deprecation periods are a feature, not a bug.

This is pretty tangential to the overall discussion, but I generally
disagree that deprecation periods beyond what we already, always,
provide thanks to supporting 5 major versions concurrently are actually
a feature or are healthy for the project.

I continue to find it curious that we stress a great deal over (very
likely) poorly written backup scripts that haven't been updated in the
5+? years since exclusive mode was deprecated, but we happily add new
keywords in new major versions and remove columns in our catalog tables
or rename columns or entirely redo how recovery works (breaking every
script out there that performed a restore..) or do any number of other
things that potentially break applications that communicate through the
PG protocol and other parts of the system that very likely have scripts
that were written to work with them.

I'm really of the opinion that we need to stop doing that.

If someone could explain what is so special about *this* part of the
system that we absolutely can't possibly accept any change that might
break user's scripts, and why it's worth all of the angst, maintenance,
ridiculously difficult documentation to understand and hacks (the
interface to pg_start_backup is ridiculously warty because of this), I'd
greatly appreciate it.

Thanks,

Stephen

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
Bruce Momjian
Date:
On Wed, Jul  1, 2020 at 04:36:03PM -0400, Robert Haas wrote:
> On Wed, Jul 1, 2020 at 4:29 PM Bruce Momjian <bruce@momjian.us> wrote:
> > Agreed.  I don't think three modes would help anyone.
> 
> Well, I think that if and when we remove the existing exclusive mode,
> we're going to break a bunch of people's backup scripts. I think it's
> appropriate to do that eventually, but I'm not in a big rush. Long
> deprecation periods are a feature, not a bug.

True, but we have had this marked as deprecated for a while.  Our
Postgres people who love backup technology want this improved, and our
operational people are asking for it to stay, so keeping the ability but
changing the API seems reasonable to me.  I tend to think our "love
backup technology" folks know more than our operational folks, even
though I am in the operational camp.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EnterpriseDB                             https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee




Re: Remove Deprecated Exclusive Backup Mode

From
David Steele
Date:
On 7/1/20 4:39 PM, Magnus Hagander wrote:
> On Wed, Jul 1, 2020 at 10:28 PM David Steele <david@pgmasters.net 
>     Here's a thought. What if we just stored the oldest starting LSN and a
>     count of how many backups have been requested. When the backup ends it
>     checks that backup count is > 0 and starting LSN is <= its starting
>     LSN.
>     If not, it throws an error. When backups go to 0 FPWs are turned off if
>     they were off before the first backup.
> 
> I guess the weak spot of that one is if some script does stop without 
> doing start first, it will break somebody else's backup. (And yes, I've 
> seen scripts make this mistake many times -- it equally breaks the 
> exclusive backups in the current system...)

Well, they'd have to pass in a backup_label with a start LSN >= the min 
LSN or they would just get an error and not decrement the backup count.

The real issue would be if they called pg_stop_backup twice. We might be 
able to stop that with a rolling max stop lsn to keep anyone from 
calling pg_stop_backup() twice.

But yeah, it would be possible to kill somebody else's session with some 
finagling. Still, worse case would be an error'd backup rather than a 
corrupt one.

But really, that's only if FPWs are turned off. We can also do some 
extra validation if the session is left open, which for most software is 
the norm now.

> And don't we need the combination of the start/stop location for the 
> history file?

You mean the .backup file for the WAL? All that needs is the 
backup_label and the stop LSN that's determined in pg_stop_backup(). Am 
I missing something?

Regards,
-- 
-David
david@pgmasters.net



Re: Remove Deprecated Exclusive Backup Mode

From
David Steele
Date:
On 7/1/20 4:32 PM, Bruce Momjian wrote:
> On Wed, Jul  1, 2020 at 04:27:58PM -0400, David Steele wrote:
>> On 7/1/20 3:58 PM, Robert Haas wrote:
>>> If we could jigger things so that you don't need to stop the backup at
>>> all, you only start it, and whether you ever finish copying everything
>>> is something about which the system need not know or care, that would
>>> be a lot nicer. I'm not sure I see how to do that, though.
>>
>> Well, the only thing pg_stop_backup() *really* needs to know is the starting
>> WAL position. pg_start_backup() gets that info so if it passes it back to
>> pg_stop_backup() that could be enough. Or as was proposed above, it just
> 
> Doesn't pg_start_backup already return this?
> 
>     SELECT pg_start_backup('test');
>      pg_start_backup
>     -----------------
> -->     0/2000028

It does, though after looking at the code I think we need the entire 
backup_label. For any of this to be useful, pg_start_backup() *must* 
return backup_label anyway so it can be stored somewhere the backup 
software can find it.

>> Here's a thought. What if we just stored the oldest starting LSN and a count
>> of how many backups have been requested. When the backup ends it checks that
>> backup count is > 0 and starting LSN is <= its starting LSN. If not, it
>> throws an error. When backups go to 0 FPWs are turned off if they were off
>> before the first backup.
> 
> Can't we just error out of an exclusive-style backup is requested to
> start and a previous exclusive-style backup has not been stopped?

That means only allowing one exclusive backup but it's not like we don't 
already have that limitation. But -- I'd rather not even need to specify 
exclusive/non-exclusive when the backup is started. Basically, all 
backups would be non-exclusive but we'd allow pg_stop_backup() to be 
called in a new session (or the same session).

Regards,
-- 
-David
david@pgmasters.net



Re: Remove Deprecated Exclusive Backup Mode

From
Bruce Momjian
Date:
On Wed, Jul  1, 2020 at 05:20:16PM -0400, David Steele wrote:
> On 7/1/20 4:32 PM, Bruce Momjian wrote:
> > On Wed, Jul  1, 2020 at 04:27:58PM -0400, David Steele wrote:
> > > On 7/1/20 3:58 PM, Robert Haas wrote:
> > > > If we could jigger things so that you don't need to stop the backup at
> > > > all, you only start it, and whether you ever finish copying everything
> > > > is something about which the system need not know or care, that would
> > > > be a lot nicer. I'm not sure I see how to do that, though.
> > > 
> > > Well, the only thing pg_stop_backup() *really* needs to know is the starting
> > > WAL position. pg_start_backup() gets that info so if it passes it back to
> > > pg_stop_backup() that could be enough. Or as was proposed above, it just
> > 
> > Doesn't pg_start_backup already return this?
> > 
> >     SELECT pg_start_backup('test');
> >      pg_start_backup
> >     -----------------
> > -->     0/2000028
> 
> It does, though after looking at the code I think we need the entire
> backup_label. For any of this to be useful, pg_start_backup() *must* return
> backup_label anyway so it can be stored somewhere the backup software can
> find it.

Sure, sounds good to me.

> > > Here's a thought. What if we just stored the oldest starting LSN and a count
> > > of how many backups have been requested. When the backup ends it checks that
> > > backup count is > 0 and starting LSN is <= its starting LSN. If not, it
> > > throws an error. When backups go to 0 FPWs are turned off if they were off
> > > before the first backup.
> > 
> > Can't we just error out of an exclusive-style backup is requested to
> > start and a previous exclusive-style backup has not been stopped?
> 
> That means only allowing one exclusive backup but it's not like we don't
> already have that limitation. But -- I'd rather not even need to specify
> exclusive/non-exclusive when the backup is started. Basically, all backups
> would be non-exclusive but we'd allow pg_stop_backup() to be called in a new
> session (or the same session).

OK, however you want.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EnterpriseDB                             https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee




Re: Remove Deprecated Exclusive Backup Mode

From
Magnus Hagander
Date:


On Wed, Jul 1, 2020 at 11:08 PM David Steele <david@pgmasters.net> wrote:
On 7/1/20 4:39 PM, Magnus Hagander wrote:
> On Wed, Jul 1, 2020 at 10:28 PM David Steele <david@pgmasters.net
>     Here's a thought. What if we just stored the oldest starting LSN and a
>     count of how many backups have been requested. When the backup ends it
>     checks that backup count is > 0 and starting LSN is <= its starting
>     LSN.
>     If not, it throws an error. When backups go to 0 FPWs are turned off if
>     they were off before the first backup.
>
> I guess the weak spot of that one is if some script does stop without
> doing start first, it will break somebody else's backup. (And yes, I've
> seen scripts make this mistake many times -- it equally breaks the
> exclusive backups in the current system...)

Well, they'd have to pass in a backup_label with a start LSN >= the min
LSN or they would just get an error and not decrement the backup count.

Oh as long as we're still requiring pg_stop_backup() to pass in something that it received from pg_start_backup() so that we can verify it's the correct one, then that problem wouldn't exist.


The real issue would be if they called pg_stop_backup twice. We might be
able to stop that with a rolling max stop lsn to keep anyone from
calling pg_stop_backup() twice. 

But yeah, it would be possible to kill somebody else's session with some
finagling. Still, worse case would be an error'd backup rather than a
corrupt one.

What about the case of:
Session A - start backup
Session B - stop backup (but A is still running of course)
Session C - start backup
Session A - stop backup

At this point, session A can still stop the backup because there is one running -- but there has been time in between the two when no backup was running. That could lead to Session A getting a corrupt backup, I think -- unless we pass some unique identifier back in pg_stop_backup that matches it up. (And if we do pass that up, then session B running pg_stop_backup() would fail, thus leaving the backup started by A still running.


But really, that's only if FPWs are turned off. We can also do some
extra validation if the session is left open, which for most software is
the norm now.

Session left open should really still be the default, as it's the safest one :) And yes, most backup *software* does it. But the entire reason we want another mode from the current non-exclusive backup is people *not* using one of the ready-made backup solutions.



> And don't we need the combination of the start/stop location for the
> history file?

You mean the .backup file for the WAL? All that needs is the
backup_label and the stop LSN that's determined in pg_stop_backup(). Am
I missing something?

I mean the .backup file in the archive, yes. That one contains both the start and the stop location, timeline and time.
 
--

Re: Remove Deprecated Exclusive Backup Mode

From
Jehan-Guillaume de Rorthais
Date:
On Wed, 1 Jul 2020 15:58:57 -0400
Robert Haas <robertmhaas@gmail.com> wrote:

> On Wed, Jul 1, 2020 at 3:50 PM Magnus Hagander <magnus@hagander.net> wrote:
> > As far as I've seen, the one thing that people have problems with in the
> > exclusive mode backups are precisely the fact that they have to keep a
> > persistent conneciton open, and thus it cannot work together with backup
> > software that is limited to only supporting running a pre- and a post
> > script.
> >
> > Something like I have suggested here is to solve *that* problem. I don't
> > think anybody actually explicitly wants "exclusive backups" -- they want a
> > backup solution that plugs into their world of pre/post scripts. And if we
> > can make that one work in a safer way than the current exclusive backups,
> > ohw is that not an improvement?  
> 
> Yeah, I guess that's a pretty fair point. I have to confess to having
> somewhat limited enthusiasm for adding a third mode here, but it might
> be worth it.
> 
> It seems pretty well inevitable to me that people are going to forget
> to end them.

There's so many way an admin could break their backup process and not notice
it. Even with current nonexclusive backup, a bad written script might creates
bad unrecoverable backups. 

In regard with your concern, monitoring in progress backups might be *one*
answer, from server side. This was "easy" with exclusive backup, we just
monitor the backup_label age and warn if it is older than expected [1]. Using
non-exclusive backup...this is not that easy anymore. And pg_is_in_backup() is
quite misleading if the admin found it without reading doc. Maybe an admin
function to list in progress non-exclusive backup and related backend pid might
be a good start?

With such admin function facilities, client side process could monitor backup
and decide to cancel them based on some internal backup policies/process.

Tools like pg_basebackup (and probably pgbackrest/barman to some extends) still
need the backup to abort on disconnection. Maybe it could flag its session
using the replication protocol or call a new admin function or load a hook on
session-shutdown to keep previous API behavior?

Regards,

[1] https://github.com/OPMDG/check_pgactivity/#user-content-backup_label_age-8.1



Re: Remove Deprecated Exclusive Backup Mode

From
Laurenz Albe
Date:
On Wed, 2020-07-01 at 16:46 -0400, Stephen Frost wrote:
> I continue to find it curious that we stress a great deal over (very
> likely) poorly written backup scripts that haven't been updated in the
> 5+? years since exclusive mode was deprecated, but we happily add new
> keywords in new major versions and remove columns in our catalog tables
> or rename columns or entirely redo how recovery works (breaking every
> script out there that performed a restore..) or do any number of other
> things that potentially break applications that communicate through the
> PG protocol and other parts of the system that very likely have scripts
> that were written to work with them.
> 
> I'm really of the opinion that we need to stop doing that.
> 
> If someone could explain what is so special about *this* part of the
> system that we absolutely can't possibly accept any change that might
> break user's scripts, and why it's worth all of the angst, maintenance,
> ridiculously difficult documentation to understand and hacks (the
> interface to pg_start_backup is ridiculously warty because of this), I'd
> greatly appreciate it.

Easy.  Lots of people write backup scripts, and fewer people write
complicated pg_catalog queries.  We would make way more users unhappy.

Besides, there is nothing "poorly written" about a backup script using
exclusive backup as long as the user is aware that there could be some
small degree of trouble in case of a crash.

You have a point about reworking the way recovery works: after teaching
several classes and running into oddities with recovery parameters in
postgresql.conf, I am not sure if that was a good move.
Particularly the necessity for "recovery.signal" makes the New Way no
easier than the Old Way - perhaps something like "pg_ctl start_in_recovery"
would have been smarter.

But one instance where we made users unhappy is not justification
for making them unhappy again.

Yours,
Laurenz Albe




Re: Remove Deprecated Exclusive Backup Mode

From
Magnus Hagander
Date:


On Thu, Jul 2, 2020 at 1:06 AM Jehan-Guillaume de Rorthais <jgdr@dalibo.com> wrote:
On Wed, 1 Jul 2020 15:58:57 -0400
Robert Haas <robertmhaas@gmail.com> wrote:

> On Wed, Jul 1, 2020 at 3:50 PM Magnus Hagander <magnus@hagander.net> wrote:
> > As far as I've seen, the one thing that people have problems with in the
> > exclusive mode backups are precisely the fact that they have to keep a
> > persistent conneciton open, and thus it cannot work together with backup
> > software that is limited to only supporting running a pre- and a post
> > script.
> >
> > Something like I have suggested here is to solve *that* problem. I don't
> > think anybody actually explicitly wants "exclusive backups" -- they want a
> > backup solution that plugs into their world of pre/post scripts. And if we
> > can make that one work in a safer way than the current exclusive backups,
> > ohw is that not an improvement? 
>
> Yeah, I guess that's a pretty fair point. I have to confess to having
> somewhat limited enthusiasm for adding a third mode here, but it might
> be worth it.
>
> It seems pretty well inevitable to me that people are going to forget
> to end them.

There's so many way an admin could break their backup process and not notice
it. Even with current nonexclusive backup, a bad written script might creates
bad unrecoverable backups.

Definitely. And the scariest part of them is that they all work fine in testing...



In regard with your concern, monitoring in progress backups might be *one*
answer, from server side. This was "easy" with exclusive backup, we just
monitor the backup_label age and warn if it is older than expected [1]. Using

Yeah, but having a monitoring point that will crash your database on restart isn't very safe :)

 
non-exclusive backup...this is not that easy anymore. And pg_is_in_backup() is
quite misleading if the admin found it without reading doc. Maybe an admin

Yeah, as it is now it should really be called pg_is_in_exclusive_backup().

 
function to list in progress non-exclusive backup and related backend pid might
be a good start?

I think it would make perfect sense to show manual (exclusive or non-exclusive) base backups in pg_stat_progress_basebackup. There's no fundamental need that one should only be for base backups taken with pg_basebackup. In fact, I'd argue that's a mistake in the view in v13 that it does not include this information.

It could have "phase" set to "manual non-exclusive" for example, and leave the other fields at NULL.

I guess in theory even for exclusive ones, with the pid column set to NULL. (As Stephen mentioned at some point in the future we might also want to extend it to allow these tools to report their progress as well into it, probably by just calling an admin function on the connection that they already have).


Tools like pg_basebackup (and probably pgbackrest/barman to some extends) still
need the backup to abort on disconnection. Maybe it could flag its session
using the replication protocol or call a new admin function or load a hook on
session-shutdown to keep previous API behavior?

Suddenly requiring a replication protocol connection for one thing, when all their other things are done over a normal one, seems really terrible.  But having an admin function would work.

But anything requiring loading of a hook is going to raise the bar massively as suddenly somebody needs to install a compiled binary on the server in order to use it. But calling a separate function is pretty much what I suggested upthread (except I suggested a new version of pg_stop_backup, but implementation wise)

But I'm not sure what previous API behavior you're looking to preserve here?

--

Re: Remove Deprecated Exclusive Backup Mode

From
Magnus Hagander
Date:


On Thu, Jul 2, 2020 at 10:35 AM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
On Wed, 2020-07-01 at 16:46 -0400, Stephen Frost wrote:
> I continue to find it curious that we stress a great deal over (very
> likely) poorly written backup scripts that haven't been updated in the
> 5+? years since exclusive mode was deprecated, but we happily add new
> keywords in new major versions and remove columns in our catalog tables
> or rename columns or entirely redo how recovery works (breaking every
> script out there that performed a restore..) or do any number of other
> things that potentially break applications that communicate through the
> PG protocol and other parts of the system that very likely have scripts
> that were written to work with them.
>
> I'm really of the opinion that we need to stop doing that.
>
> If someone could explain what is so special about *this* part of the
> system that we absolutely can't possibly accept any change that might
> break user's scripts, and why it's worth all of the angst, maintenance,
> ridiculously difficult documentation to understand and hacks (the
> interface to pg_start_backup is ridiculously warty because of this), I'd
> greatly appreciate it.

Easy.  Lots of people write backup scripts, and fewer people write
complicated pg_catalog queries.  We would make way more users unhappy.

It's not about complicated catalog queries.

For example, in PostgreSQL 13 we break what I would say is the vast majority of every query against pg_stat_statements, by renaming those fields. Yet the old ones were never deprecated, and certainly not for 5+ years. And I certainly *hope* we have more users who query pg_stat_statements than write their own backup solutions.

In PostgreSQL 12, we broke every single restore script by moving backup configuration into recovery.conf. Yet the old ones were never deprecated, and certainly not for 5+ years.

Or for PostgreSQL 10, we renamed pg_xlog to pg_wal, and pg_log to log, both which would break the vast majority of existing backup scripts. Yet the old ones were never deprecated, and certainly not for 5+ years.

In general, we have made backwards incompatible changes in almost every release of PostgreSQL. And the vast majority of them have *not* been deprecated for years ahead of making the change -- we just killed them.

So I do agree with Stephen on this -- it seems we are holding this old and if not broken then at least very *fragile* API to a *very* different set of standards than our other functionality.

All backup scripts had to be rewritten in 10. All restore scripts had to be rewritten in 12. What's so special about requiring backup scripts to be rewritten again in 14?

Now, I understand it if we were removing functionality that you could not rewrite to make things work. Heck, I can even at least partially understand that non-exclusive backup mode is too *difficult* for some people to use. But the currently suggested changes are there specifically to make it as easy to use as the deprecated one, yet a lot safer.


Besides, there is nothing "poorly written" about a backup script using
exclusive backup as long as the user is aware that there could be some
small degree of trouble in case of a crash.

It's not a poorly written script, but it *is* a poorly designed API. But if we decide we can never fix things that we have designed poorly, then what other parts of our development should we stop?

(And no, I don't agree it's a "small degree of trouble". At all. Because usually you end up with a total system outage sometime in the middle of the night (when most people run their backups), blocking end user access, waking up on-call personnel, and dealing with a pretty damn major incident)


But one instance where we made users unhappy is not justification
for making them unhappy again.

This is true.

So which users will actually be unhappy about this?

Yes, they have to make small updates to their scripts. Again, if they don't want that, they can *never* upgrade PostgreSQL, because we make such changes in *every* major release. Those are small, and those are trivial (with the suggested changes, not today). In return, they get a database that won't crash on them if something happens during backup.

--

Re: Remove Deprecated Exclusive Backup Mode

From
David Steele
Date:
On 7/1/20 5:44 PM, Magnus Hagander wrote:
> On Wed, Jul 1, 2020 at 11:08 PM David Steele <david@pgmasters.net 
> <mailto:david@pgmasters.net>> wrote:
> 
>     But yeah, it would be possible to kill somebody else's session with
>     some
>     finagling. Still, worse case would be an error'd backup rather than a
>     corrupt one.
> 
> What about the case of:
> Session A - start backup
> Session B - stop backup (but A is still running of course)
> Session C - start backup
> Session A - stop backup
> 
> At this point, session A can still stop the backup because there is one 
> running -- but there has been time in between the two when no backup was 
> running. That could lead to Session A getting a corrupt backup, I think 
> -- unless we pass some unique identifier back in pg_stop_backup that 
> matches it up. (And if we do pass that up, then session B running 
> pg_stop_backup() would fail, thus leaving the backup started by A still 
> running.

This is fine because the min start LSN would have been advanced after B 
stopped. When A tries to stop the min start LSN will be later than its 
start LSN so it will error.

It might be easier/better to just keep the one exclusive slot in shared 
memory and store the backup label in it. We only allow one exclusive 
backup now so it wouldn't be a loss in functionality.

None of this really solves the problem of what happens when the user 
dumps the backup_label into the data directory. With traditional backup 
software that's pretty much going to be the only choice. Is telling them 
not to do it and washing our hands of it really enough?

In particular, I'm worried about the logic in postmaster.c that would be 
removed if we no longer save the backup_label explicitly during an 
exclusive backup. If backup_label is no longer removed on a clean 
shutdown it seems we'll just make the situation worse.

Regards,
-- 
-David
david@pgmasters.net



Re: Remove Deprecated Exclusive Backup Mode

From
Jehan-Guillaume de Rorthais
Date:
On Thu, 2 Jul 2020 12:32:14 +0200
Magnus Hagander <magnus@hagander.net> wrote:
[...]
> > non-exclusive backup...this is not that easy anymore. And
> > pg_is_in_backup() is quite misleading if the admin found it without reading
> > doc. Maybe an admin 
> 
> Yeah, as it is now it should really be called pg_is_in_exclusive_backup().

+1
We end up to the same conclusion while discussing with Gilles Darold this
morning. But considering the discussion bellow, this might not be needed anyway.

> > function to list in progress non-exclusive backup and related backend pid
> > might
> > be a good start?
> >  
> 
> I think it would make perfect sense to show manual (exclusive or
> non-exclusive) base backups in pg_stat_progress_basebackup. There's no
> fundamental need that one should only be for base backups taken with
> pg_basebackup. In fact, I'd argue that's a mistake in the view in v13 that
> it does not include this information.
> 
> It could have "phase" set to "manual non-exclusive" for example, and leave
> the other fields at NULL.

Even in manual non-exclusive we could define some different phase:

 * waiting for checkpoint to finish
 * external backup
 * waiting for wal archiving to finish

So maybe exposing the backup label and/or the method might be useful as well?

> I guess in theory even for exclusive ones, with the pid column set to NULL.

We are discussing removing the mandatory connection during the non-exclusive
backup. If it become optional at some point, this PID might be NULL as well...

> (As Stephen mentioned at some point in the future we might also want to
> extend it to allow these tools to report their progress as well into it,
> probably by just calling an admin function on the connection that they
> already have).
> 
> 
> > Tools like pg_basebackup (and probably pgbackrest/barman to some extends)
> > still need the backup to abort on disconnection. Maybe it could flag its
> > session using the replication protocol or call a new admin function or load
> > a hook on session-shutdown to keep previous API behavior?
> 
> Suddenly requiring a replication protocol connection for one thing, when
> all their other things are done over a normal one, seems really terrible.

Sorry, I misexplained. Adding such flag to the replication protocol was only for
pg_basebackup. Other tools would use an admin function to achieve the same goal.

> But having an admin function would work.
> 
> But anything requiring loading of a hook is going to raise the bar
> massively as suddenly somebody needs to install a compiled binary on the
> server in order to use it.

Contrib already provide many useful tools. How would it be different from eg.
pg_archivecleanup? IIRC, end of session hook was discussed in the past, but I
did not follow the thread by the time. But anyway, an admin function would
probably make the job as well anyway.

> But calling a separate function is pretty much > what I suggested upthread
> (except I suggested a new version of pg_stop_backup, but implementation wise)
> 
> But I'm not sure what previous API behavior you're looking to preserve here?

Current non-exclusive backup are automatically aborted when the
pg_start_backup() session disappear without pg_stop_backup(). I suppose this was
the API behavior that Robert is concerned to loose when he is writing about
people forgetting to end backups.

And now you are talking about Stephen's idea to report the progress of a backup,
this kind of API could be used as watchdog as well: if the frontend did not
report before some optional kind of (user defined) timeout, abort the backup.
We would have a "similar" behavior than current one.

To be clear, I suppose these admin functions could be called from any backend
session, taking some kind of backup id (eg. the token you were talking about
upthread).

Regards,



Re: Remove Deprecated Exclusive Backup Mode

From
Stephen Frost
Date:
Greetings,

* Laurenz Albe (laurenz.albe@cybertec.at) wrote:
> On Wed, 2020-07-01 at 16:46 -0400, Stephen Frost wrote:
> > If someone could explain what is so special about *this* part of the
> > system that we absolutely can't possibly accept any change that might
> > break user's scripts, and why it's worth all of the angst, maintenance,
> > ridiculously difficult documentation to understand and hacks (the
> > interface to pg_start_backup is ridiculously warty because of this), I'd
> > greatly appreciate it.
>
> Easy.  Lots of people write backup scripts, and fewer people write
> complicated pg_catalog queries.  We would make way more users unhappy.

And no one writes scripts to do restores?  No, I don't believe that
(also don't believe that no one writes scripts or queries that use
various catalog tables).  Also, we certainly do get complaints from time
to time from people who got hit by a catalog change, but it's usually
"this stopped working" and you tell them "do this instead" and they're
happy to update their code.  That's why we have major versions, and why
we have back-branches and why we support a given major version for 5
years.

> You have a point about reworking the way recovery works: after teaching
> several classes and running into oddities with recovery parameters in
> postgresql.conf, I am not sure if that was a good move.

If it was a good idea or bad idea is really pretty independent of the
point I was trying to make: we *did* completely change the restore API
without any deprecation period for the old one or period of time where
we supported both or any of the foolishness that we've been doing for
the backup API for the past however many years.  Perhaps the mob of
users who are going to complain about that change hasn't decided to
upgrade to v12 quite yet, but I doubt it.

I'll say it again- I really think we need to stop this special case
where we've decided that we can't possibly make a breaking change to the
backup API even across major versions.  We don't do that, evidently, in
hardly any other cases and it's an altogether good thing- we don't have
ridiculously complicated restore documentation that has to explain two
completely different ways of doing a restore, nor is the code
complicated by having different ways, etc..

Thanks,

Stephen

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
Stephen Frost
Date:
Greetings,

* Magnus Hagander (magnus@hagander.net) wrote:
> On Thu, Jul 2, 2020 at 1:06 AM Jehan-Guillaume de Rorthais <jgdr@dalibo.com>
> wrote:
> > function to list in progress non-exclusive backup and related backend pid
> > might
> > be a good start?
>
> I think it would make perfect sense to show manual (exclusive or
> non-exclusive) base backups in pg_stat_progress_basebackup. There's no
> fundamental need that one should only be for base backups taken with
> pg_basebackup. In fact, I'd argue that's a mistake in the view in v13 that
> it does not include this information.

I agree entirely that it was a mistake in v13 to not include this- and
to not include a way for other backup tools to report their progress.
This is a good example of why we really need an in-core non-streamed
backup capability, imv, because if we don't we end up with things like
this that are just thinking about streaming basebackups.  We also have
no in-core code that is user-facing that exercises the low-level backup
API.

> It could have "phase" set to "manual non-exclusive" for example, and leave
> the other fields at NULL.

Yeah.

> I guess in theory even for exclusive ones, with the pid column set to NULL.
> (As Stephen mentioned at some point in the future we might also want to
> extend it to allow these tools to report their progress as well into it,
> probably by just calling an admin function on the connection that they
> already have).

Right, that wouldn't have been hard to include and would have been quite
nice.

Thanks,

Stephen

Attachment

Re: Remove Deprecated Exclusive Backup Mode

From
David Steele
Date:
On 7/2/20 7:12 AM, David Steele wrote:
> 
> None of this really solves the problem of what happens when the user 
> dumps the backup_label into the data directory. With traditional backup 
> software that's pretty much going to be the only choice. Is telling them 
> not to do it and washing our hands of it really enough?

After some thought I believe there are several recommendations we could 
give in the docs about what to do with the backup_label file.

This patch, though, looks to be a dead end. I think the best thing to do 
is withdraw the patch and end the thread.

I'll start a new thread when I have a patch/design that implements the 
new ideas we have discussed.

Regards,
-- 
-David
david@pgmasters.net