Thread: Add exclusive backup deprecation notes to documentation

Add exclusive backup deprecation notes to documentation

From
David Steele
Date:
Hackers,

It has been noted on multiple threads, such as [1], that it would be 
good to have additional notes in the documentation to explain why 
exclusive backups have been deprecated and why they should be avoided 
when possible.

This patch attempts to document the limitations of the exclusive mode.

-- 
-David
david@pgmasters.net

[1] 
https://www.postgresql.org/message-id/flat/ac7339ca-3718-3c93-929f-99e725d1172c%40pgmasters.net

Attachment

Re: Add exclusive backup deprecation notes to documentation

From
Laurenz Albe
Date:
David Steele wrote:
> This patch attempts to document the limitations of the exclusive mode.

Thanks!

> +   <para>
> +     The primary issue with the exclusive method is that the
> +     <filename>backup_label</filename> file is written into the data directory
> +     when <function>pg_start_backup</function> is called and remains until
> +     <function>pg_stop_backup</function> is called.  If
> +     <productname>PostgreSQL</productname> or the host terminates abnormally

There should be a comma at the end of this line.

> +     then <filename>backup_label</filename> will be left in the data directory
> +     and <productname>PostgreSQL</productname> will not start. A log message

You should say "*may* not start", because it will if the WAL segment is still there.

> +     recommends that <filename>backup_label</filename> be removed if not
> +     restoring from a backup.
> +   </para>

Yours,
Laurenz Albe



Re: Add exclusive backup deprecation notes to documentation

From
David Steele
Date:
On 2/28/19 6:08 PM, Laurenz Albe wrote:
> David Steele wrote:
>> This patch attempts to document the limitations of the exclusive mode.
> 
> Thanks!
> 
>> +   <para>
>> +     The primary issue with the exclusive method is that the
>> +     <filename>backup_label</filename> file is written into the data directory
>> +     when <function>pg_start_backup</function> is called and remains until
>> +     <function>pg_stop_backup</function> is called.  If
>> +     <productname>PostgreSQL</productname> or the host terminates abnormally
> 
> There should be a comma at the end of this line.

Fixed.

>> +     then <filename>backup_label</filename> will be left in the data directory
>> +     and <productname>PostgreSQL</productname> will not start. A log message
> 
> You should say "*may* not start", because it will if the WAL segment is still there.

You are correct.  It's still pretty likely though so I went with "probably".

I added some extra language to the warning that gets emitted in the log. 
  Users are more like to see that than the documentation.

I also addressed a comment from another thread by adding pg_basebackup 
as .e.g. rather than or.

Thanks,
-- 
-David
david@pgmasters.net

Attachment

Re: Add exclusive backup deprecation notes to documentation

From
Laurenz Albe
Date:
David Steele wrote:
> I added some extra language to the warning that gets emitted in the log. 
>   Users are more like to see that than the documentation.
> 
> I also addressed a comment from another thread by adding pg_basebackup 
> as .e.g. rather than or.

Looks good to me.

Yours,
Laurenz Albe



Re: Add exclusive backup deprecation notes to documentation

From
Martín Marqués
Date:
El vie., 1 de mar. de 2019 a la(s) 06:21, David Steele
(david@pgmasters.net) escribió:
>
>
> I also addressed a comment from another thread by adding pg_basebackup
> as .e.g. rather than or.

Thanks David,

This looks very good!

--
Martín Marqués                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


Re: Add exclusive backup deprecation notes to documentation

From
Martín Marqués
Date:
El vie., 1 de mar. de 2019 a la(s) 06:21, David Steele
(david@pgmasters.net) escribió:
>
> I added some extra language to the warning that gets emitted in the log.
>   Users are more like to see that than the documentation.
>
> I also addressed a comment from another thread by adding pg_basebackup
> as .e.g. rather than or.

More awake, I gave this last patch a second read. Wording is good now.
No objections there at all.

I do think that paragraph 2 and 3 should be merged as it seems the
idea on the third is a continuation of what's in the second.

But even without that change, I believe this patch is good for commit.

Regards,

--
Martín Marqués                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


Re: Add exclusive backup deprecation notes to documentation

From
Peter Eisentraut
Date:
Please follow the 1-space indentation in the documentation files.

I think the style of mentioning all the problems in a note before the
actual description is a bit backwards.

The layout of the section should be:

- This is what it does.

- Here are some comparisons with other methods.

- For this or that reason, it's deprecated.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Add exclusive backup deprecation notes to documentation

From
David Steele
Date:
On 3/1/19 1:13 PM, Peter Eisentraut wrote:
> Please follow the 1-space indentation in the documentation files.

Whoops.  Will fix.

> I think the style of mentioning all the problems in a note before the
> actual description is a bit backwards.

In the case of an important note like this I think it should be right at 
the top where people will see it.  Not everyone reads to the end.

-- 
-David
david@pgmasters.net


Re: Add exclusive backup deprecation notes to documentation

From
Magnus Hagander
Date:
On Fri, Mar 1, 2019 at 2:07 PM David Steele <david@pgmasters.net> wrote:
On 3/1/19 1:13 PM, Peter Eisentraut wrote:
> Please follow the 1-space indentation in the documentation files.

Whoops.  Will fix.

> I think the style of mentioning all the problems in a note before the
> actual description is a bit backwards.

In the case of an important note like this I think it should be right at
the top where people will see it.  Not everyone reads to the end

Maybe have the first note say "This method is deprecated bceause it has serious risks (see bellow)" and then list the actual risks at the end? 

--

Re: Add exclusive backup deprecation notes to documentation

From
Laurenz Albe
Date:
Magnus Hagander wrote:
> Maybe have the first note say "This method is deprecated bceause it has serious
> risks (see bellow)" and then list the actual risks at the end? 

Good idea.  That may attract the attention of the dogs among the readers.

Yours,
Laurenz Albe



Re: Add exclusive backup deprecation notes to documentation

From
David Steele
Date:
On 3/1/19 3:14 PM, Laurenz Albe wrote:
> Magnus Hagander wrote:
>> Maybe have the first note say "This method is deprecated bceause it has serious
>> risks (see bellow)" and then list the actual risks at the end?
> 
> Good idea.  That may attract the attention of the dogs among the readers.

OK, here's a new version that splits the deprecation notes from the 
discussion of risks.  I also fixed the indentation.

Thanks,
-- 
-David
david@pgmasters.net

Attachment

Re: Add exclusive backup deprecation notes to documentation

From
Michael Paquier
Date:
On Thu, Mar 07, 2019 at 11:33:20AM +0200, David Steele wrote:
> OK, here's a new version that splits the deprecation notes from the
> discussion of risks.  I also fixed the indentation.

The documentation part looks fine to me.  Just one nit regarding the
error hint.

> -    errhint("If you are not restoring from a backup, try removing the file \"%s/backup_label\".", DataDir)));
> +    errhint("If you are restoring from a backup, touch \"%s/recovery.signal\" and add recovery options to
\"%s/postgresql.auto.conf\".\n"

Here do we really want to recommend adding options to
postgresql.auto.conf?  This depends a lot on the solution integration
so I think that this hint could actually confuse some users because it
implies that they kind of *have* to do so, which is not correct.  I
would recommend to be a bit more generic and just use "and add
necessary recovery configuration".

> +        "If you are not restoring from a backup, try removing the file \"%s/backup_label\".\n"
> +        "Be careful: removing \"%s/backup_label\" will result in a corrupt cluster if restoring from a backup.",

Fine for these two ones.

> +        DataDir, DataDir, DataDir, DataDir)));

:)
--
Michael

Attachment

Re: Add exclusive backup deprecation notes to documentation

From
Magnus Hagander
Date:
On Thu, Mar 7, 2019 at 5:35 PM Michael Paquier <michael@paquier.xyz> wrote:
On Thu, Mar 07, 2019 at 11:33:20AM +0200, David Steele wrote:
> OK, here's a new version that splits the deprecation notes from the
> discussion of risks.  I also fixed the indentation.

The documentation part looks fine to me.  Just one nit regarding the
error hint.

> -     errhint("If you are not restoring from a backup, try removing the file \"%s/backup_label\".", DataDir)));
> +     errhint("If you are restoring from a backup, touch \"%s/recovery.signal\" and add recovery options to \"%s/postgresql.auto.conf\".\n"

Here do we really want to recommend adding options to
postgresql.auto.conf?  This depends a lot on the solution integration
so I think that this hint could actually confuse some users because it
implies that they kind of *have* to do so, which is not correct.  I
would recommend to be a bit more generic and just use "and add
necessary recovery configuration".

Agreed, I think we should never tell people to "add recovery options to postgresql.auto.conf". Becuase they should never do that manually. If we want to suggest people use postgresql.auto.conf surely they should be using ALTER SYSTEM SET. Which of course doesn't work in this case, since postgrsql isn't running yet.

So yeah either that, or say "add to postgresql.conf" without the auto?

--

Re: Add exclusive backup deprecation notes to documentation

From
Peter Eisentraut
Date:
On 2019-03-07 10:33, David Steele wrote:
> On 3/1/19 3:14 PM, Laurenz Albe wrote:
>> Magnus Hagander wrote:
>>> Maybe have the first note say "This method is deprecated bceause it has serious
>>> risks (see bellow)" and then list the actual risks at the end?
>>
>> Good idea.  That may attract the attention of the dogs among the readers.
> 
> OK, here's a new version that splits the deprecation notes from the 
> discussion of risks.  I also fixed the indentation.

The documentation changes appear to continue the theme from the other
thread that the exclusive backup mode is terrible and everyone should
feel bad about it.  I don't think there is consensus about that.

I do welcome a more precise description of the handling of backup_label
and a better hint in the error message.  I think we haven't gotten to
the final shape there yet, especially for the latter.  I suggest to
focus on that.

The other changes repeat points already made in nearby documentation.

I think it would be helpful to frame the documentation in a way to
suggest that the nonexclusive mode is more for automation and more
sophisticated tools and the exclusive mode is more for manual or simple
scripted use.

If we do think that the exclusive mode will be removed in PG13, then I
don't think we need further documentation changes.  It already says it's
deprecated, and we don't need to justify that at length.  But again, I'm
not convinced that that will happen.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Add exclusive backup deprecation notes to documentation

From
Robert Haas
Date:
On Mon, Mar 18, 2019 at 8:33 AM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
> The documentation changes appear to continue the theme from the other
> thread that the exclusive backup mode is terrible and everyone should
> feel bad about it.  I don't think there is consensus about that.

+1.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Add exclusive backup deprecation notes to documentation

From
Stephen Frost
Date:
Greetings,

* Peter Eisentraut (peter.eisentraut@2ndquadrant.com) wrote:
> On 2019-03-07 10:33, David Steele wrote:
> > On 3/1/19 3:14 PM, Laurenz Albe wrote:
> >> Magnus Hagander wrote:
> >>> Maybe have the first note say "This method is deprecated bceause it has serious
> >>> risks (see bellow)" and then list the actual risks at the end?
> >>
> >> Good idea.  That may attract the attention of the dogs among the readers.
> >
> > OK, here's a new version that splits the deprecation notes from the
> > discussion of risks.  I also fixed the indentation.
>
> The documentation changes appear to continue the theme from the other
> thread that the exclusive backup mode is terrible and everyone should
> feel bad about it.  I don't think there is consensus about that.

I don't view it as up for much debate.  The exclusive backup mode is
quite bad.

> I do welcome a more precise description of the handling of backup_label
> and a better hint in the error message.  I think we haven't gotten to
> the final shape there yet, especially for the latter.  I suggest to
> focus on that.

There isn't a way to handle the backup_label in a sane way when it's
created by the server in the data directory, which is why the
non-exclusive mode explicitly doesn't do that.

> I think it would be helpful to frame the documentation in a way to
> suggest that the nonexclusive mode is more for automation and more
> sophisticated tools and the exclusive mode is more for manual or simple
> scripted use.

I don't agree with this at all, that's not the reason the two exist nor
were they ever developed with the intent that one is for the 'simple'
case and one is for the 'automated' case.  Trying to wedge them into
that framework strikes me as simply trying to sweep the serious issues
under the rug and I don't agree with that- if we are going to continue
to have this, we need to make it clear what the issues are.  Sadly, we
will still have users who don't actually read the docs that carefully
and get bit by the exclusive backup mode because they didn't appreciate
the issues, but we will continue to have that until we finally remove
the exclusive mode.

> If we do think that the exclusive mode will be removed in PG13, then I
> don't think we need further documentation changes.  It already says it's
> deprecated, and we don't need to justify that at length.  But again, I'm
> not convinced that that will happen.

There were at least a few comments made on this thread that it wasn't
made clear enough in the documentation that it's deprecated.  Saying
that we already deprecated it doesn't change that and doesn't do
anything to actually address those concerns.  Continuing to carry
forward these two modes makes further progress in this area difficult
and unlikely to happen, and that's disappointing.

Thanks!

Stephen

Attachment

Re: Add exclusive backup deprecation notes to documentation

From
Robert Haas
Date:
On Mon, Mar 18, 2019 at 11:13 PM Stephen Frost <sfrost@snowman.net> wrote:
> I don't view it as up for much debate.

In other words, you're not willing to listen to what other people
think about this issue.

I can't say I haven't noticed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Add exclusive backup deprecation notes to documentation

From
Stephen Frost
Date:
Greetings,

* Robert Haas (robertmhaas@gmail.com) wrote:
> On Mon, Mar 18, 2019 at 11:13 PM Stephen Frost <sfrost@snowman.net> wrote:
> > I don't view it as up for much debate.
>
> In other words, you're not willing to listen to what other people
> think about this issue.

I have listened, but unfortunately the discussion just revolves around
"oh, it isn't actually all that bad", which, no, isn't something that's
going to sway my opinion.

I suppose some might view that as being principled.

Thanks!

Stephen

Attachment

Re: Add exclusive backup deprecation notes to documentation

From
Andres Freund
Date:
Hi,

On 2019-03-18 23:35:07 -0400, Stephen Frost wrote:
> * Robert Haas (robertmhaas@gmail.com) wrote:
> > On Mon, Mar 18, 2019 at 11:13 PM Stephen Frost <sfrost@snowman.net> wrote:
> > > I don't view it as up for much debate.
> > 
> > In other words, you're not willing to listen to what other people
> > think about this issue.
> 
> I have listened, but unfortunately the discussion just revolves around
> "oh, it isn't actually all that bad", which, no, isn't something that's
> going to sway my opinion.

I think you're right about the original issue. But at some point you
just gotta settle for not everyone agreeing with you (and me). Just
pushing forward hard won't achieve anything.  Nobody says you need to
agree with people saying "it's not all that bad", but that doesn't mean
you should just try to push forward at full speed.

Greetings,

Andres Freund


Re: Add exclusive backup deprecation notes to documentation

From
Stephen Frost
Date:
Greetings,

* Andres Freund (andres@anarazel.de) wrote:
> On 2019-03-18 23:35:07 -0400, Stephen Frost wrote:
> > * Robert Haas (robertmhaas@gmail.com) wrote:
> > > On Mon, Mar 18, 2019 at 11:13 PM Stephen Frost <sfrost@snowman.net> wrote:
> > > > I don't view it as up for much debate.
> > >
> > > In other words, you're not willing to listen to what other people
> > > think about this issue.
> >
> > I have listened, but unfortunately the discussion just revolves around
> > "oh, it isn't actually all that bad", which, no, isn't something that's
> > going to sway my opinion.
>
> I think you're right about the original issue. But at some point you
> just gotta settle for not everyone agreeing with you (and me). Just
> pushing forward hard won't achieve anything.  Nobody says you need to
> agree with people saying "it's not all that bad", but that doesn't mean
> you should just try to push forward at full speed.

This thread didn't exactly pop out of nowhere..  It's a result of the
prior discussion, with the goal of at least improving the situation by
documenting the issues and trying to make it clearer that the exclusive
mode is deprecated to hopefully provide a chance that we'll actually
remove it at some point in the future as being dangerous.

That's a pretty large step back from "let's rip it out for v12", and yet
people are now pushing back against even that, which is definitely
frustrating, and saying that I'm still pushing forward 'at full speed'
definitely comes across as not really considering that this started out
with "rip it out for v12" and has now regressed back to a documentation
patch.

Thanks!

Stephen

Attachment

Re: Add exclusive backup deprecation notes to documentation

From
David Steele
Date:
On 3/18/19 4:33 PM, Peter Eisentraut wrote:
> On 2019-03-07 10:33, David Steele wrote:
>> On 3/1/19 3:14 PM, Laurenz Albe wrote:
>>> Magnus Hagander wrote:
>>>> Maybe have the first note say "This method is deprecated bceause it has serious
>>>> risks (see bellow)" and then list the actual risks at the end?
>>>
>>> Good idea.  That may attract the attention of the dogs among the readers.
>>
>> OK, here's a new version that splits the deprecation notes from the
>> discussion of risks.  I also fixed the indentation.
> 
> The documentation changes appear to continue the theme from the other
> thread that the exclusive backup mode is terrible and everyone should
> feel bad about it.  I don't think there is consensus about that.

I wouldn't characterize documenting the limitations of the method as 
making people feel bad about it.  If you feel my language implies that 
then please let me know where you see it.

> I do welcome a more precise description of the handling of backup_label
> and a better hint in the error message.  I think we haven't gotten to
> the final shape there yet, especially for the latter.  I suggest to
> focus on that.

I was planning to update the error message hint as Magnus and Michael 
have suggested.

I'm not a fan of normalizing the documentation around backup_label, i.e. 
making it seem like a perfectly normal thing that you need to manually 
delete a file to get the cluster to start.  This may still lead users to 
script their way around the problem, with possible corruption as the result.

> The other changes repeat points already made in nearby documentation.

Granted, but in this sense they are meant to concisely describe why the 
feature is deprecated, rather than being instructions in a user guide.

> I think it would be helpful to frame the documentation in a way to
> suggest that the nonexclusive mode is more for automation and more
> sophisticated tools and the exclusive mode is more for manual or simple
> scripted use.

I don't think the features were developed with this in mind and I 
wouldn't want to characterize them this way now.  Non-exclusive mode was 
developed to address the shortcomings of exclusive mode, not as an 
"automate-able" version of it.

Describing any backup method as primarily "manual" in nature seems 
counter-intuitive to me.

> If we do think that the exclusive mode will be removed in PG13, then I
> don't think we need further documentation changes.  It already says it's
> deprecated, and we don't need to justify that at length.  But again, I'm
> not convinced that that will happen.

I think we should remove it entirely in PG13, but I'm not sure if there 
is enough support.  I'll propose it again in the first CF and see where 
it goes.

This patch was intended to be a compromise based on discussion in the 
thread about removing the feature.

If this is now a bridge too far then I'm at a bit of a loss as to how to 
proceed.  If we water it down and normalize it then we are not achieving 
the goals of the patch as I see them -- to steer users away from this 
method when possible and to make it less of a shock if it goes away.

Regards,
-- 
-David
david@pgmasters.net


Re: Add exclusive backup deprecation notes to documentation

From
David Steele
Date:
On 3/8/19 6:08 AM, Magnus Hagander wrote:
> On Thu, Mar 7, 2019 at 5:35 PM Michael Paquier <michael@paquier.xyz 
> <mailto:michael@paquier.xyz>> wrote:
> 
>     On Thu, Mar 07, 2019 at 11:33:20AM +0200, David Steele wrote:
>      > OK, here's a new version that splits the deprecation notes from the
>      > discussion of risks.  I also fixed the indentation.
> 
>     The documentation part looks fine to me.  Just one nit regarding the
>     error hint.
> 
>      > -     errhint("If you are not restoring from a backup, try
>     removing the file \"%s/backup_label\".", DataDir)));
>      > +     errhint("If you are restoring from a backup, touch
>     \"%s/recovery.signal\" and add recovery options to
>     \"%s/postgresql.auto.conf\".\n"
> 
>     Here do we really want to recommend adding options to
>     postgresql.auto.conf?  This depends a lot on the solution integration
>     so I think that this hint could actually confuse some users because it
>     implies that they kind of *have* to do so, which is not correct.  I
>     would recommend to be a bit more generic and just use "and add
>     necessary recovery configuration".
> 
> 
> Agreed, I think we should never tell people to "add recovery options to 
> postgresql.auto.conf". Becuase they should never do that manually. If we 
> want to suggest people use postgresql.auto.conf surely they should be 
> using ALTER SYSTEM SET. Which of course doesn't work in this case, since 
> postgrsql isn't running yet.
> 
> So yeah either that, or say "add to postgresql.conf" without the auto?

I went with Michael's suggestion.  Attached is a new patch.

I also think we should set a flag and throw the error below this if/else 
block.  This is a rather large message and maintaining two copies of it 
is not ideal.

Please note that there have been objections to the patch later in this 
thread by Peter and Robert.  I'm not very interested in watering down 
the documentation changes as Peter suggests, but I think at the very 
least we should commit the added hints in the error message.  For many 
users this error will be their first point of contact with the 
backup_label issue/behavior.

Regards,
-- 
-David
david@pgmasters.net

Attachment

Re: Add exclusive backup deprecation notes to documentation

From
Michael Paquier
Date:
On Wed, Mar 20, 2019 at 04:29:35PM +0400, David Steele wrote:
> Please note that there have been objections to the patch later in this
> thread by Peter and Robert.  I'm not very interested in watering down the
> documentation changes as Peter suggests, but I think at the very least we
> should commit the added hints in the error message.  For many users this
> error will be their first point of contact with the backup_label
> issue/behavior.

The updates of the log message do not imply anything negative as I
read them as they mention to not remove the backup_label file.  So
while we don't have an agreement about the docs, the log messages may
be able to be committed?  Peter?  Robert?

"will result in a corruptED cluster" is more correct?
--
Michael

Attachment

Re: Add exclusive backup deprecation notes to documentation

From
Magnus Hagander
Date:
On Mon, Mar 18, 2019 at 1:33 PM Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:
On 2019-03-07 10:33, David Steele wrote:
> On 3/1/19 3:14 PM, Laurenz Albe wrote:
I think it would be helpful to frame the documentation in a way to
suggest that the nonexclusive mode is more for automation and more
sophisticated tools and the exclusive mode is more for manual or simple
scripted use.

But that would be factually incorrect and backwards, so it seems like a terrible idea, at least when it comes to manual. If you are doing it manually, it's a lot *easier* to do it right with the non-exclusive mode, because you can easily keep one psql and one shell open. And that's safe.

The only real use case that has been put forward for the exclusive backup mode is when the backups are done through a script, and that script is limited to only use something like bash (and can't use a scripting language like perl or python or powershell or other more advanced scripting languages).

And I don't think exclusive mode should be suggested for "simple scripts" either, since it's anything but -- scripts using the exclusive mode correctly will be anything but simple. A better term there would be to single out shellscripts, I'd suggest, if we want to single something out. Or more generic, for "scripting languages incapable of keeping a connection open across multiple lines" or something?

We can certainly keep it, but let's not tell people something is simple when it's not.


If we do think that the exclusive mode will be removed in PG13, then I
don't think we need further documentation changes.  It already says it's
deprecated, and we don't need to justify that at length.  But again, I'm
not convinced that that will happen.

But the complaints before was that the deprecation currently in the documentation was not enough to remove it.... 

--

Re: Add exclusive backup deprecation notes to documentation

From
Robert Haas
Date:
On Wed, Mar 20, 2019 at 9:00 AM Michael Paquier <michael@paquier.xyz> wrote:
> On Wed, Mar 20, 2019 at 04:29:35PM +0400, David Steele wrote:
> > Please note that there have been objections to the patch later in this
> > thread by Peter and Robert.  I'm not very interested in watering down the
> > documentation changes as Peter suggests, but I think at the very least we
> > should commit the added hints in the error message.  For many users this
> > error will be their first point of contact with the backup_label
> > issue/behavior.
>
> The updates of the log message do not imply anything negative as I
> read them as they mention to not remove the backup_label file.  So
> while we don't have an agreement about the docs, the log messages may
> be able to be committed?  Peter?  Robert?
>
> "will result in a corruptED cluster" is more correct?

I really like the proposed changes to the ereport() text.  I think the
"Be careful" hint is a really helpful way of phrasing it.  I think
"corrupt" as the patch has it is slightly better than "corrupted".
Obviously, we have to make the updates for recovery.signal no matter
what, and you could argue that part should be its own commit, but I
like all of the changes.

I'm not too impressed with the documentation changes.  A lot of the
information being added is already present somewhere in that very same
section.  It's reasonable to revise the section so that the dangers
stand out more clearly, but it doesn't seem good to revise it in a way
that ends up duplicating the existing information.  Here's my
suggestion -- ditch the note at the end of the section and make the
one at the beginning read like this:

The exclusive backup method is deprecated and should be avoided.
Prior to <productname>PostgreSQL</productname> 9.6, this was the only
low-level method available, but it is now recommended that all users
upgrade their scripts to use non-exclusive backups.

Then, revise the first paragraph like this:

The process for an exclusive backup is mostly the same as for a
non-exclusive one, but it differs in a few key steps. This type of
backup can only be taken on a primary and does not allow concurrent
backups.  Moreover, because it writes a backup_label file on the
master, it can cause the master to fail to restart automatically after
a crash.  On the other hand, the erroneous removal of a backup_label
file from a backup or standby is a common mistake which can can result
in serious data corruption.  If it is necessary to use this method,
the following steps may be used.

Later, where it says:

       Note that if the server crashes during the backup it may not be
       possible to restart until the <literal>backup_label</literal>
file has been
       manually deleted from the <envar>PGDATA</envar> directory.

Change it to read:

As noted above, if the server crashes during the backup it may not be
possible to restart until the <literal>backup_label</literal> file has
been manually deleted from the <envar>PGDATA</envar> directory.  Note
that it is very important to never to remove the
<literal>backup_label</literal> file when restoring a backup, because
this will result in corruption.  Confusion about the circumstances
under which it is appropriate to remove this file is a common cause of
data corruption when using this method; be very certain that you
remove the file only on an existing master and never when building a
standby or restoring a backup, even if you are building a standby that
will subsequently be promoted to a new master.

I also think we should revise this thoroughly terrible advice:

      If you wish to place a time limit on the execution of
      <function>pg_stop_backup</function>, set an appropriate
      <varname>statement_timeout</varname> value, but make note that if
      <function>pg_stop_backup</function> terminates because of this your backup
      may not be valid.

That seems awful, not only because it encourages people to do that
particular thing and end up leaving the server in backup mode, but
also because it doesn't clearly articulate the extreme importance of
making sure that the server is not left in backup mode.  So I would
propose that we strike that text entirely and replace it with
something like:

When using exclusive backup mode, it is absolutely imperative to make
sure that <function>pg_stop_backup</function> completes successfully
at the end of the backup.  Even if the backup itself fails, for
example due to lack of disk space, failure to call
<function>pg_stop_backup</function> will leave the server in backup
mode indefinitely, causing future backups to fail and increasing the
risk of a restart during a time when <literal>backup_label</literal>
exists.

Thoughts?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Add exclusive backup deprecation notes to documentation

From
David Steele
Date:
Hi Robert,

On 3/20/19 6:31 PM, Robert Haas wrote:
> On Wed, Mar 20, 2019 at 9:00 AM Michael Paquier <michael@paquier.xyz> wrote:
>> On Wed, Mar 20, 2019 at 04:29:35PM +0400, David Steele wrote:
>>> Please note that there have been objections to the patch later in this
>>> thread by Peter and Robert.  I'm not very interested in watering down the
>>> documentation changes as Peter suggests, but I think at the very least we
>>> should commit the added hints in the error message.  For many users this
>>> error will be their first point of contact with the backup_label
>>> issue/behavior.
>>
>> The updates of the log message do not imply anything negative as I
>> read them as they mention to not remove the backup_label file.  So
>> while we don't have an agreement about the docs, the log messages may
>> be able to be committed?  Peter?  Robert?
>>
>> "will result in a corruptED cluster" is more correct?
> 
> I really like the proposed changes to the ereport() text.  I think the
> "Be careful" hint is a really helpful way of phrasing it.  I think
> "corrupt" as the patch has it is slightly better than "corrupted".
> Obviously, we have to make the updates for recovery.signal no matter
> what, and you could argue that part should be its own commit, but I
> like all of the changes.

Cool.

> I'm not too impressed with the documentation changes.  A lot of the
> information being added is already present somewhere in that very same
> section.  It's reasonable to revise the section so that the dangers
> stand out more clearly, but it doesn't seem good to revise it in a way
> that ends up duplicating the existing information.  

OK.

> Here's my
> suggestion -- ditch the note at the end of the section and make the
> one at the beginning read like this:
> 
> The exclusive backup method is deprecated and should be avoided.
> Prior to <productname>PostgreSQL</productname> 9.6, this was the only
> low-level method available, but it is now recommended that all users
> upgrade their scripts to use non-exclusive backups.
> 
> Then, revise the first paragraph like this:
> 
> The process for an exclusive backup is mostly the same as for a
> non-exclusive one, but it differs in a few key steps. This type of
> backup can only be taken on a primary and does not allow concurrent
> backups.  Moreover, because it writes a backup_label file on the
> master, it can cause the master to fail to restart automatically after
> a crash.  On the other hand, the erroneous removal of a backup_label
> file from a backup or standby is a common mistake which can can result
> in serious data corruption.  If it is necessary to use this method,
> the following steps may be used.

This works for me as I feel like the cautions here (and below) are still 
strongly worded.  Peter?

> Later, where it says:
> 
>         Note that if the server crashes during the backup it may not be
>         possible to restart until the <literal>backup_label</literal>
> file has been
>         manually deleted from the <envar>PGDATA</envar> directory.
> 
> Change it to read:
> 
> As noted above, if the server crashes during the backup it may not be
> possible to restart until the <literal>backup_label</literal> file has
> been manually deleted from the <envar>PGDATA</envar> directory.  Note
> that it is very important to never to remove the
> <literal>backup_label</literal> file when restoring a backup, because
> this will result in corruption.  Confusion about the circumstances
> under which it is appropriate to remove this file is a common cause of
> data corruption when using this method; be very certain that you
> remove the file only on an existing master and never when building a
> standby or restoring a backup, even if you are building a standby that
> will subsequently be promoted to a new master.

Technically we are into repetition here, but I'm certainly OK with it as 
this point bears repeating.

> I also think we should revise this thoroughly terrible advice:
> 
>        If you wish to place a time limit on the execution of
>        <function>pg_stop_backup</function>, set an appropriate
>        <varname>statement_timeout</varname> value, but make note that if
>        <function>pg_stop_backup</function> terminates because of this your backup
>        may not be valid.
> 
> That seems awful, not only because it encourages people to do that
> particular thing and end up leaving the server in backup mode, but
> also because it doesn't clearly articulate the extreme importance of
> making sure that the server is not left in backup mode.  So I would
> propose that we strike that text entirely and replace it with
> something like:
> 
> When using exclusive backup mode, it is absolutely imperative to make
> sure that <function>pg_stop_backup</function> completes successfully
> at the end of the backup.  Even if the backup itself fails, for
> example due to lack of disk space, failure to call
> <function>pg_stop_backup</function> will leave the server in backup
> mode indefinitely, causing future backups to fail and increasing the
> risk of a restart during a time when <literal>backup_label</literal>
> exists.

It's also pretty important that exclusive backups complete successfully 
since backup_label is returned from pg_stop_backup() -- the backup will 
definitely be corrupt without that if there was a checkpoint during the 
backup.  But, yeah, leaving a backup_label around for a long time 
increases the restart risks a lot.

I'll revise the patch if Peter thinks this approach looks reasonable.

Thanks,
-- 
-David
david@pgmasters.net


Re: Add exclusive backup deprecation notes to documentation

From
David Steele
Date:
Hi Robert,

On 3/20/19 5:08 PM, David Steele wrote:
> 
> I'll revise the patch if Peter thinks this approach looks reasonable.

Hopefully Peter's silence can be interpreted as consent.  Probably just 
busy, though.

I used your suggestions with minor editing.  After some reflection, I 
agree that the inline warnings are likely to be more effective than 
something at the end, at least for those working on a new implementation.

Regards,
-- 
-David
david@pgmasters.net

Attachment

Re: Add exclusive backup deprecation notes to documentation

From
Peter Eisentraut
Date:
On 2019-03-20 14:42, Magnus Hagander wrote:
> But that would be factually incorrect and backwards, so it seems like a
> terrible idea, at least when it comes to manual. If you are doing it
> manually, it's a lot *easier* to do it right with the non-exclusive
> mode, because you can easily keep one psql and one shell open. And
> that's safe.

The scenario I have in mind is, a poorly maintained server, nothing
installed, can't install anything (no internet connection, license
expired), flaky network, you fear it's going to fail soon, you need to
take a backup.  The simplest procedure would appear to be: start backup
mode, copy files away, stop backup mode.  Anything else that involves
holding a session open over there for the whole time is way more fragile
unless proper preparations have been made (and even then).  So I don't
know what you want to call that scenario, but I would feel more
comfortable having these basic tools available in a bind.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Add exclusive backup deprecation notes to documentation

From
Robert Haas
Date:
On Fri, Mar 29, 2019 at 6:27 AM David Steele <david@pgmasters.net> wrote:
> I used your suggestions with minor editing.  After some reflection, I
> agree that the inline warnings are likely to be more effective than
> something at the end, at least for those working on a new implementation.

I'm glad we could agree on something.  Committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Add exclusive backup deprecation notes to documentation

From
Magnus Hagander
Date:
On Fri, Mar 29, 2019 at 1:19 PM Robert Haas <robertmhaas@gmail.com> wrote:
On Fri, Mar 29, 2019 at 6:27 AM David Steele <david@pgmasters.net> wrote:
> I used your suggestions with minor editing.  After some reflection, I
> agree that the inline warnings are likely to be more effective than
> something at the end, at least for those working on a new implementation.

I'm glad we could agree on something.  Committed.

+1, thanks.

Minor nitpick:
+     backup can only be taken on a primary and does not allow concurrent
+     backups.  Moreover, because it writes a backup_label file on the
+     master, it can cause the master to fail to restart automatically after

Let's be consistent in if we call it a primary or a master, at least within the same paragraph :) 

--

Re: Add exclusive backup deprecation notes to documentation

From
Peter Eisentraut
Date:
On 2019-03-29 11:27, David Steele wrote:
>> I'll revise the patch if Peter thinks this approach looks reasonable.
> 
> Hopefully Peter's silence can be interpreted as consent.  Probably just 
> busy, though.
> 
> I used your suggestions with minor editing.  After some reflection, I 
> agree that the inline warnings are likely to be more effective than 
> something at the end, at least for those working on a new implementation.

It looks very sensible now, I think.  Thanks.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Add exclusive backup deprecation notes to documentation

From
David Steele
Date:
On 3/29/19 11:58 AM, Peter Eisentraut wrote:
> On 2019-03-20 14:42, Magnus Hagander wrote:
>> But that would be factually incorrect and backwards, so it seems like a
>> terrible idea, at least when it comes to manual. If you are doing it
>> manually, it's a lot *easier* to do it right with the non-exclusive
>> mode, because you can easily keep one psql and one shell open. And
>> that's safe.
> 
> The scenario I have in mind is, a poorly maintained server, nothing
> installed, can't install anything (no internet connection, license
> expired), flaky network, you fear it's going to fail soon, you need to
> take a backup.  The simplest procedure would appear to be: start backup
> mode, copy files away, stop backup mode.  Anything else that involves
> holding a session open over there for the whole time is way more fragile
> unless proper preparations have been made (and even then).  So I don't
> know what you want to call that scenario, but I would feel more
> comfortable having these basic tools available in a bind.

I would argue the best thing in this scenario is to use pg_basebackup. 
It's a solid tool and likely far better than any script the user might 
cook up on the spot.

Regards,
-- 
-David
david@pgmasters.net



Re: Add exclusive backup deprecation notes to documentation

From
David Steele
Date:
On 3/29/19 12:25 PM, Magnus Hagander wrote:
> On Fri, Mar 29, 2019 at 1:19 PM Robert Haas <robertmhaas@gmail.com 
> <mailto:robertmhaas@gmail.com>> wrote:
> 
>     On Fri, Mar 29, 2019 at 6:27 AM David Steele <david@pgmasters.net
>     <mailto:david@pgmasters.net>> wrote:
>      > I used your suggestions with minor editing.  After some reflection, I
>      > agree that the inline warnings are likely to be more effective than
>      > something at the end, at least for those working on a new
>     implementation.
> 
>     I'm glad we could agree on something.  Committed.

Me, too.  Thanks!

> Minor nitpick:
> +     backup can only be taken on a primary and does not allow concurrent
> +     backups.  Moreover, because it writes a backup_label file on the
> +     master, it can cause the master to fail to restart automatically after
> 
> Let's be consistent in if we call it a primary or a master, at least 
> within the same paragraph :)

Agreed, let's stick with "primary".

Are we planning to back-patch this?  The deprecation was added to the
docs in 9.6 -- I think these clarifications would be helpful.

Thanks,
-- 
-David
david@pgmasters.net



Re: Add exclusive backup deprecation notes to documentation

From
Robert Haas
Date:
On Fri, Mar 29, 2019 at 8:45 AM David Steele <david@pgmasters.net> wrote:
> Are we planning to back-patch this?  The deprecation was added to the
> docs in 9.6 -- I think these clarifications would be helpful.

I wasn't planning too, but I guess we could consider it.  I'd be more
inclined to back-patch the documentation changes than the message text
changes, but what do other people think?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Add exclusive backup deprecation notes to documentation

From
David Steele
Date:
On 3/29/19 12:46 PM, Robert Haas wrote:
> On Fri, Mar 29, 2019 at 8:45 AM David Steele <david@pgmasters.net> wrote:
>> Are we planning to back-patch this?  The deprecation was added to the
>> docs in 9.6 -- I think these clarifications would be helpful.
> 
> I wasn't planning too, but I guess we could consider it.  I'd be more
> inclined to back-patch the documentation changes than the message text
> changes, but what do other people think?

I think we should definitely do the docs, I'm 50% on the message.  You 
could argue that is is a behavioral change, but it is pretty important info.

Regards,
-- 
-David
david@pgmasters.net



Re: Add exclusive backup deprecation notes to documentation

From
Magnus Hagander
Date:
On Fri, Mar 29, 2019 at 1:49 PM David Steele <david@pgmasters.net> wrote:
On 3/29/19 12:46 PM, Robert Haas wrote:
> On Fri, Mar 29, 2019 at 8:45 AM David Steele <david@pgmasters.net> wrote:
>> Are we planning to back-patch this?  The deprecation was added to the
>> docs in 9.6 -- I think these clarifications would be helpful.
>
> I wasn't planning too, but I guess we could consider it.  I'd be more
> inclined to back-patch the documentation changes than the message text
> changes, but what do other people think?

I think we should definitely do the docs, I'm 50% on the message.  You
could argue that is is a behavioral change, but it is pretty important info.

+1 on the docs.

Is the changes to the messages going to cause issues or weirdness for translators? That would be a reason not to backpatch it. Without that, I'm leaning towards backpatching it. 

--

Re: Add exclusive backup deprecation notes to documentation

From
Peter Eisentraut
Date:
On 2019-03-29 13:54, Magnus Hagander wrote:
> Is the changes to the messages going to cause issues or weirdness for
> translators? That would be a reason not to backpatch it. Without that,
> I'm leaning towards backpatching it. 

Note that the messages refer to recovery.signal, so a backpatch would
have to rephrase the message.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services