Re: Missing important information in backup.sgml - Mailing list pgsql-docs

From Stephen Frost
Subject Re: Missing important information in backup.sgml
Msg-id 20161123204139.GZ13284@tamriel.snowman.net
In response to Re: Missing important information in backup.sgml  ("Gunnar \"Nick\" Bluth" <gunnar.bluth@pro-open.de>)
Responses Re: Missing important information in backup.sgml  ("Gunnar \"Nick\" Bluth" <gunnar.bluth@pro-open.de>)
List pgsql-docs
Greetings,

* Gunnar "Nick" Bluth (gunnar.bluth@pro-open.de) wrote:
> Now, what could happen is

All you need is for the archive server to be reset at the wrong time
(and there are a lot of potential "wrong times" to choose from) and
you'll end up with an incomplete archive.

> a) complete DC power outage
> b) outage of DB server
> c) outage of archive server (or the network connection to it)
> d) outage of storage system
> e) complete DC outage caused by your DB server vanishing (burning down,
> exploding, melting, ...),
> f) a complete _loss_ of the DC (atomar strike, plane crash, ...)
>
> In case a), your DB server would have fsync'd all committed transactions
> => no _data_ loss, but your _archive_ is potentially incomplete.

To be clear, the concern that I was pointing out is primarily that the
archive could end up incomplete, potentially rendering a significant
portion of it unusable (as in, everything from the event until the
next backup).

> In case b), the same applies, but your archive should be intact.

That isn't entirely accurate, as you'll lose whatever happened since the
last WAL segment was shipped to the archive server, but that's true in
general unless you're using pg_receivexlog with sync mode.  Anything
based on archive_command will have this issue.

On the other hand, you mentioned DB and archive on the same system, in
which case an inopportune reset of that server could result in an
incomplete archive, though you shouldn't lose any data in the database
assuming you can get the disks back.

> In case c), the archiver would retry until your archiving server comes
> back online => no _data_ loss, no _archive_ loss.

That depends entirely upon the circumstances of the archive server
outage: if the archive server is reset at the wrong time, you will
almost certainly lose some amount of your archive.  If you just lose
network connectivity then you should be OK, provided the command you're
using for archive_command returns the correct error code in that case.
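
As a rough illustration of both points (making the archived file
durable and returning the correct exit code), here is a minimal Python
sketch of a script that could be called from archive_command.  The
function names, the argument order, and the ".tmp" suffix are
assumptions made for this example, not anything PostgreSQL provides,
and it assumes a POSIX filesystem where a directory can be fsync'd.

```python
import os
import shutil
import sys


def archive_segment(src_path, archive_dir, name):
    """Copy one WAL segment into archive_dir durably; raise OSError on failure."""
    final_path = os.path.join(archive_dir, name)
    if os.path.exists(final_path):
        # Never overwrite a segment that is already in the archive.
        raise FileExistsError(final_path)

    # Write under a temporary name (hypothetical ".tmp" suffix) so that a
    # half-written file is never mistaken for a complete segment.
    tmp_path = final_path + ".tmp"
    with open(src_path, "rb") as src, open(tmp_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
        dst.flush()
        os.fsync(dst.fileno())  # force the file contents to stable storage

    os.rename(tmp_path, final_path)  # atomic within a single filesystem

    # fsync the directory so the rename itself survives a crash (POSIX only).
    dir_fd = os.open(archive_dir, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


def main(argv):
    """Return 0 only if the segment is safely archived, non-zero otherwise."""
    if len(argv) != 4:
        print("usage: archive_wal.py SRC ARCHIVE_DIR NAME", file=sys.stderr)
        return 2
    try:
        archive_segment(argv[1], argv[2], argv[3])
    except OSError as exc:
        print("archive failed: %s" % exc, file=sys.stderr)
        return 1
    return 0
```

With a script entry point that calls sys.exit(main(sys.argv)), this
might be wired up as
archive_command = 'python3 /path/to/archive_wal.py %p /mnt/archive %f'
(both paths are hypothetical).  PostgreSQL treats a zero exit status as
success and will retry the same segment on anything else, which is why
every failure path here returns non-zero.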

> So, losing actual _data_ is unlikely (at least from the archiving point
> of view...), but not explicitly fsync'ing the archive _may_ lead to
> incomplete archives. Which is exactly what I tried to point out by
> "[...], rendering your archive incomplete in case of a power outage".

One of the very important things that should be done as part of a backup
is to ensure that all of the archive files required to restore the
database to a consistent state are safely stored in the archive.  If
that isn't done then it's possible that an incomplete archive may also
render backups invalid.
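
A check along those lines could look something like the following
Python sketch, which lists the WAL segments that are missing from the
archive between the first and last segment a backup needs.  It assumes
the default 16 MB segment size (256 segments per log file, as in 9.3
and later), a single timeline, uncompressed segment files stored under
their original names, and that you already know the first and last
segment names (for example from the backup history file); the function
names are made up for the example.

```python
import os

# Assumption: default 16 MB WAL segments, so 0x100 segments per "log" id.
SEGMENTS_PER_LOG = 0x100


def parse_segment(name):
    """Split a 24-hex-digit WAL file name into (timeline, segment number)."""
    if len(name) != 24:
        raise ValueError("not a WAL segment name: %r" % name)
    timeline = int(name[0:8], 16)
    log = int(name[8:16], 16)
    seg = int(name[16:24], 16)
    return timeline, log * SEGMENTS_PER_LOG + seg


def format_segment(timeline, segno):
    """Inverse of parse_segment: build the 24-hex-digit file name."""
    return "%08X%08X%08X" % (
        timeline,
        segno // SEGMENTS_PER_LOG,
        segno % SEGMENTS_PER_LOG,
    )


def missing_segments(archive_dir, first, last):
    """Return the segment names in [first, last] that are not in archive_dir."""
    tl_first, start = parse_segment(first)
    tl_last, stop = parse_segment(last)
    if tl_first != tl_last:
        raise ValueError("this sketch only handles a single timeline")
    present = set(os.listdir(archive_dir))
    expected = (format_segment(tl_first, n) for n in range(start, stop + 1))
    return [name for name in expected if name not in present]
```

An empty list from missing_segments() only says the files exist; it
says nothing about whether they were fsync'd or are intact, so it
complements a durable archive_command rather than replacing one.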

> Am I missing something?

For my 2c, at least, the archive should be viewed with nearly the same
care and consideration as the primary data.  As with your database, you
really want your backups to work when you need them.

Thanks!

Stephen
