Thread: Missing important information in backup.sgml
Hi,

I ran into this issue (see patch) a few times over the past years, and tend to forget it again (sigh!). Today I had to clean up a few hundred GB of unarchived WALs, so I decided to write a patch for the documentation this time.

diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml
index 6eaed1e..460a4f2 100644
--- a/doc/src/sgml/backup.sgml
+++ b/doc/src/sgml/backup.sgml
@@ -636,7 +636,13 @@ test ! -f /mnt/server/archivedir/00000001000000A900000065 && cp pg_wal/0
     <productname>PostgreSQL</> will assume that the file has been
     successfully archived, and will remove or recycle it. However, a nonzero
     status tells <productname>PostgreSQL</> that the file was not archived;
-    it will try again periodically until it succeeds.
+    it will try again periodically until it succeeds. Note that an exit
+    status of 128 or higher will cause the archiver to exit and <emphasis>not
+    </> try to archive the affected WAL segment anymore. E.g., <command>rsync</>
+    tends to return an exit status of 255 when it is unable to resolve a
+    hostname, which - when returned unfiltered - will result in permanent
+    failure to archive the segment, even if the resolving problem is only
+    temporarily.
     </para>
     <para>

Best regards,
--
Gunnar "Nick" Bluth
DBA ELSTER

Tel: +49 911/991-4665
Mobil: +49 172/8853339
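A minimal sketch of the kind of wrapper the patch alludes to, assuming rsync over ssh as the transport; the host name and directory are placeholders, and the bare rsync call is only there to keep the example short (see the replies below for why a bare rsync is not a sufficient archive_command on its own). The point it illustrates is remapping exit statuses of 128 and above to an ordinary failure, so the archiver simply retries later:

#!/bin/sh
# Hypothetical archive_command wrapper (illustration only): remap exit
# statuses of 128 and above, e.g. rsync/ssh returning 255 when the
# archive host cannot be resolved, to a plain non-zero status so the
# archiver logs a failure and retries instead of aborting.
# Would be invoked as: archive_command = '/path/to/archive_wal.sh %p %f'
rsync -a "$1" "archive.example.com:/archivedir/$2"   # placeholder destination
rc=$?
if [ "$rc" -ge 128 ]; then
    exit 1    # "not archived yet, please retry"
fi
exit "$rc"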
Am 16.11.2016 um 11:37 schrieb Gunnar "Nick" Bluth: > Hi, > > I ran into this issue (see patch) a few times over the past years, and > tend to forget it again (sigh!). Today I had to clean up a few hundred > GB of unarchived WALs, so I decided to write a patch for the > documentation this time. Uhm, well, the actual problem was a stale replication slot... and tomatoes on my eyes, it seems ;-/. Ashes etc.! However, I still think a warning on (esp. rsync's) RCs >= 128 is worth considering (see -v2 attached). Cheers, -- Gunnar "Nick" Bluth DBA ELSTER Tel: +49 911/991-4665 Mobil: +49 172/8853339
Gunnar, all, * Gunnar "Nick" Bluth (gunnar.bluth.extern@elster.de) wrote: > Am 16.11.2016 um 11:37 schrieb Gunnar "Nick" Bluth: > > I ran into this issue (see patch) a few times over the past years, and > > tend to forget it again (sigh!). Today I had to clean up a few hundred > > GB of unarchived WALs, so I decided to write a patch for the > > documentation this time. > > Uhm, well, the actual problem was a stale replication slot... and > tomatoes on my eyes, it seems ;-/. Ashes etc.! > > However, I still think a warning on (esp. rsync's) RCs >= 128 is worth > considering (see -v2 attached). Frankly, I wouldn't suggest including such wording as it would imply that using a bare rsync command is an acceptable configuration of archive_command. It isn't. At the very least, a bare rsync does nothing to ensure that the WAL has been fsync'd to permanent storage before returning, leading to potential data loss due to the WAL segment being removed by PG before the new segment has been permanently stored. The PG documentation around archive command is, at best, a starting point for individuals who wish to implement their own proper backup solution, not as examples of good practice for production environments. Thanks! Stephen
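As a rough illustration of the point about fsync'ing before returning, here is a minimal, hypothetical sketch of a local-directory archive_command that does not report success until the segment is on persistent storage. It assumes a GNU coreutils sync(1) that accepts file arguments; the archive directory is a placeholder, and a real backup tool would add locking, logging and verification on top of this:

#!/bin/sh
# Hypothetical sketch only, not a complete backup tool.  Refuses to
# overwrite an existing file, copies the segment, forces it to persistent
# storage, and only then renames it into place, so a zero exit status is
# not returned before the data is durable.
set -e
src="$1"                            # %p: path of the WAL segment to archive
dst="/mnt/server/archivedir/$2"     # %f: file name of the WAL segment

test ! -f "$dst"                    # fail (and retry later) rather than overwrite
cp "$src" "$dst.partial"
sync "$dst.partial"                 # flush the copied data to disk
mv "$dst.partial" "$dst"
sync -f "$dst"                      # flush the filesystem so the rename is durable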
Am 16.11.2016 um 15:36 schrieb Stephen Frost: > Gunnar, all, > > * Gunnar "Nick" Bluth (gunnar.bluth.extern@elster.de) wrote: >> Am 16.11.2016 um 11:37 schrieb Gunnar "Nick" Bluth: >>> I ran into this issue (see patch) a few times over the past years, and >>> tend to forget it again (sigh!). Today I had to clean up a few hundred >>> GB of unarchived WALs, so I decided to write a patch for the >>> documentation this time. >> >> Uhm, well, the actual problem was a stale replication slot... and >> tomatoes on my eyes, it seems ;-/. Ashes etc.! >> >> However, I still think a warning on (esp. rsync's) RCs >= 128 is worth >> considering (see -v2 attached). > > Frankly, I wouldn't suggest including such wording as it would imply > that using a bare rsync command is an acceptable configuration of > archive_command. It isn't. At the very least, a bare rsync does > nothing to ensure that the WAL has been fsync'd to permanent storage > before returning, leading to potential data loss due to the WAL > segment being removed by PG before the new segment has been permanently > stored. I for myself deem a UPS-backed server in a different DC a pretty good starting point, and I reckon many others do as well... obviously it's not a belt and bracers solution, but my guess would be that > 90% of users have something similar in place, many of them actually using rsync (or scp) one way or the other (or, think WAL-E et. al., how do you force an fsync on AWS?!?). In environments where there's a risk of the WAL segment being overwritten before that target server has fsync'd, heck, yeah, you're right. But then you'd probably have something quite sophisticated in place, and hate to see allegedly random _FATAL_ errors that are _not documented outside the source code_ even more. Esp. when you can't tell for sure (from the docs) if archiving your WAL segment will be retried or not. > The PG documentation around archive command is, at best, a starting > point for individuals who wish to implement their own proper backup > solution, not as examples of good practice for production environments. True. Which doesn't mean there's no room for more hints, like "ok, we throw a FATAL error sometimes, but they're not really a problem, you know, it's just external software that basically everyone uses at one point or the other doing odd things sometimes" ;-). Alas, I've been hunting a red herring today, cause when you find your pg_xlog cluttered with old files _and_ see FATAL archiving messages in your logs, your first thought is not "there's prolly a replication slot left over", but "uh oh, those archive_command calls failed, so something might be somehow stuck now". I'll try to come up with something more comprehensive, taking your comments into account... > Thanks! > > Stephen Thank you for considering this! ;-) Cheers, -- Gunnar "Nick" Bluth RHCE/SCLA Mobil +49 172 8853339 Email: gunnar.bluth@pro-open.de _____________________________________________________________ In 1984 mainstream users were choosing VMS over UNIX. Ten years later they are choosing Windows over UNIX. What part of that message aren't you getting? - Tom Payne
Am 16.11.2016 um 22:07 schrieb Gunnar "Nick" Bluth: > Am 16.11.2016 um 15:36 schrieb Stephen Frost: >> Gunnar, all, >> >> * Gunnar "Nick" Bluth (gunnar.bluth.extern@elster.de) wrote: >>> Am 16.11.2016 um 11:37 schrieb Gunnar "Nick" Bluth: >>>> I ran into this issue (see patch) a few times over the past years, and >>>> tend to forget it again (sigh!). Today I had to clean up a few hundred >>>> GB of unarchived WALs, so I decided to write a patch for the >>>> documentation this time. >>> >>> Uhm, well, the actual problem was a stale replication slot... and >>> tomatoes on my eyes, it seems ;-/. Ashes etc.! >>> >>> However, I still think a warning on (esp. rsync's) RCs >= 128 is worth >>> considering (see -v2 attached). >> >> Frankly, I wouldn't suggest including such wording as it would imply >> that using a bare rsync command is an acceptable configuration of >> archive_command. It isn't. At the very least, a bare rsync does >> nothing to ensure that the WAL has been fsync'd to permanent storage >> before returning, leading to potential data loss due to the WAL >> segment being removed by PG before the new segment has been permanently >> stored. > > I for myself deem a UPS-backed server in a different DC a pretty good > starting point, and I reckon many others do as well... obviously it's > not a belt and bracers solution, but my guess would be that > 90% of > users have something similar in place, many of them actually using rsync > (or scp) one way or the other (or, think WAL-E et. al., how do you force > an fsync on AWS?!?). > In environments where there's a risk of the WAL segment being > overwritten before that target server has fsync'd, heck, yeah, you're > right. But then you'd probably have something quite sophisticated in > place, and hate to see allegedly random _FATAL_ errors that are _not > documented outside the source code_ even more. Esp. when you can't tell > for sure (from the docs) if archiving your WAL segment will be retried > or not. > >> The PG documentation around archive command is, at best, a starting >> point for individuals who wish to implement their own proper backup >> solution, not as examples of good practice for production environments. > > True. Which doesn't mean there's no room for more hints, like "ok, we > throw a FATAL error sometimes, but they're not really a problem, you > know, it's just external software that basically everyone uses at one > point or the other doing odd things sometimes" ;-). > > Alas, I've been hunting a red herring today, cause when you find your > pg_xlog cluttered with old files _and_ see FATAL archiving messages in > your logs, your first thought is not "there's prolly a replication slot > left over", but "uh oh, those archive_command calls failed, so something > might be somehow stuck now". > > I'll try to come up with something more comprehensive, taking your > comments into account... So, attached is what I came up with. It's obviously not "complete", however it points out the RC >= 128 "quirk" and also mentions Stephen's remarks on rsync (although to get actual _data loss_, you'd have to have a power outage in the DC caused by your PG server exploding... ;-). Cheers, -- Gunnar "Nick" Bluth RHCE/SCLA Mobil +49 172 8853339 Email: gunnar.bluth@pro-open.de _____________________________________________________________ In 1984 mainstream users were choosing VMS over UNIX. Ten years later they are choosing Windows over UNIX. What part of that message aren't you getting? - Tom Payne
On Wed, Nov 23, 2016 at 12:24 PM, Gunnar "Nick" Bluth <gunnar.bluth@pro-open.de> wrote: > mentions Stephen's > remarks on rsync (although to get actual _data loss_, you'd have to have > a power outage in the DC caused by your PG server exploding... ;-). I have seen power loss between the UPS and a server; including a tech tripping on the power cord. I have also seen servers abruptly shut down due to high temperatures in spite of having a UPS. I have also seen an OS bug lock up a system such that it was impossible to get a clean shutdown before having to cycle power to recover. No explosion needed. If you value the data in your database you should assume that the OS could go down at any instant without proper shutdown, and that your storage system(s) could be lost without warning at any time. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Am 23.11.2016 um 20:21 schrieb Kevin Grittner: > On Wed, Nov 23, 2016 at 12:24 PM, Gunnar "Nick" Bluth > <gunnar.bluth@pro-open.de> wrote: > >> mentions Stephen's >> remarks on rsync (although to get actual _data loss_, you'd have to have >> a power outage in the DC caused by your PG server exploding... ;-). > > I have seen power loss between the UPS and a server; including a > tech tripping on the power cord. I have also seen servers abruptly > shut down due to high temperatures in spite of having a UPS. I > have also seen an OS bug lock up a system such that it was > impossible to get a clean shutdown before having to cycle power to > recover. > > No explosion needed. > > If you value the data in your database you should assume that the > OS could go down at any instant without proper shutdown, and that > your storage system(s) could be lost without warning at any time. Kevin, all, I've been in this business for 15 years, and had my share of outages. The worst case being an AC service guy pushing the big red button next to the DC entrance, assuming it was the light switch... It's not like I've not gone through the possible scenarios in my head before writing such a broad statement. Let me explain. Assertions (that I take as givens for anyone valueing his data...): - you have decent HW (BBU controller, HDD cache off, ECC RAM, redundant PSUs, ...) - you have a decent DC (UPS, AC, ...) - you use a single DB server and/or no (synchronous) replication in place - your archive server is in the same DC (potentially the same machine as the DB server) - (in case of SAN) your storage correctly reports when it has written to disk/BBU cache - your OS (and/or archive_script) does not report RC=0 before all data has been _transmitted_ (think MongoDB... ;-) - (for the sake of completeness) fsync=on for PG Now, what could happen is a) complete DC power outage b) outage of DB server c) outage of archive server (or the network connection to it) d) outage of storage system e) complete DC outage caused by your DB server vanishing (burning down, exploding, melting, ...), f) a complete _loss_ of the DC (atomar strike, plane crash, ...) In case a), your DB server would have fsync'd all committed transactions => no _data_ loss, but your _archive_ is potentially incomplete. In case b), the same applies, but your archive should be intact. In case c), the archiver would retry until your archiving server comes back online => no _data_ loss, no _archive_ loss. In case d), see a), if you're lucky b) In case e), you'd have lost your DB _and_ your archive may be incomplete. In case f), your f)....d anyway (oh, the coincidence! ;-). Protecting yourself from case f) will involve a 2nd (3rd, ...) DC (or some cloud thingie) anyway. In my experience, users that do have more than one DC also have a policy in place saying that backups (which archive logs would probably be counted as) have to be placed in a different DC. So, losing actual _data_ is unlikely (at least from the archiving point of view...), but not explicitly fsync'ing the archive _may_ lead to incomplete archives. Which is exactly what I tried to point out by "[...], rendering your archive incomplete in case of a power outage". Am I missing something? P.S.: just to point that out... my patch does _not_ mention exploding servers ;-) Cheers, -- Gunnar "Nick" Bluth RHCE/SCLA Mobil +49 172 8853339 Email: gunnar.bluth@pro-open.de _____________________________________________________________ In 1984 mainstream users were choosing VMS over UNIX. 
Ten years later they are choosing Windows over UNIX. What part of that message aren't you getting? - Tom Payne
Am 23.11.2016 um 21:12 schrieb Gunnar "Nick" Bluth: > Assertions (that I take as givens for anyone valueing his data...): > - you have decent HW (BBU controller, HDD cache off, ECC RAM, redundant > PSUs, ...) > - you have a decent DC (UPS, AC, ...) Erm... on rereading my own shit^H^H^H^Hmail... these two I obviously do not rate as "given for anyone valueing his data": > - you use a single DB server and/or no (synchronous) replication in place > - your archive server is in the same DC (potentially the same machine as > the DB server) > - (in case of SAN) your storage correctly reports when it has written to > disk/BBU cache > - your OS (and/or archive_script) does not report RC=0 before all data > has been _transmitted_ (think MongoDB... ;-) > - (for the sake of completeness) fsync=on for PG Another "funny" occasion btw. was a guy accidentaly pressing stop-A in a SUN F6800 console, and being overly nervous entering "boot"... the POST alone took 40 minutes, and then it started polling the dedicated Clariion II ;-) Cheers, -- Gunnar "Nick" Bluth RHCE/SCLA Mobil +49 172 8853339 Email: gunnar.bluth@pro-open.de _____________________________________________________________ In 1984 mainstream users were choosing VMS over UNIX. Ten years later they are choosing Windows over UNIX. What part of that message aren't you getting? - Tom Payne
Greetings, * Gunnar "Nick" Bluth (gunnar.bluth@pro-open.de) wrote: > Now, what could happen is All you need is for the archive server to be reset at the wrong time (and there are a lot of potential "wrong times" to choose from) and you'll end up with an incomplete archive. > a) complete DC power outage > b) outage of DB server > c) outage of archive server (or the network connection to it) > d) outage of storage system > e) complete DC outage caused by your DB server vanishing (burning down, > exploding, melting, ...), > f) a complete _loss_ of the DC (atomar strike, plane crash, ...) > > In case a), your DB server would have fsync'd all committed transactions > => no _data_ loss, but your _archive_ is potentially incomplete. To be clear, the concern that I was pointing out is primairly that the archive could end up incomplete and potentially render significant portions of your archive as unusable (as in, everything since the event til the next backup). > In case b), the same applies, but your archive should be intact. That isn't entirely accurate as you'll lose whatever happened since the last WAL segment was shipped to the archive server, but that's true in general unless you're using pg_receivexlog with sync mode. Of course, anything archive_command-based will have this issue. On the other hand, you mentioned DB and archive on the same system, in which case an inopportune reset of that server could result in an incomplete archive, though you shouldn't lose any data in the database assuming you can get the disks back. > In case c), the archiver would retry until your archiving server comes > back online => no _data_ loss, no _archive_ loss. That depends entirely upon the circumstances of the archive server outage- if the archive server is reset at the wrong time, you will almost certainly lose some about of your archive. If you just lose network connectivity then you should be ok- if the command you're using for archive_command returns the correct error code in that case. > So, losing actual _data_ is unlikely (at least from the archiving point > of view...), but not explicitly fsync'ing the archive _may_ lead to > incomplete archives. Which is exactly what I tried to point out by > "[...], rendering your archive incomplete in case of a power outage". One of the very important things that should be done as part of a backup is to ensure that all of the archive files required to restore the database to a consistent state are safely stored in the archive. If that isn't done then it's possible that an incomplete archive may also render backups invalid. > Am I missing something? For my 2c, at least, the archive should be viewed with nearly the same care and consideration as the primary data. As with your database, you really want your backups to work when you need them. Thanks! Stephen
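For reference, a quick way to see what the archiver has and has not managed to ship, assuming PostgreSQL 9.4 or later where the pg_stat_archiver view is available; connection settings are taken from the environment:

#!/bin/sh
# Monitoring sketch: pg_stat_archiver reports how far archiving has
# progressed and when it last failed.
psql -X -c "SELECT archived_count, last_archived_wal, last_archived_time,
                   failed_count, last_failed_wal, last_failed_time
              FROM pg_stat_archiver;"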
On Wed, Nov 23, 2016 at 2:12 PM, Gunnar "Nick" Bluth <gunnar.bluth@pro-open.de> wrote: > Am 23.11.2016 um 20:21 schrieb Kevin Grittner: >> On Wed, Nov 23, 2016 at 12:24 PM, Gunnar "Nick" Bluth <gunnar.bluth@pro-open.de> wrote: >>> to get actual _data loss_, you'd have to have a power outage in >>> the DC caused by your PG server exploding... >> If you value the data in your database you should assume that the >> OS could go down at any instant without proper shutdown, and that >> your storage system(s) could be lost without warning at any time. > I've been in this business for 15 years, and had my share of outages. > [enumeration of various failure modes and their respective impacts] I have seen multiple posts on these lists where people say that they don't need to worry about data written to the OS failing to make it to storage because they a really good UPS. It sounds like you understand that, as important as it is to have a UPS, it does not preclude abrupt equipment failure; but I fear that the statement you made might reinforce that notion in some quarters. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Am 23.11.2016 um 21:41 schrieb Stephen Frost: > Greetings, Greetings, and excuse my persistence on this ;-), > * Gunnar "Nick" Bluth (gunnar.bluth@pro-open.de) wrote: >> Now, what could happen is -- discussion of different disaster scenarios, boiling down to a differing definition of the term "data loss"... I myself took a slightly more generous approach, based on the idea that after an outage of the archiving destination, a DBA would probably initiate a full backup straight away (and/or try a restore). -- However, to get this on track again...: > One of the very important things that should be done as part of a backup > is to ensure that all of the archive files required to restore the > database to a consistent state are safely stored in the archive. If > that isn't done then it's possible that an incomplete archive may also > render backups invalid. Well, the need to have a complete archive is described in the docs already. Maybe the potential consequences of an incomplete archive should be pointed or more drastically...? >> Am I missing something? > > For my 2c, at least, the archive should be viewed with nearly the same > care and consideration as the primary data. As with your database, you > really want your backups to work when you need them. We're certainly on the same page! Now, the main purpose of my patch was to document a behaviour that many of us have run into, namely that FATAL error showing up in the log when the archive_command exits with RC > 127. It's a nuisance only, but it does send people on false tracks and should at least be mentioned in the documentation. And since a couple of people does use rsync (or some wrappers around it) for archiving, and that is notoriously giving RCs > 127, it seems legit to at least mention it, no? What think you? -- Gunnar "Nick" Bluth RHCE/SCLA Mobil +49 172 8853339 Email: gunnar.bluth@pro-open.de _____________________________________________________________ In 1984 mainstream users were choosing VMS over UNIX. Ten years later they are choosing Windows over UNIX. What part of that message aren't you getting? - Tom Payne
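To observe the behaviour being discussed without waiting for it to happen in production, something along these lines on a disposable test cluster should reproduce the FATAL; it assumes archive_mode is already on, ALTER SYSTEM is available (9.4+), and the log path is a placeholder for whatever your logging configuration uses:

#!/bin/sh
# Hypothetical demonstration, for a throwaway test cluster only.
# Force archive_command to exit with a status >= 128 and watch the
# archiver abort with a FATAL message instead of simply retrying.
psql -X -c "ALTER SYSTEM SET archive_command = 'exit 255';"
psql -X -c "SELECT pg_reload_conf();"
psql -X -c "SELECT pg_switch_xlog();"     # pg_switch_wal() from version 10 on
sleep 5
tail -n 50 "$PGDATA/log/postgresql.log"   # placeholder; check your server log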
Greetings, * Gunnar "Nick" Bluth (gunnar.bluth@pro-open.de) wrote: > Am 23.11.2016 um 21:41 schrieb Stephen Frost: > > * Gunnar "Nick" Bluth (gunnar.bluth@pro-open.de) wrote: > > One of the very important things that should be done as part of a backup > > is to ensure that all of the archive files required to restore the > > database to a consistent state are safely stored in the archive. If > > that isn't done then it's possible that an incomplete archive may also > > render backups invalid. > > Well, the need to have a complete archive is described in the docs > already. Maybe the potential consequences of an incomplete archive > should be pointed or more drastically...? We should probably add to "step 5" something about "verify that all WAL files have been archived between the start and stop backup" or similar. > Now, the main purpose of my patch was to document a behaviour that many > of us have run into, namely that FATAL error showing up in the log when > the archive_command exits with RC > 127. It's a nuisance only, but it > does send people on false tracks and should at least be mentioned in the > documentation. I agree that we should add information to the documentation that certain error codes are handled differently. > And since a couple of people does use rsync (or some wrappers around it) > for archiving, and that is notoriously giving RCs > 127, it seems legit > to at least mention it, no? The low-level API documentation should be focused on the API and not how to (mis)use the API using common unix commands. > diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml > index 6eaed1e..a8f574e 100644 > --- a/doc/src/sgml/backup.sgml > +++ b/doc/src/sgml/backup.sgml > @@ -587,7 +587,8 @@ tar -cf backup.tar /usr/local/pgsql/data > the administrator specify a shell command to be executed to copy a > completed segment file to wherever it needs to go. The command could be > as simple as a <literal>cp</>, or it could invoke a complex shell > - script — it's all up to you. > + script — it's all up to you. There however are some things to consider > + when creating such a command, most of which are covered below. > </para> If we're changing this then we should remove the notion that it could be "as simple as a cp" because it really should *not* be. That is a dangerous and very poor recommendation to be making to our users. > <para> > @@ -636,7 +637,11 @@ test ! -f /mnt/server/archivedir/00000001000000A900000065 && cp pg_wal/0 > <productname>PostgreSQL</> will assume that the file has been > successfully archived, and will remove or recycle it. However, a nonzero > status tells <productname>PostgreSQL</> that the file was not archived; > - it will try again periodically until it succeeds. > + it will try again periodically until it succeeds. Note that an exit > + status of 128 or higher will cause the archiver to exit, resulting in a > + <literal>FATAL</> error in the server log. It will be restarted by the > + postmaster and continue where it left. E.g., <command>rsync</> is known > + for returning exit statuses of 255 on network issues. > </para> I agree with adding documentation about what happens with different exit status values. I don't believe we should make any mention of rsync. 
A sophisticated enough user to understand exit codes should be able to work out what the exit code is for whatever tool or tools they're using, it's not on us to document what the exit codes are for every possible tool (nor should one even use something as simple as a bare cp or rsync in archive_command). > <para> > @@ -696,6 +701,16 @@ test ! -f /mnt/server/archivedir/00000001000000A900000065 && cp pg_wal/0 > preserve the file name (<literal>%f</>). > </para> > > + <para> > + Depending on your specific requirements (and datacenter layout), you may > + want to make sure that the archived WAL segments have been written out to > + persistent storage before the <command>archive_command</> returns. > + Otherwise, a WAL segment that is assumed by <productname>PostgreSQL</> > + to be archived could be recycled or removed prematurely, rendering your > + archive incomplete (and thus disabling a recovery) in case of an outage of > + the archiving destination. > + </para> I dislike the tone of this. The wording I would suggest would be more along the lines of: The archive_command should only return success once the WAL segment has been completely copied, written out, and synced to persistent storage as PostgreSQL will recycle or remove the segment shortly after the archive_command returns. If the WAL segment is lost due to an outage of the archive server or other issue, any backup which was performed during the time that the WAL segment was written will be unusable and Point-In-Time-Recovery from an earlier backup will not be usable past the missing WAL segment. Thanks! Stephen
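A simple way to check the point about segments that have not yet reached the archive is to look at the .ready marker files the server keeps for them; a hedged sketch (the directory name assumes the version 10 layout, on 9.6 and earlier it is pg_xlog/archive_status):

#!/bin/sh
# Sanity check after a base backup: list WAL segments the server itself
# still considers un-archived (their .ready marker files).
ls "$PGDATA/pg_wal/archive_status"/*.ready 2>/dev/null \
    && echo "WARNING: the segments listed above have not been archived yet"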
Am 29.11.2016 um 14:43 schrieb Stephen Frost: > Greetings, > > * Gunnar "Nick" Bluth (gunnar.bluth@pro-open.de) wrote: >> Am 23.11.2016 um 21:41 schrieb Stephen Frost: >>> * Gunnar "Nick" Bluth (gunnar.bluth@pro-open.de) wrote: >>> One of the very important things that should be done as part of a backup >>> is to ensure that all of the archive files required to restore the >>> database to a consistent state are safely stored in the archive. If >>> that isn't done then it's possible that an incomplete archive may also >>> render backups invalid. >> >> Well, the need to have a complete archive is described in the docs >> already. Maybe the potential consequences of an incomplete archive >> should be pointed or more drastically...? > > We should probably add to "step 5" something about "verify that all WAL > files have been archived between the start and stop backup" or similar. +1 >> Now, the main purpose of my patch was to document a behaviour that many >> of us have run into, namely that FATAL error showing up in the log when >> the archive_command exits with RC > 127. It's a nuisance only, but it >> does send people on false tracks and should at least be mentioned in the >> documentation. > > I agree that we should add information to the documentation that certain > error codes are handled differently. > >> And since a couple of people does use rsync (or some wrappers around it) >> for archiving, and that is notoriously giving RCs > 127, it seems legit >> to at least mention it, no? > > The low-level API documentation should be focused on the API and not how > to (mis)use the API using common unix commands. > >> diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml >> index 6eaed1e..a8f574e 100644 >> --- a/doc/src/sgml/backup.sgml >> +++ b/doc/src/sgml/backup.sgml >> @@ -587,7 +587,8 @@ tar -cf backup.tar /usr/local/pgsql/data >> the administrator specify a shell command to be executed to copy a >> completed segment file to wherever it needs to go. The command could be >> as simple as a <literal>cp</>, or it could invoke a complex shell >> - script — it's all up to you. >> + script — it's all up to you. There however are some things to consider >> + when creating such a command, most of which are covered below. >> </para> > > If we're changing this then we should remove the notion that it could be > "as simple as a cp" because it really should *not* be. That is a > dangerous and very poor recommendation to be making to our users. That thought creeped on me as well... ;-) >> <para> >> @@ -636,7 +637,11 @@ test ! -f /mnt/server/archivedir/00000001000000A900000065 && cp pg_wal/0 >> <productname>PostgreSQL</> will assume that the file has been >> successfully archived, and will remove or recycle it. However, a nonzero >> status tells <productname>PostgreSQL</> that the file was not archived; >> - it will try again periodically until it succeeds. >> + it will try again periodically until it succeeds. Note that an exit >> + status of 128 or higher will cause the archiver to exit, resulting in a >> + <literal>FATAL</> error in the server log. It will be restarted by the >> + postmaster and continue where it left. E.g., <command>rsync</> is known >> + for returning exit statuses of 255 on network issues. >> </para> > > I agree with adding documentation about what happens with different exit > status values. I don't believe we should make any mention of rsync. 
> A sophisticated enough user to understand exit codes should be able to > work out what the exit code is for whatever tool or tools they're using, > it's not on us to document what the exit codes are for every possible > tool (nor should one even use something as simple as a bare cp or rsync > in archive_command). Fair enough. > >> <para> >> @@ -696,6 +701,16 @@ test ! -f /mnt/server/archivedir/00000001000000A900000065 && cp pg_wal/0 >> preserve the file name (<literal>%f</>). >> </para> >> >> + <para> >> + Depending on your specific requirements (and datacenter layout), you may >> + want to make sure that the archived WAL segments have been written out to >> + persistent storage before the <command>archive_command</> returns. >> + Otherwise, a WAL segment that is assumed by <productname>PostgreSQL</> >> + to be archived could be recycled or removed prematurely, rendering your >> + archive incomplete (and thus disabling a recovery) in case of an outage of >> + the archiving destination. >> + </para> > > I dislike the tone of this. The wording I would suggest would be more > along the lines of: > > The archive_command should only return success once the WAL segment has > been completely copied, written out, and synced to persistent storage as > PostgreSQL will recycle or remove the segment shortly after the > archive_command returns. If the WAL segment is lost due to an outage of > the archive server or other issue, any backup which was performed during > the time that the WAL segment was written will be unusable and > Point-In-Time-Recovery from an earlier backup will not be usable past > the missing WAL segment. Sounds good to me. I'll incorporate that and send -v6 soon! -- Gunnar "Nick" Bluth RHCE/SCLA Mobil +49 172 8853339 Email: gunnar.bluth@pro-open.de _____________________________________________________________ In 1984 mainstream users were choosing VMS over UNIX. Ten years later they are choosing Windows over UNIX. What part of that message aren't you getting? - Tom Payne