Re: Add a different archive_command example for Linux / Unix - Mailing list pgsql-docs
From | gparc@free.fr
---|---
Subject | Re: Add a different archive_command example for Linux / Unix
Msg-id | 705658729.361810855.1707401435362.JavaMail.zimbra@free.fr
In response to | Re: Add a different archive_command example for Linux / Unix (Stephen Frost <sfrost@snowman.net>)
Responses | Re: Add a different archive_command example for Linux / Unix
List | pgsql-docs
Thanks Stephen for your detailed reply and broad perspective. But I see the cp example command used **as is** most of the time.

Regards,
Gilles

----- Original Message -----
> From: "Stephen Frost" <sfrost@snowman.net>
> To: "gparc" <gparc@online.fr>, "pgsql-docs" <pgsql-docs@lists.postgresql.org>
> Sent: Thursday, February 8, 2024, 13:00:13
> Subject: Re: Add a different archive_command example for Linux / Unix
>
> Greetings,
>
> * PG Doc comments form (noreply@postgresql.org) wrote:
>> Hello,
>> in
>> https://www.postgresql.org/docs/16/continuous-archiving.html#BACKUP-ARCHIVING-WAL
>> the example given could be improved for Linux / Unix environments.
>>
>> As the cp command is buffered, it can return success even though the data
>> hasn't reached the disk, which puts backups at risk.
>
> Yup.
>
>> I propose using the dd command with its fsync option.
>>
>> The equivalent example would then be:
>>
>> archive_command = 'dd if=%p of=/mnt/server/archivedir/%f bs=1M conv=fsync,excl status=none'  # Unix
>>
>> What do you think?
>
> This doesn't fsync the directory though, for one thing, and there are
> other considerations beyond that when having archive_command run, and
> more generally when doing backups with PG. In short, the example in the
> documentation is not ever meant to be used as-is but is intended to show
> how the replacement is done when the command is called, so that backup
> tool authors know how it works.
>
> In reality though, to write backup software for PG, you really do need
> to know PG in much more detail than the documentation provides, which
> means reading the source. For example, backup software should be
> checking the pg_control file's CRC, as it's possible to read it just as
> it's being written and end up with an invalid pg_control file in the
> backup, making the backup invalid. There's been some discussion about
> how to improve this situation, but nothing exists today from PG, so
> backup authors have to handle it.
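The directory-fsync gap Stephen points out can be illustrated with a minimal sketch. This is a hypothetical helper, not a recommended archive_command; the function name, the paths, and the reliance on GNU dd (`conv=fsync,excl`) and GNU coreutils `sync` accepting a path argument are all assumptions.

```shell
# Hypothetical sketch only, NOT a production archive_command: shows that a
# durable WAL copy needs both the file's data AND its directory entry
# flushed. Assumes GNU dd and GNU coreutils sync (>= 8.24, path arguments).
archive_wal() {
    src="$1"        # would be %p: path to the WAL segment to archive
    dst_dir="$2"    # archive directory, e.g. /mnt/server/archivedir
    dst="$dst_dir/$(basename "$src")"

    [ -e "$dst" ] && return 1          # never overwrite an existing segment

    # conv=fsync flushes the file's data; conv=excl refuses to clobber
    dd if="$src" of="$dst" bs=1M conv=fsync,excl status=none || return 1

    # The new directory entry must also be durable: fsync the directory.
    # GNU sync accepts a path argument; fall back to a global sync otherwise.
    sync "$dst_dir" 2>/dev/null || sync
}
```

Even this sketch ignores the other concerns raised in the thread (pg_control validity, unlogged relations, manifests), which is exactly why the reply recommends purpose-built backup tools.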
> This is just one example; there are
> lots of others: unlogged table handling, temporary file handling, etc.
>
> I'd strongly recommend using one of the existing, well-maintained backup
> tools which have been written specifically for PG for your backups.
> Writing a new backup tool for PG is a good bit of work and isn't
> really reasonable to do with shell scripts or simple Unix commands.
>
> I do feel that we could improve the documentation around this by
> dropping comments like "using any convenient file-system-backup tool
> such as tar or cpio", as those don't, for example, support any way to
> reasonably deal with unlogged tables by themselves. Technically you
> could scan the data directory and provide an exclude file, or not
> include unlogged table files in the list of files to include, but then
> you're starting to get into things like how to tell whether a file is
> associated with an unlogged table or not, and while that's deep in the
> documentation, we don't make any mention or reference to unlogged tables
> in the backup documentation. Perhaps an addition to the low-level
> documentation under 'Backing Up The Data Directory' along these lines
> would be helpful:
>
> #########
> You should omit any unlogged relation files (other than the 'init' fork),
> as they will be reset to be empty upon recovery, and backing them up will
> simply increase your backup size (potentially significantly) and slow
> down the restore process. Unlogged tables have an init fork (link to
> storage-init.html), which is a file with the same filenode number as the
> relation but with a suffix of '_init' added (link to
> storage-file-layout.html). When an '_init' fork exists for a given
> relation, the '_init' file should be included in the backup, but all
> other files for that relation (the 'main' fork, which does not have a
> suffix, and all forks other than the 'init' fork)
> should be excluded from the backup.
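The proposed wording above implies a mechanical rule (keep the `_init` fork, skip its sibling forks) that can be sketched as a scan of the base directory. A minimal sketch under stated assumptions: GNU find, and the standard layout where the forks of filenode N are named N, N.1, N_fsm, N_vm, N_init (the helper name is hypothetical).

```shell
# Hypothetical sketch: list the files a low-level backup could EXCLUDE
# because they belong to unlogged relations. For each "<filenode>_init"
# fork found, every sibling fork of that filenode (the main fork, its
# segments, _fsm, _vm, ...) is skippable; the _init fork itself is kept.
# Assumes GNU find and the standard base/<dboid>/<filenode> layout.
list_unlogged_excludes() {
    pgdata="$1"
    find "$pgdata/base" -type f -name '*_init' | while read -r init; do
        dir=$(dirname "$init")
        node=$(basename "$init" _init)
        # all forks of this filenode except the _init fork itself
        find "$dir" -maxdepth 1 \( -name "$node" -o -name "$node.*" \
             -o -name "${node}_*" \) ! -name "${node}_init*"
    done
}
```

A real tool would need to pin the file list against a consistent snapshot of the directory; this only illustrates the naming rule from the proposed text.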
> #########
>
> There's also no way for tar or cpio to directly validate that the copy
> of pg_control that they made is valid. Worse, on a restore, they'll
> restore pg_control more-or-less whenever, and then if the restore doesn't
> complete for whatever reason, you might end up with a cluster that can
> be started and run for a while, but be missing whole tables. While it was
> only demonstrated relatively recently that the pg_control file can, in
> fact, be invalid when read during a backup, it's a real issue that's
> been around for, probably, forever, and there isn't really a good way to
> address it today. Still, perhaps we should include in the
> documentation, again under 'Backing Up The Data Directory', something
> like:
>
> #########
> You must be sure to check that the copy of pg_control which was
> made is valid. The pg_controldata tool can be used for this purpose:
> after making the copy of the 'data_dir/global/pg_control' file which
> will be included in the backup, restore it into a new directory
> 'test_data_dir/global' and then run 'pg_controldata' on 'test_data_dir'
> and ensure that no error is returned. If an error is returned, attempt
> to re-copy the 'data_dir/global/pg_control' file and test again (during
> the backup), or fail the backup entirely. A backup without a valid
> pg_control file cannot be restored.
>
> Further, on restore, it is strongly recommended to restore the
> pg_control file last, and only after successfully restoring the rest
> of the database, to prevent the database from being started either while
> the restore is happening or in the event that not all of the files which
> were part of the backup were able to be restored.
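The pg_controldata check described in the proposed text can be sketched as a small helper. Hypothetical and hedged: the function name is an assumption, and in the releases I've checked pg_controldata reports a CRC mismatch as a WARNING while still exiting 0, so the sketch inspects the output as well as the exit code; a truncated or unreadable file does produce a nonzero exit.

```shell
# Hypothetical sketch of the verification step from the proposed doc text:
# place the copied pg_control under <scratch>/global and run pg_controldata
# on the scratch directory. Returns 0 only if the copy appears valid.
verify_pg_control_copy() {
    copy="$1"                            # path to the copied pg_control file
    scratch=$(mktemp -d) || return 2
    mkdir -p "$scratch/global"
    cp "$copy" "$scratch/global/pg_control" || { rm -rf "$scratch"; return 2; }

    rc=0
    # nonzero exit (read error, tool missing) => invalid
    out=$(pg_controldata "$scratch" 2>&1) || rc=1
    # CRC mismatch is reported as a WARNING with exit 0, so check the output
    case "$out" in *WARNING*|*warning*) rc=1 ;; esac

    rm -rf "$scratch"
    return "$rc"
}
```

As the quoted text says, a failing check means re-copying pg_control and retrying, or failing the backup outright.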
> #########
>
> Of course, there's more, such as the lack of any discussion about making
> sure to have a separate manifest of all of the files that were copied as
> part of the backup, to allow you to make sure that, for example, the tar
> file that was created for the backup didn't just get truncated somewhere
> along the way. The above alludes to this idea, at least.
>
> I will note that while we don't talk about all of these things in the
> documentation, pg_basebackup and pg_receivewal do handle things like
> doing an fsync() after the backup has completed or when receiving WAL
> (including of directories), skipping unlogged relations and temporary
> files, and including a manifest with a per-file checksum which the
> pg_verifybackup tool can be used to check, etc. While pg_basebackup
> doesn't yet handle the issue of an invalid read of pg_control, there's
> ongoing discussion about how to address that, and patches have been
> proposed to deal with it. Hopefully something will be done there soon,
> as pg_basebackup, particularly when run against a replica, where the
> pg_control file is rewritten much more frequently, seems in certain
> cases to be at particular risk. This isn't to say it's very likely;
> it's not, just that it is a thing that could happen and could lead to an
> invalid backup.
>
> Really, rewriting the entire "how to perform a low-level backup" section
> and re-focusing that part of the documentation to be for backup tool
> authors, rather than having it seem like it's a Unix administrator's or
> general DBA's documentation for performing backups, would really go a
> long way towards improving the situation, for my ... slightly more than
> 2 cents (or your favorite currency), anyway.
>
> Thanks!
>
> Stephen
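The manifest idea in the reply can be sketched with plain checksum tools. A hypothetical sketch assuming GNU coreutils sha256sum; pg_basebackup's own manifest format, checked with pg_verifybackup, is the supported route mentioned above, and these helper names are invented for illustration.

```shell
# Hypothetical sketch of the "separate manifest" idea: record a per-file
# checksum for everything in the backup tree, so later truncation or
# corruption of the copied files can be detected. Assumes GNU sha256sum.
make_manifest() {
    backup_dir="$1"; manifest="$2"    # manifest should live OUTSIDE the tree
    ( cd "$backup_dir" && find . -type f -print0 | xargs -0 -r sha256sum ) \
        > "$manifest"
}

verify_manifest() {
    backup_dir="$1"; manifest="$2"    # pass the manifest as an absolute path
    # --check re-hashes every listed file; any mismatch or missing file
    # makes the verification exit nonzero
    ( cd "$backup_dir" && sha256sum --check --quiet "$manifest" )
}
```

Keeping the manifest separate from the backup archive is the point: it lets you notice that the archive itself was truncated or altered after the fact.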