Regarding the archive_command in "continuois archiving" chapter. - Mailing list pgsql-docs
| From | PG Doc comments form |
|---|---|
| Subject | Regarding the archive_command in "continuois archiving" chapter. |
| Date | |
| Msg-id | 177799183569.798.92791159894803781@wrigleys.postgresql.org Whole thread |
| Responses |
Re: Regarding the archive_command in "continuois archiving" chapter.
|
| List | pgsql-docs |
The following documentation comment has been logged on the website: Page: https://www.postgresql.org/docs/17/continuous-archiving.html Description: The example provided for archive_command for https://www.postgresql.org/docs/18/continuous-archiving.html#BACKUP-ARCHIVING-WAL should include commands to ensure that archiving has successfully written the disk, even during an unexpected shutdown. Currently, PostgreSQL does not force a write to the file system for archived WAL files as it does for data files. During periods of high write activity, PostgreSQL might lose files that were confirmed as written but remained in the operating system cache. This loss affects Point-in-Time Recovery (PITR) and the use of pg_rewind. While many servers utilize power protection such as UPS or battery-backed RAID controllers, the OS cache remains unprotected during a forced shutdown without a sync call. This issue does not affect crash recovery or streaming replication, but it impacts PITR and pg_rewind. I recommend improving the documentation examples or including a visible warning. The cp command in Linux should be treated as a stub; it should not be used without an explicit sync call or by ensuring the destination directory is mounted in an specific file system with the sync option in /etc/fstab. It could also include references to usage of other tools (rsync --fsync, remote archiving or barman) or the options needed to use it with a nfs mounted directory, but I might be too open. This what would look like (I've used ai to help me that this gets written in proper english) ----- 25.3.1. Setting Up WAL Archiving In an abstract sense, a running PostgreSQL system produces an indefinitely long sequence of WAL records. The system physically divides this sequence into WAL segment files, which are normally 16MB apiece (although the segment size can be altered during initdb). The segment files are given numeric names that reflect their position in the abstract WAL sequence. When not using WAL archiving, the system normally creates just a few segment files and then “recycles” them by renaming no-longer-needed segment files to higher segment numbers. It is assumed that segment files whose contents precede the last checkpoint are no longer of interest and can be recycled. When archiving WAL data, we need to capture the contents of each segment file once it is filled and save that data somewhere before the segment file is recycled for reuse. Depending on the application and the available hardware, there could be many different ways of “saving the data somewhere”: we could copy the segment files to an NFS-mounted directory on another machine, write them onto a tape drive, or batch them together and burn them onto CDs. To provide the database administrator with flexibility, PostgreSQL does not make assumptions about how the archiving will be done. Instead, PostgreSQL lets the administrator specify a shell command or an archive library to be executed to copy a completed segment file to its destination. This could be a shell command using cp, or it could invoke a complex C function. To enable WAL archiving, set the wal_level configuration parameter to replica or higher, archive_mode to on, and specify the shell command to use in the archive_command configuration parameter. In archive_command, %p is replaced by the path name of the file to archive, while %f is replaced by only the file name. The simplest useful command is something like: Plaintext archive_command = 'test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f' # Unix archive_command = 'copy "%p" "C:\\server\\archivedir\\%f"' # Windows IMPORTANT: The simple cp or copy commands return success as soon as the data is written to the operating system cache. However, PostgreSQL does not automatically force these archived files to be flushed to physical storage. In the event of an unexpected system shutdown or power failure, files confirmed as archived might be lost from the OS cache, leading to data loss in Point-in-Time Recovery (PITR) or issues with pg_rewind. To ensure data persistence, the command should include a step to synchronize the file to disk. On Unix/Linux systems, a more robust command is: Plaintext archive_command = 'test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f && sync /mnt/server/archivedir/%f' The sync command must operate at the file level, and sync version should be validated to not write the entire cache. Alternatively, the destination file system can be mounted with the sync option in the system configuration (e.g., /etc/fstab), though this may impact overall performance. Administrators must validate the durability and behavior of the chosen archival method on their specific operating system and storage architecture, as file system synchronization guarantees vary significantly across platforms. After the %p and %f parameters have been replaced, the actual command executed might look like this: test ! -f /mnt/server/archivedir/00000001000000A900000065 && cp pg_wal/00000001000000A900000065 /mnt/server/archivedir/00000001000000A900000065 && sync /mnt/server/archivedir/00000001000000A900000065 A similar command will be generated for each new file to be archived.
pgsql-docs by date: