Thread: Simplify final sync in pg_rewind's target folder and add --no-sync
Hi all, While looking at pg_rewind code, I have been surprised to find that the final fsync done on the target's data folder is done using initdb -S via a system() call. This is in my opinion overcomplicated because we have a dedicated API in fe_utils able to do a fsync on a data folder, called fsync_pgdata() that I have implemented when working on data durability for the other backup tools. So I would like to simplify the code as attached. One difference that this patch introduces is that a failed sync is not considered as a failure, still failures are reported to stderr. This new behavior is actually more consistent with what happens in pg_dump and pg_basebackup. And we have decided previously to do so, see here for more details on the discussion: https://www.postgresql.org/message-id/CAB7nPqQ_B0j3n1t%3D8c1ZLHXF1b8Tf4XsXoUC9bP9t5Hab--SMg%40mail.gmail.com An extra thing I have noticed, which is I think an oversight, is that there is no --no-sync option in pg_rewind. Like the other binaries, this is useful to reduce the I/O effort when running tests. Both things are implemented as attached. I am of course not pushing for integrating that patch in v11 even if it is straight-forward, so I'll park it in the next future commit fest. Thanks, -- Michael
Attachment
On Sun, Mar 25, 2018 at 09:26:07PM +0900, Michael Paquier wrote: > Both things are implemented as attached. I am of course not pushing for > integrating that patch in v11 even if it is straight-forward, so I'll > park it in the next future commit fest. Patch has roted per the feedback of Mr Robot here: http://cfbot.cputube.org/michael-paquier.html So attached is a refreshed version. -- Michael
Attachment
On 25/03/18 15:26, Michael Paquier wrote: > Hi all, > > While looking at pg_rewind code, I have been surprised to find that the > final fsync done on the target's data folder is done using initdb -S via > a system() call. This is in my opinion overcomplicated because we have > a dedicated API in fe_utils able to do a fsync on a data folder, called > fsync_pgdata() that I have implemented when working on data durability > for the other backup tools. So I would like to simplify the code as > attached. > > One difference that this patch introduces is that a failed sync is not > considered as a failure, still failures are reported to stderr. This > new behavior is actually more consistent with what happens in pg_dump > and pg_basebackup. And we have decided previously to do so, see here > for more details on the discussion: > https://www.postgresql.org/message-id/CAB7nPqQ_B0j3n1t%3D8c1ZLHXF1b8Tf4XsXoUC9bP9t5Hab--SMg%40mail.gmail.com > > An extra thing I have noticed, which is I think an oversight, is that > there is no --no-sync option in pg_rewind. Like the other binaries, > this is useful to reduce the I/O effort when running tests. Yeah, let's be consistent with the other utilities, on both of those things. > Both things are implemented as attached. I am of course not pushing for > integrating that patch in v11 even if it is straight-forward, so I'll > park it in the next future commit fest. Looks good to me. I'll mark this as "ready for committer". - Heikki
On Mon, Jul 09, 2018 at 04:38:11PM +0300, Heikki Linnakangas wrote: > Looks good to me. I'll mark this as "ready for committer". Thanks Heikki for the review, I have committed both things after splitting it into two commits for clarity, and fixing a comment as fsync_pgdata also initiates a write-back on its first pass when the option is available. -- Michael