Thread: Slightly improve initdb --sync-only option's help message
When reading the output of `initdb --help` I could not clearly understand what the purpose of the --sync-only option was, until I read the documentation of initdb. -S, --sync-only only sync data directory Perhaps the confusion was caused by the fact that sync(hronization) means different things in different contexts, and many of those contexts apply to databases, and to data directories; time sync, data sync, replica sync, etc. I think it would be helpful if the help message was slightly more descriptive. Some options: Used in patch: only sync data directory; does not modify any data To match the wording of --sync-only option: write contents of data directory to disk; helpful after --no-sync option Clearly specify the system operation used for the option perform fsync on data directory; helpful after --no-sync option Best regards, -- Gurjeet Singh http://gurjeet.singh.im/
Attachment
On 7/6/21, 7:02 PM, "Gurjeet Singh" <gurjeet@singh.im> wrote: > I think it would be helpful if the help message was slightly more > descriptive. Some options: > > Used in patch: > only sync data directory; does not modify any data > > To match the wording of --sync-only option: > write contents of data directory to disk; helpful after --no-sync option > > Clearly specify the system operation used for the option > perform fsync on data directory; helpful after --no-sync option I think the help message should say exactly what the option does and should avoid saying what it does not do or how it may be useful. I would suggest the following to match the initdb docs [0]: -S, --sync-only safely write all database files to disk and exit IMO the note about the option being helpful after using the --no-sync option would fit better in the docs, but I'm struggling to think of a use case for using --no-sync and then calling initdb again with --sync-only. Why wouldn't you just leave out --no-sync the first time? Nathan [0] https://www.postgresql.org/docs/devel/app-initdb.html
On Thu, Jul 22, 2021 at 10:32:18PM +0000, Bossart, Nathan wrote: > On 7/6/21, 7:02 PM, "Gurjeet Singh" <gurjeet@singh.im> wrote: > > I think it would be helpful if the help message was slightly more > > descriptive. Some options: > > > > Used in patch: > > only sync data directory; does not modify any data > > > > To match the wording of --sync-only option: > > write contents of data directory to disk; helpful after --no-sync option > > > > Clearly specify the system operation used for the option > > perform fsync on data directory; helpful after --no-sync option > > I think the help message should say exactly what the option does and > should avoid saying what it does not do or how it may be useful. I > would suggest the following to match the initdb docs [0]: > > -S, --sync-only safely write all database files to disk and exit > > IMO the note about the option being helpful after using the --no-sync > option would fit better in the docs, but I'm struggling to think of a > use case for using --no-sync and then calling initdb again with > --sync-only. Why wouldn't you just leave out --no-sync the first > time? It's to allow safely running bulk loading with fsync=off - if the bulk load fails, you can wipe out the partially-loaded cluster and start over. But then transitioning to a durable state requires not just setting fsync=on, which enables future fsync calls. It also requires syncing all dirty buffers. doc/src/sgml/config.sgml- <para> doc/src/sgml/config.sgml- For reliable recovery when changing <varname>fsync</varname> doc/src/sgml/config.sgml- off to on, it is necessary to force all modified buffers in the doc/src/sgml/config.sgml- kernel to durable storage. This can be done while the cluster doc/src/sgml/config.sgml- is shutdown or while <varname>fsync</varname> is on by running <command>initdb doc/src/sgml/config.sgml: --sync-only</command>, running <command>sync</command>, unmounting the doc/src/sgml/config.sgml- file system, or rebooting the server. doc/src/sgml/config.sgml- </para> -- Justin
On 7/22/21, 6:31 PM, "Justin Pryzby" <pryzby@telsasoft.com> wrote: > On Thu, Jul 22, 2021 at 10:32:18PM +0000, Bossart, Nathan wrote: >> IMO the note about the option being helpful after using the --no-sync >> option would fit better in the docs, but I'm struggling to think of a >> use case for using --no-sync and then calling initdb again with >> --sync-only. Why wouldn't you just leave out --no-sync the first >> time? > > It's to allow safely running bulk loading with fsync=off - if the bulk load > fails, you can wipe out the partially-loaded cluster and start over. > But then transitioning to a durable state requires not just setting fsync=on, > which enables future fsync calls. It also requires syncing all dirty buffers. Right. Perhaps the documentation for --sync-only could mention this use-case instead. Safely write all database files to disk and exit. This does not perform any of the normal initdb operations. Generally, this option is useful for ensuring reliable recovery after changing fsync from off to on. Nathan
On 7/23/21, 9:09 AM, "Bossart, Nathan" <bossartn@amazon.com> wrote: > On 7/22/21, 6:31 PM, "Justin Pryzby" <pryzby@telsasoft.com> wrote: >> On Thu, Jul 22, 2021 at 10:32:18PM +0000, Bossart, Nathan wrote: >>> IMO the note about the option being helpful after using the --no-sync >>> option would fit better in the docs, but I'm struggling to think of a >>> use case for using --no-sync and then calling initdb again with >>> --sync-only. Why wouldn't you just leave out --no-sync the first >>> time? >> >> It's to allow safely running bulk loading with fsync=off - if the bulk load >> fails, you can wipe out the partially-loaded cluster and start over. >> But then transitioning to a durable state requires not just setting fsync=on, >> which enables future fsync calls. It also requires syncing all dirty buffers. > > Right. Perhaps the documentation for --sync-only could mention this > use-case instead. > > Safely write all database files to disk and exit. This does > not perform any of the normal initdb operations. Generally, > this option is useful for ensuring reliable recovery after > changing fsync from off to on. Here are my suggestions in patch form. Nathan
Attachment
On Mon, Jul 26, 2021 at 11:05 AM Bossart, Nathan <bossartn@amazon.com> wrote: > Here are my suggestions in patch form. + printf(_(" -S, --sync-only safely write all database files to disk and exit\n")); Not your patch's fault, but the word "write" does not seem to convey the true intent of the option, because generally a "write" operation is still limited to dirtying the OS buffers, and does not guarantee sync-to-disk. It'd be better if the help message said, either "flush all database files to disk and exit",or "sync all database files to disk and exit". Best regards, -- Gurjeet Singh http://gurjeet.singh.im/
> On 26 Jul 2021, at 20:05, Bossart, Nathan <bossartn@amazon.com> wrote: > Here are my suggestions in patch form. - printf(_(" -S, --sync-only only sync data directory\n")); + printf(_(" -S, --sync-only safely write all database files to disk and exit\n")); I think removing the word "only" here is a net loss, since it IMO clearly conveyed that this option isn't doing any of the other initdb duties. The documentation states it in more words, but the help output must be brief so I think "only" is a good keyword. -- Daniel Gustafsson https://vmware.com/
On 7/29/21, 2:11 AM, "Daniel Gustafsson" <daniel@yesql.se> wrote: > I think removing the word "only" here is a net loss, since it IMO clearly > conveyed that this option isn't doing any of the other initdb duties. The > documentation states it in more words, but the help output must be brief so I > think "only" is a good keyword. I've attached a new version of the patch in which I've attempted to address both sets of feedback. Nathan
Attachment
> On 30 Jul 2021, at 01:37, Bossart, Nathan <bossartn@amazon.com> wrote: > > On 7/29/21, 2:11 AM, "Daniel Gustafsson" <daniel@yesql.se> wrote: >> I think removing the word "only" here is a net loss, since it IMO clearly >> conveyed that this option isn't doing any of the other initdb duties. The >> documentation states it in more words, but the help output must be brief so I >> think "only" is a good keyword. > > I've attached a new version of the patch in which I've attempted to > address both sets of feedback. LGTM. I took the liberty to rephrase the "and exit" part of the initdb help output match the other exiting options in there. Barring objections, I think this is ready. -- Daniel Gustafsson https://vmware.com/
Attachment
On 7/30/21, 2:22 AM, "Daniel Gustafsson" <daniel@yesql.se> wrote: > LGTM. I took the liberty to rephrase the "and exit" part of the initdb help > output match the other exiting options in there. Barring objections, I think > this is ready. LGTM. Thanks! Nathan
> On 30 Jul 2021, at 18:27, Bossart, Nathan <bossartn@amazon.com> wrote: > > On 7/30/21, 2:22 AM, "Daniel Gustafsson" <daniel@yesql.se> wrote: >> LGTM. I took the liberty to rephrase the "and exit" part of the initdb help >> output match the other exiting options in there. Barring objections, I think >> this is ready. > > LGTM. Thanks! Pushed to master, thanks! -- Daniel Gustafsson https://vmware.com/
On Mon, Aug 16, 2021 at 4:42 AM Daniel Gustafsson <daniel@yesql.se> wrote: > > > On 30 Jul 2021, at 18:27, Bossart, Nathan <bossartn@amazon.com> wrote: > > > > On 7/30/21, 2:22 AM, "Daniel Gustafsson" <daniel@yesql.se> wrote: > >> LGTM. I took the liberty to rephrase the "and exit" part of the initdb help > >> output match the other exiting options in there. Barring objections, I think > >> this is ready. > > > > LGTM. Thanks! > > Pushed to master, thanks! Thank you Daniel and Nathan! Much appreciated. Best regards, -- Gurjeet Singh http://gurjeet.singh.im/