Bonjour Michaël,
> Please find attached an updated patch set, I have rebased that stuff
> on top of my recent commits to refactor the control file updates.
Patch applies cleanly, compiles, make check-world seems ok, doc build ok.
It would help if the patch includes a version number. I assume that this
is v7.
Doc looks ok.
Moving the controlfile looks like an effective way to prevent any
concurrent start, as the fs operation is probably atomic and especially if
external tools uses the same trick. However this is not the case yet, eg
"pg_resetwal" uses a "postmaster.pid" hack instead. Probably the method
could be unified, possibly with some functions in "controlfile_utils.c".
However, I think that there still is a race condition because of the order
in which it is implemented:
pg_checksums reads control file
pg_checksums checks control file contents...
** cluster may be started and the control file updated
pg_checksums moves the (updated) control file
pg_checksums proceeds on a running cluster
pg_checksums moves back the control file
pg_checksums updates the control file contents, overriding updates
I think that the correct way to handle this for enable/disable is:
pg_checksums moves the control file
pg_checksums reads, checks, proceeds, updates
pg_checksums moves back the control file
This probably means extending a little bit the update_controlfile function
to allow a suffix. No big deal.
Ok, this might not work, because of the following, less likely, race
condition:
postmaster opens control file RW
pg_checksums moves control file, posmater open file handle follows
...
So ISTM that we really need some locking to have something clean.
Why not always use "pgrename" instead of the strange pg_mv_file macro?
Help line about --check not simplified as suggested in a prior review,
although you said you would take it into account.
Tests look ok.
--
Fabien.