Thread: pg_resetwal tests, logging, and docs update
I noticed that pg_resetwal has poor test coverage. There are some TAP tests, but they all run with -n, so they don't actually test the full functionality. (There is a non-dry-run call of pg_resetwal in the recovery test suite, but that is incidental.) So I added a bunch of more tests to test all the different options and scenarios. That also led to adding more documentation about more details how some of the options behave, and some tweaks in the program output. It's attached as one big patch, but it could be split into smaller pieces, depending what gets agreement.
Attachment
Hi, > I noticed that pg_resetwal has poor test coverage. There are some TAP > tests, but they all run with -n, so they don't actually test the full > functionality. (There is a non-dry-run call of pg_resetwal in the > recovery test suite, but that is incidental.) > > So I added a bunch of more tests to test all the different options and > scenarios. That also led to adding more documentation about more > details how some of the options behave, and some tweaks in the program > output. > > It's attached as one big patch, but it could be split into smaller > pieces, depending what gets agreement. All in all the patch looks OK but I have a couple of nitpicks. ``` + working on a data directory in an unclean shutdown state or with a corrupt + control file. ``` ``` + After running this command on a data directory with corrupted WAL or a + corrupt control file, ``` I'm not a native English speaker but shouldn't it be "corruptED control file"? ``` + Force <command>pg_resetwal</command> to proceed even in situations where + it could be dangerous, ``` "where" is probably fine but wouldn't "WHEN it could be dangerous" be better? ``` + // FIXME: why 2? if (set_oldest_commit_ts_xid < 2 && ``` Should we rewrite this to < FrozenTransactionId ? ``` +$mult = 32 * $blcksz * 4; # FIXME ``` Unless I'm missing something this $mult value is not used. Is it really needed here? -- Best regards, Aleksander Alekseev
On 13.09.23 16:36, Aleksander Alekseev wrote: > ``` > + // FIXME: why 2? > if (set_oldest_commit_ts_xid < 2 && > ``` > > Should we rewrite this to < FrozenTransactionId ? That's what I suspect, but we should confirm that. > > ``` > +$mult = 32 * $blcksz * 4; # FIXME > ``` > > Unless I'm missing something this $mult value is not used. Is it > really needed here? The FIXME is that I think a multiplier *should* be applied somehow. See also the FIXME in the documentation for the -c option.
On 13.09.23 16:36, Aleksander Alekseev wrote: > All in all the patch looks OK but I have a couple of nitpicks. > > ``` > + working on a data directory in an unclean shutdown state or with a corrupt > + control file. > ``` > > ``` > + After running this command on a data directory with corrupted WAL or a > + corrupt control file, > ``` > > I'm not a native English speaker but shouldn't it be "corruptED control file"? fixed > > ``` > + Force <command>pg_resetwal</command> to proceed even in situations where > + it could be dangerous, > ``` > > "where" is probably fine but wouldn't "WHEN it could be dangerous" be better? Hmm, I think I like "where" better. Attached is an updated patch set where I have split the changes into smaller pieces. The last two patches still have some open questions about what certain constants mean etc. The other patches should be settled.
Attachment
- v2-0001-pg_resetwal-Update-an-obsolete-comment.patch
- v2-0002-pg_resetwal-Improve-error-with-wrong-missing-data.patch
- v2-0003-pg_resetwal-Regroup-help-output.patch
- v2-0004-pg_resetwal-Use-frontend-logging-API.patch
- v2-0005-doc-Improve-documentation-about-pg_resetwal-f-opt.patch
- v2-0006-doc-pg_resetwal-Add-comments-how-the-multipliers-.patch
- v2-0007-pg_resetwal-Add-more-tests-and-test-coverage.patch
Hi, > Hmm, I think I like "where" better. OK. > Attached is an updated patch set where I have split the changes into > smaller pieces. The last two patches still have some open questions > about what certain constants mean etc. The other patches should be settled. The patches 0001..0005 seem to be ready and rather independent. I suggest merging them and continue discussing the rest of the patches. -- Best regards, Aleksander Alekseev
On 26.09.23 17:19, Aleksander Alekseev wrote: >> Attached is an updated patch set where I have split the changes into >> smaller pieces. The last two patches still have some open questions >> about what certain constants mean etc. The other patches should be settled. > > The patches 0001..0005 seem to be ready and rather independent. I > suggest merging them and continue discussing the rest of the patches. I have committed 0001..0005, and also posted a separate patch to discuss and correct the behavior of the -c option. I expect that we will carry over this patch set to the next commit fest.
On 29.09.23 10:02, Peter Eisentraut wrote: > On 26.09.23 17:19, Aleksander Alekseev wrote: >>> Attached is an updated patch set where I have split the changes into >>> smaller pieces. The last two patches still have some open questions >>> about what certain constants mean etc. The other patches should be >>> settled. >> >> The patches 0001..0005 seem to be ready and rather independent. I >> suggest merging them and continue discussing the rest of the patches. > > I have committed 0001..0005, and also posted a separate patch to discuss > and correct the behavior of the -c option. I expect that we will carry > over this patch set to the next commit fest. Here are updated versions of the remaining patches. I took out the "FIXME" notes about the multipliers applying to the -c option and replaced them by gentler comments. I don't intend to resolve those issues here.
Attachment
Hi, > Here are updated versions of the remaining patches. I took out the > "FIXME" notes about the multipliers applying to the -c option and > replaced them by gentler comments. I don't intend to resolve those > issues here. The patch LGTM. However, postgresql:pg_resetwal test suite doesn't pass on Windows according to cfbot. Seems to be a matter of picking a more generic regular expression: ``` at C:/cirrus/src/bin/pg_resetwal/t/001_basic.pl line 54. 'pg_resetwal: error: could not change directory to "foo": No such file or directory doesn't match '(?^:error: could not read permissions of directory)' ``` Should we simply use something like: ``` qr/error: could not (read|change).* directory/ ``` ... instead? -- Best regards, Aleksander Alekseev
On 30.10.23 12:55, Aleksander Alekseev wrote: > The patch LGTM. However, postgresql:pg_resetwal test suite doesn't > pass on Windows according to cfbot. Seems to be a matter of picking a > more generic regular expression: > > ``` > at C:/cirrus/src/bin/pg_resetwal/t/001_basic.pl line 54. > 'pg_resetwal: error: could not change directory to > "foo": No such file or directory > doesn't match '(?^:error: could not read permissions of directory)' > ``` > > Should we simply use something like: > > ``` > qr/error: could not (read|change).* directory/ > ``` Hmm. I think maybe we should fix the behavior of GetDataDirectoryCreatePerm() to be more consistent between Windows and non-Windows. This is usually the first function a program uses on the proposed data directory, so it's also responsible for reporting if the data directory does not exist. But then on Windows, because the function does nothing, those error scenarios end up on quite different code paths, and I'm not sure if those are really checked that carefully. I think we can make this more robust if we have GetDataDirectoryCreatePerm() still run the stat() call on the proposed data directory and report the error. See attached patch.
Attachment
Hi, > Hmm. I think maybe we should fix the behavior of > GetDataDirectoryCreatePerm() to be more consistent between Windows and > non-Windows. This is usually the first function a program uses on the > proposed data directory, so it's also responsible for reporting if the > data directory does not exist. But then on Windows, because the > function does nothing, those error scenarios end up on quite different > code paths, and I'm not sure if those are really checked that carefully. > I think we can make this more robust if we have > GetDataDirectoryCreatePerm() still run the stat() call on the proposed > data directory and report the error. See attached patch. Yep, that would be much better. Attaching all three patches together in order to make sure cfbot is still happy with them while the `master` branch is evolving. Assuming cfbot will have no complaints I suggest merging them. -- Best regards, Aleksander Alekseev
Attachment
On 01.11.23 12:12, Aleksander Alekseev wrote: > Hi, > >> Hmm. I think maybe we should fix the behavior of >> GetDataDirectoryCreatePerm() to be more consistent between Windows and >> non-Windows. This is usually the first function a program uses on the >> proposed data directory, so it's also responsible for reporting if the >> data directory does not exist. But then on Windows, because the >> function does nothing, those error scenarios end up on quite different >> code paths, and I'm not sure if those are really checked that carefully. >> I think we can make this more robust if we have >> GetDataDirectoryCreatePerm() still run the stat() call on the proposed >> data directory and report the error. See attached patch. > > Yep, that would be much better. > > Attaching all three patches together in order to make sure cfbot is > still happy with them while the `master` branch is evolving. > > Assuming cfbot will have no complaints I suggest merging them. Done, thanks.