I went back to take a closer look at why the existing regression tests
failed to detect this problem. As Michael already noted,
src/bin/pg_ctl/t/004_logrotate.pl does try to exercise the logging
collector, and we do have a couple of Windows machines that run the
TAP tests, so how'd it escape notice?
The answer indeed seems to be that it *was* noticed, but Andrew failed
to realize what he was seeing:
Author: Andrew Dunstan <andrew@dunslane.net>
Branch: master [d611175e5] 2019-03-03 18:19:44 -0500
Don't do pg_ctl logrotate test on Windows
The test crashes and burns quite badly, for some reason, but even if it
didn't it wouldn't work, since Windows doesn't let you rename a file
held by a running process.
But even when we do run it, it fails to detect any problem, because
it's only testing for the existence of a log file not whether anything
ever gets written into it. As it happens, SysLogger_Start() creates
the initial log file right in the postmaster (to validate the logging
parameters), so as 004_logrotate.pl is constituted it fails to prove
that the syslogger process got launched at all. That's why it
"passes" even when the syslogger is crashing instantly on launch.
Given the current unpleasantness, I think it's quite important that
we fix 004_logrotate.pl so that it (a) poses an end-to-end test
and (b) works on Windows. I think (a) could be managed by having
the test execute guaranteed-to-fail SQL commands ("select 1/0" or
the like) and making it look into the log file to verify that the
expected error messages appear there. (b) is a little harder in
view of Andrew's point that we can't just summarily rename or unlink
the log file. But I don't much like using a fixed log file name
in the test anyway. I propose that we let it use the default logfile
pattern (postgresql-%Y-%m-%d_%H%M%S.log), and that we just sleep for a
couple of seconds before requesting rotation, and that we verify that
the file name actually changed. We can avoid difficult issues of
guessing what file name got used by having the test script look into
"current_logfiles" to get the name --- which is another feature that
there's exactly zero test coverage of right now, so that seems like
a win all around.
Thoughts, better ideas?
regards, tom lane