Hi,
On 2017-09-19 12:13:54 -0400, Tom Lane wrote:
> IOW, the "$monitor" instance of psql did not complete making its
> connection until after the crash/restart cycle had occurred.
That'd be easy enough to fix...
Just something like
    $monitor_stdin .= q[
    SELECT $$am-i-up$$;
    ];
    $monitor->pump until $monitor_stdout =~ /am-i-up/;
    $monitor_stdout = '';
> So we're just sitting there waiting for a crash report that won't
> come. Which is another very serious deficiency in this test:
> lacking any sort of timeout, it will just freeze indefinitely
> if anything doesn't happen exactly the way it expects. From a
> buildfarm owner's standpoint, that's pretty damn unfriendly.
> It means having to manually unwedge your animals from time to time.
Note that I just copied the code for that from another test - this
isn't unique to this test. I agree that it'd be good to add a timeout to
those pump calls.
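A rough sketch of what a bounded wait could look like, assuming an
IPC::Run timer is passed to the harness so that pump returns once the
timer expires. Here "cat" stands in for the psql session, and the
pump_until helper, the 5 and 1 second intervals, and the "crash-report"
pattern are all made up for illustration, not taken from the test:

```perl
use strict;
use warnings;
use IPC::Run ();

# Hypothetical helper: pump $proc until $$stream matches $pattern.
# Returns 1 on a match, 0 if the timer expired or the child went away.
sub pump_until
{
    my ($proc, $timer, $stream, $pattern) = @_;
    while (1)
    {
        return 1 if $$stream =~ $pattern;
        return 0 if $timer->is_expired;
        return 0 if !$proc->pumpable;
        $proc->pump;
    }
}

# Stand-in for the psql monitor session: "cat" just echoes its stdin.
my ($in, $out) = ('', '');
my $timer = IPC::Run::timer(5);    # interval in seconds, chosen arbitrarily
my $proc = IPC::Run::start(['cat'], '<', \$in, '>', \$out, $timer);

# Handshake: confirm the child is up and answering before going on.
$in .= "am-i-up\n";
$timer->start;
my $ready = pump_until($proc, $timer, \$out, qr/am-i-up/);
$out = '';

# Wait for output that never arrives; the timer bounds the wait.
$timer->start(1);
my $got_report = pump_until($proc, $timer, \$out, qr/crash-report/);

$proc->kill_kill;
print "ready=$ready got_report=$got_report\n";
```

The point is that the second wait gives up after roughly a second
rather than blocking forever, so a test could report a failure instead
of wedging the animal.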
> I'd like to ask you to revert this test, at least pending making
> it a whole lot more bulletproof.
Hm. Ok. That seems like an overreaction to me - the failure rate isn't
actually that high so far. I'm happy to add both timeouts and "earlier
startup" of the $monitor, but I'd prefer to do so in-tree - I'd run the
test through 100+ iterations locally, without any of this showing up.
> We don't really need crash recovery testing in the buildfarm IMO ---
> we hackers crash the system plenty often enough to notice problems
> there.
I for one don't exercise that kind of crash restart; my development
scripts all run with restart_after_crash = false. What I find more
concerning, however, is coverage of EXEC_BACKEND, which has far fewer
developers actively running it constantly.
Greetings,
Andres Freund
--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers