Thread: Too-short timeouts in test code
We've had more buildfarm failures due to hard-coded, short timeouts: http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=chipmunk&dt=2018-10-13%2021%3A06%3A58 10s timeout while running pg_recvlogical http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hornet&dt=2018-12-03%2005%3A52%3A12 30s timeout while running pg_recvlogical https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mandrill&dt=2018-11-30%2014%3A31%3A18 60s timeout in isolationtester try_complete_step() The 180s timeout in poll_query_until has been trouble-free since 2a0f89c introduced it two years ago. I plan to raise the timeouts in question to 180s, as attached.
Attachment
On Sat, Dec 08, 2018 at 04:16:01PM -0800, Noah Misch wrote: > The 180s timeout in poll_query_until has been trouble-free since 2a0f89c > introduced it two years ago. I plan to raise the timeouts in question to > 180s, as attached. That's annoying, thanks for tracking those down. > @@ -72,7 +72,7 @@ my $endpos = $node_master->safe_psql('postgres', > print "waiting to replay $endpos\n"; > > my $stdout_recv = $node_master->pg_recvlogical_upto( > - 'postgres', 'test_slot', $endpos, 10, > + 'postgres', 'test_slot', $endpos, 180, > 'include-xids' => '0', > 'skip-empty-xacts' => '1'); Instead of allowing callers of pg_recvlogical_upto() to define the timeout they want, perhaps it would be better to hardcode the timeout limit within the routine? -- Michael
Attachment
On Sun, Dec 09, 2018 at 11:14:12AM +0900, Michael Paquier wrote: > On Sat, Dec 08, 2018 at 04:16:01PM -0800, Noah Misch wrote: > > @@ -72,7 +72,7 @@ my $endpos = $node_master->safe_psql('postgres', > > print "waiting to replay $endpos\n"; > > > > my $stdout_recv = $node_master->pg_recvlogical_upto( > > - 'postgres', 'test_slot', $endpos, 10, > > + 'postgres', 'test_slot', $endpos, 180, > > 'include-xids' => '0', > > 'skip-empty-xacts' => '1'); > > Instead of allowing callers of pg_recvlogical_upto() to define the > timeout they want, perhaps it would be better to hardcode the timeout > limit within the routine? pg_recvlogical_upto() has the ability to make timeout non-fatal, by calling it in an array context. No in-tree test uses that. Out-of-tree code using that feature would likely benefit from the ability to set timeout duration. Hence, I'm inclined not remove the timeout duration parameter.
On Sat, Dec 08, 2018 at 06:24:35PM -0800, Noah Misch wrote: > pg_recvlogical_upto() has the ability to make timeout non-fatal, by calling it > in an array context. No in-tree test uses that. Out-of-tree code using that > feature would likely benefit from the ability to set timeout duration. Hence, > I'm inclined not remove the timeout duration parameter. No objections with this line of thoughts. The rest of the patch looks fine to me. -- Michael