On Fri, Apr 09, 2021 at 04:53:01PM +0530, Bharath Rupireddy wrote:
> I feel that we can provide a high timeout value (It can be 1hr on the
> similar lines of using pg_sleep(3600) for crash tests in
> 013_crash_restart.pl with the assumption that the backend running that
> command will get killed with SIGQUIT) and make pg_terminate_backend
> wait. This unreasonably high timeout looks okay because of the
> assumption that the servers in the build farm will not take that much
> time to remove the backend from the system processes, so the function
> will return much earlier than that. If at all there's a server (which
> is impractical IMO) that doesn't remove the backend process even
> within 1hr, then that will fail with the warning anyway.
You may not need a value as large as 1h for that :)
Looking at the TAP tests, some areas have been living with timeouts of
up to 180s. It is a matter of balance here: a timeout that is too long
would be annoying, as machines that are stuck would take longer to
report a problem, while a value that is too short generates false
positives. 5 minutes gives some balance, but there is really no
perfect value.
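For illustration, here is a minimal sketch of what such a call could
look like in the test, assuming the two-argument form
pg_terminate_backend(pid, timeout) with the timeout given in
milliseconds. The application_name value is only a placeholder and is
not taken from the patch:

```sql
-- Sketch only: 'some_test_backend' is a hypothetical application_name.
-- The second argument is the wait timeout in milliseconds.
SELECT pg_terminate_backend(pid, 300000)  -- wait up to 5 minutes
  FROM pg_stat_activity
 WHERE application_name = 'some_test_backend';
```

With a non-zero timeout the function should return true as soon as the
backend is gone, so the full 5 minutes would be spent only on a machine
that is stuck, in which case a warning is emitted and false is returned.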
> With the attached patch, we could remove the procedure
> terminate_backend_and_wait altogether. Thoughts?
That's clearly better, and logically it would work. As those tests
are new in 14, it may be a good idea to clean all that up so that all
the branches have the same set of tests. Would people object to that?
--
Michael