On Mon, Apr 12, 2021 at 11:18 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Fri, Apr 09, 2021 at 04:53:01PM +0530, Bharath Rupireddy wrote:
> > I feel that we can provide a high timeout value (It can be 1hr on the
> > similar lines of using pg_sleep(3600) for crash tests in
> > 013_crash_restart.pl with the assumption that the backend running that
> > command will get killed with SIGQUIT) and make pg_terminate_backend
> > wait. This unreasonably high timeout looks okay because of the
> > assumption that the servers in the build farm will not take that much
> > time to remove the backend from the system processes, so the function
> > will return much earlier than that. If there is a server (which
> > is impractical IMO) that doesn't remove the backend process even
> > within 1hr, then it will fail with the warning anyway.
>
> You may not need a value as large as 1h for that :)
>
> Looking at the TAP tests, some areas have been living with timeouts of
> up to 180s. It is a matter of balance here, a timeout too long would
> be annoying as it would make the detection of a problem longer for
> machines that are stuck, and a too short value generates false
> positives. 5 minutes gives some balance, but there is really no
> perfect value.

I changed the timeout to 5 minutes. If any server takes more than
5 minutes to remove a process from the system process list, it
will see a warning on timeout.

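As a rough illustration of the wait-with-timeout behaviour described
above (return as soon as the process is gone, warn only if the timeout
expires), here is a minimal Python sketch. It is not the
pg_terminate_backend implementation; wait_until_gone, is_alive and the
poll interval are hypothetical names and values chosen only for this
example.

```python
import time
import warnings


def wait_until_gone(is_alive, timeout_s, poll_interval_s=0.01):
    """Poll is_alive() until it returns False or timeout_s elapses.

    Returns True if the process disappeared in time; otherwise emits a
    warning and returns False. (Hypothetical helper, for illustration.)
    """
    deadline = time.monotonic() + timeout_s
    while is_alive():
        if time.monotonic() >= deadline:
            warnings.warn(f"process did not exit within {timeout_s} s")
            return False
        time.sleep(poll_interval_s)
    return True


# Simulated backend that disappears after three polls: the call returns
# almost immediately even though the timeout is 5 minutes.
polls = {"n": 0}


def simulated_backend_alive():
    polls["n"] += 1
    return polls["n"] <= 3


exited = wait_until_gone(simulated_backend_alive, timeout_s=300)
```

The point of the sketch is that a generous timeout costs nothing on a
healthy machine; only a stuck machine pays the full 5 minutes.
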
> > With the attached patch, we could remove the procedure
> > terminate_backend_and_wait altogether. Thoughts?
>
> That's clearly better, and logically it would work. As those tests
> are new in 14, it may be a good idea to cleanup all that so as all the
> branches have the same set of tests. Would people object to that?

Yes, these tests were introduced in v14. +1 to cleaning them up with
this patch on v14 as well as on master.

Attaching v4; please review it further.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com