Re: Maybe BF "timedout" failures are the client script's fault? - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Maybe BF "timedout" failures are the client script's fault?
Date
Msg-id 2430115.1767994942@sss.pgh.pa.us
In response to Re: Maybe BF "timedout" failures are the client script's fault?  (Michael Banck <mbanck@gmx.net>)
Responses Re: Maybe BF "timedout" failures are the client script's fault?
List pgsql-hackers
Michael Banck <mbanck@gmx.net> writes:
> On Fri, Jan 09, 2026 at 03:41:03PM -0500, Tom Lane wrote:
>> Looking into the buildfarm client, I realized that it's assuming that
>> "sleep($wait_time)" is sufficient to wait for $wait_time seconds.
>> However, the Perl docs point out that sleep() can be interrupted by a
>> signal.  So now I'm suspicious that many of these failures are caused
>> by a stray signal waking up the wait_timeout thread prematurely.

> That might be the case for those other failures, but unfortunately, I
> think the fruitcrow failures are really because it gets stuck endlessly
> in the test_shm_mq test (it is always that one) and only the test
> timeout kicks it out.

If it's always the same test, then yeah that's evidence against
my theory (at least for fruitcrow's failures).
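
For the failures where the theory might still apply, here is a minimal
sketch of a wait that cannot be cut short by a stray signal.  This is not
the buildfarm client's actual code: wait_full is a hypothetical name, and
using Time::HiRes is my assumption.  The idea is just to compute a
deadline once and keep sleeping until it has passed.

```perl
use strict;
use warnings;
use Time::HiRes qw(time sleep);

# Hypothetical helper, not taken from the buildfarm client.
# Sleeps for at least $wait_time seconds, even if sleep() returns early.
sub wait_full {
    my ($wait_time) = @_;

    # Fix the deadline up front.  This uses wall-clock time, so a clock
    # adjustment during the wait would still shift the deadline.
    my $deadline = time() + $wait_time;

    # sleep() may return early when a signal handler runs, so go back
    # to sleep for whatever part of the interval is left.
    while ((my $remaining = $deadline - time()) > 0) {
        sleep($remaining);
    }
    return;
}
```

A caller would write wait_full($wait_time) where it currently has
sleep($wait_time).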

> I've run that test manually quite a lot and either it finishes in 10-15
> seconds, or (presumably) never. This is not really easy to see in the
> public buildfarm logs (at least I can't find it at a quick glance), but
> I've routinely checked the log timestamps of the runs, and they really
> take one hour (wait_timeout) in the case of a hang.

Hmm.  Then why is the BF report showing that the total runtime is
nowhere near that?  I wonder how those times are gathered ...

            regards, tom lane


