Re: buildfarm instance bichir stuck - Mailing list pgsql-hackers

From Andrew Dunstan
Subject Re: buildfarm instance bichir stuck
Date
Msg-id 67001287-d881-5803-1c45-a959cb2a101b@dunslane.net
In response to Re: buildfarm instance bichir stuck  (Thomas Munro <thomas.munro@gmail.com>)
Responses Re: buildfarm instance bichir stuck  (Robins Tharakan <tharakan@gmail.com>)
List pgsql-hackers
On 4/7/21 2:16 AM, Thomas Munro wrote:
> On Wed, Apr 7, 2021 at 5:44 PM Robins Tharakan <tharakan@gmail.com> wrote:
>> Bichir's been stuck for the past month and is unable to run regression tests
>> since 6a2a70a02018d6362f9841cc2f499cc45405e86b.
> Hrmph.  That's "Use signalfd(2) for epoll latches."  I had a similar
> report from an illumos user (but it was intermittent).  I have never
> seen such a failure on Linux.  My first guess is that these two
> systems that are doing Linux system call emulation have implemented
> subtly different semantics, and something is going wrong like this: a
> SIGUSR1 arrives to tell you some important news about a procsignal and
> the signal handler calls SetLatch(MyLatch) which does kill(MyProcPid,
> SIGURG), but somehow that fails to wake up the epoll() you are
> sleeping in which contains the signalfd that should receive the signal
> and report it by being readable, due to some internal race.  Or
> something like that.  But I haven't been able to verify that theory
> because I don't have any of those computers.  If it is indeed
> something like that and not a bug in my code, then I was thinking that
> the main tool available to deal with it would be to set WAIT_USE_POLL
> in the relevant template file, so that we don't use the combination of
> epoll + signalfd on illumos, but then WSL1 throws a spanner in the
> works because AFAIK it's masquerading as Ubuntu, running PostgreSQL
> from an Ubuntu package with a freaky kernel.  Hmm.
>

To test this, the OP could just add

    CPPFLAGS => '-DWAIT_USE_POLL',

to the config_env stanza of his animal's config.
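
Independently of the buildfarm setting, the basic mechanism described in the quoted theory (a self-sent SIGURG becoming readable on a signalfd that is registered with epoll) could be checked in isolation on the affected machine. What follows is only a rough standalone sketch, not PostgreSQL code: the function name self_signal_wakes_epoll and its timeout parameter are made up for illustration, and it sends the signal before waiting rather than from a signal handler during the wait, so it exercises the plumbing but would not necessarily reproduce an intermittent race.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/signalfd.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of PostgreSQL).
 *
 * Returns 1 if a self-sent SIGURG made the signalfd readable through
 * epoll within timeout_ms, 0 if epoll_wait() timed out, and -1 on any
 * setup or read error.
 */
static int
self_signal_wakes_epoll(int timeout_ms)
{
    sigset_t    mask;
    struct epoll_event ev;
    struct epoll_event out;
    struct signalfd_siginfo info;
    int         sfd;
    int         epfd;
    int         n;
    int         result = -1;

    /* Block SIGURG so that it is reported only through the signalfd. */
    sigemptyset(&mask);
    sigaddset(&mask, SIGURG);
    if (sigprocmask(SIG_BLOCK, &mask, NULL) < 0)
        return -1;

    sfd = signalfd(-1, &mask, SFD_NONBLOCK | SFD_CLOEXEC);
    if (sfd < 0)
        return -1;

    epfd = epoll_create1(EPOLL_CLOEXEC);
    if (epfd < 0)
    {
        close(sfd);
        return -1;
    }

    /* Register the signalfd for readability. */
    memset(&ev, 0, sizeof(ev));
    ev.events = EPOLLIN;
    ev.data.fd = sfd;
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, sfd, &ev) < 0)
        goto done;

    /* Same wakeup call the quoted text attributes to SetLatch(MyLatch). */
    if (kill(getpid(), SIGURG) < 0)
        goto done;

    n = epoll_wait(epfd, &out, 1, timeout_ms);
    if (n < 0)
        goto done;
    if (n == 0)
    {
        result = 0;             /* timed out: the wakeup was lost */
        goto done;
    }

    /* Drain the signalfd and confirm that it really reported SIGURG. */
    if (read(sfd, &info, sizeof(info)) == (ssize_t) sizeof(info) &&
        info.ssi_signo == (unsigned int) SIGURG)
        result = 1;

done:
    close(epfd);
    close(sfd);
    return result;
}
```

Calling this repeatedly from a small main() and counting how often it returns 0 would give a first hint; a result of 1 every time would not rule out a race that only shows up when the signal arrives while the process is already sleeping in epoll_wait().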


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com



