Re: intermittent failures in Cygwin from select_parallel tests - Mailing list pgsql-hackers

From Noah Misch
Subject Re: intermittent failures in Cygwin from select_parallel tests
Msg-id 20210622064212.GA1367859@rfd.leadboat.com
In response to Re: intermittent failures in Cygwin from select_parallel tests  (Thomas Munro <thomas.munro@gmail.com>)
List pgsql-hackers
On Tue, Jun 22, 2021 at 05:52:03PM +1200, Thomas Munro wrote:
> On Tue, Jun 22, 2021 at 5:21 PM Noah Misch <noah@leadboat.com> wrote:
> > On Thu, Aug 03, 2017 at 10:45:50AM -0400, Robert Haas wrote:
> > > On Wed, Aug 2, 2017 at 11:47 PM, Noah Misch <noah@leadboat.com> wrote:
> > > > postmaster algorithms rely on the PG_SETMASK() calls preventing that.  Without
> > > > such protection, duplicate bgworkers are an understandable result.  I caught
> > > > several other assertions; the PMChildFlags failure is another case of
> > > > duplicate postmaster children:
> > > >
> > > >       6 TRAP: FailedAssertion("!(entry->trans == ((void *)0))", File: "pgstat.c", Line: 871)
> > > >       3 TRAP: FailedAssertion("!(PMSignalState->PMChildFlags[slot] == 1)", File: "pmsignal.c", Line: 229)
> > > >      20 TRAP: FailedAssertion("!(RefCountErrors == 0)", File: "bufmgr.c", Line: 2523)
> > > >      21 TRAP: FailedAssertion("!(vmq->mq_sender == ((void *)0))", File: "shm_mq.c", Line: 221)
> > > >      Also, got a few "select() failed in postmaster: Bad address"
> > > >
> > > > I suspect a Cygwin signals bug.  I'll try to distill a self-contained test
> > > > case for the Cygwin hackers.  The lack of failures on buildfarm member brolga
> > > > argues that older Cygwin is not affected.
> > >
> > > Nice detective work.
> >
> > Thanks.  http://marc.info/?t=150183296400001 has my upstream report.  The
> > Cygwin project lead reproduced this, but a fix remained elusive.
> >
> > I guess we'll ignore weird postmaster-associated lorikeet failures for the
> > foreseeable future.
> 
> While reading a list of recent build farm assertion failures I learned that
> this is still broken in Cygwin 3.2, and eventually found my way back
> to this thread.

Interesting.  Which branch(es) showed you failures?  I had wondered if the
move to sa_mask (commit 9abb2bfc) would effectively end the problem in v13+.
Perhaps the Cygwin bug pokes through even that.  Perhaps the sa_mask
conditionals need to be "#if defined(WIN32) && !defined(__CYGWIN__)" to help
current buildfarm members.
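
For readers unfamiliar with sa_mask, here is a minimal sketch of the POSIX behavior that approach relies on. It is not PostgreSQL code: the names handle_usr1, handle_usr2, install_handler and handlers_nested are hypothetical, and sigfillset() merely stands in for whatever mask the postmaster actually uses. It shows that a signal raised inside a handler is deferred until the handler returns when sa_mask blocks it, and nests inside the handler otherwise.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>

static volatile sig_atomic_t usr1_active = 0;	/* inside handle_usr1? */
static volatile sig_atomic_t usr2_ran = 0;		/* handle_usr2 was called */
static volatile sig_atomic_t nested = 0;		/* handle_usr2 ran inside handle_usr1 */

static void
handle_usr2(int signo)
{
	(void) signo;
	usr2_ran = 1;
	if (usr1_active)
		nested = 1;
}

static void
handle_usr1(int signo)
{
	(void) signo;
	usr1_active = 1;
	/* If SIGUSR2 is blocked by sa_mask, it stays pending until we return. */
	raise(SIGUSR2);
	usr1_active = 0;
}

/* Install func for signo; optionally block all signals while it runs. */
static void
install_handler(int signo, void (*func) (int), int block_during_handler)
{
	struct sigaction act;

	memset(&act, 0, sizeof(act));
	act.sa_handler = func;
	if (block_during_handler)
		sigfillset(&act.sa_mask);
	else
		sigemptyset(&act.sa_mask);
	act.sa_flags = 0;
	if (sigaction(signo, &act, NULL) != 0)
		abort();
}

/* Returns 1 if the SIGUSR2 handler interrupted the SIGUSR1 handler. */
int
handlers_nested(int block_during_handler)
{
	usr1_active = 0;
	usr2_ran = 0;
	nested = 0;
	install_handler(SIGUSR1, handle_usr1, block_during_handler);
	install_handler(SIGUSR2, handle_usr2, block_during_handler);
	raise(SIGUSR1);
	assert(usr2_ran);
	return nested;
}
```

On a platform that implements sa_mask as POSIX specifies, handlers_nested(1) should report no nesting and handlers_nested(0) should report nesting. Whether Cygwin honors this under load is exactly the open question above; this sketch does not settle it.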

> I was wondering about suggesting some kind of
> official warning, but I guess the manual already covers it with this
> 10 year old notice.  I don't know much about Windows or Cygwin so I'm
> not sure if it needs updating or not, but I would guess that there are
> no longer any such systems?
> 
>   <productname>Cygwin</productname> is not recommended for running a
>   production server, and it should only be used for running on
>   older versions of <productname>Windows</productname> where
>   the native build does not work.

I expect native builds work on all Microsoft-supported Windows versions, so +1
for removing everything after the comma.
