Re: Cygwin cleanup - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: Cygwin cleanup
Msg-id CA+hUKGK2GYxgeCymCXPZU93GEOEv_wj548SaUvC+xoCXJ8+e3w@mail.gmail.com
In response to Re: Cygwin cleanup  (Justin Pryzby <pryzby@telsasoft.com>)
List pgsql-hackers
On Wed, Jul 27, 2022 at 6:44 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
> On Tue, Jul 26, 2022 at 04:24:25PM +1200, Thomas Munro wrote:
> > 3.  You can't really run PostgreSQL on Cygwin for real, because its
> > implementation of signals does not have reliable signal masking, so
> > unsubtle and probably also subtle breakage occurs.  That was reported
> > upstream by Noah years ago, but they aren't working on a fix.
> > lorikeet shows random failures, and presumably any CI system will do
> > the same...
>
> Reference: https://www.postgresql.org/message-id/20170321034703.GB2097809%40tornado.leadboat.com
>
> On my 2nd try:
>
> https://cirrus-ci.com/task/5311911574110208
> TRAP: FailedAssertion("mq->mq_sender == NULL", File: "shm_mq.c", Line: 230, PID: 16370)
> 2022-07-26 06:32:35.525 PDT [15538][postmaster] LOG:  background worker "parallel worker" (PID 16370) was terminated by signal 6: Aborted

Thanks for working on this!

Huh, that Cygwin being shipped by Choco is quite old, older than
lorikeet's, but not old enough to not have the bug:

[04:33:55.234] Starting cygwin install, version 2.918

Based on clues in Noah's emails in the archives, I think versions from
maybe somewhere around 2015 didn't have the bug, and then the bug
appeared, and AFAIK it's still here.  I wonder if you can tell Choco
to install an ancient version, but even if that's possible you'd be
dealing with other stupid problems and bugs.
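
For readers unfamiliar with what "reliable signal masking" means in the
quoted point 3, here is a minimal Python sketch of the POSIX behaviour
that is expected to hold.  It is only an illustration, not a reproducer
for the Cygwin bug: the function name demo_masking is made up for this
example, and it assumes a Unix-like Python where signal.pthread_sigmask
and signal.pthread_kill are available and the code runs in the main
thread.

```python
import signal
import threading


def demo_masking():
    """Block SIGUSR1, send it to this thread, and report what happened.

    Returns (delivered_while_blocked, pending_while_blocked,
    delivered_after_unblock).
    """
    received = []

    def handler(signum, frame):
        received.append(signum)

    old_handler = signal.signal(signal.SIGUSR1, handler)
    # Block SIGUSR1 in the calling thread.
    signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGUSR1})
    try:
        # Direct the signal at this thread so no other thread can take it.
        signal.pthread_kill(threading.get_ident(), signal.SIGUSR1)
        delivered_while_blocked = bool(received)
        pending_while_blocked = signal.SIGUSR1 in signal.sigpending()
    finally:
        # Unblocking lets the pending signal through.
        # (Assumes SIGUSR1 was not already blocked before this function ran.)
        signal.pthread_sigmask(signal.SIG_UNBLOCK, {signal.SIGUSR1})
    delivered_after_unblock = bool(received)

    # Restore the previous handler; fall back to the default if it was
    # not installed from Python.
    signal.signal(
        signal.SIGUSR1,
        old_handler if old_handler is not None else signal.SIG_DFL,
    )
    return delivered_while_blocked, pending_while_blocked, delivered_after_unblock
```

On a platform where masking works, the handler should not run while the
signal is blocked, the signal should show up as pending, and the handler
should run once the signal is unblocked.  Code that relies on this
guarantee breaks if a blocked signal is delivered anyway.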

> > XXX Doesn't get all the way through yet...
>
> Mainly because getopt was causing all tap tests to fail.
> I tried to fix that in configure, but ended up changing the callers.
>
> This is getting close, but I don't think it has actually managed to pass all tests
> yet.  https://cirrus-ci.com/task/5274721116749824

Woo.

> > 4.  When building with Cygwin GCC 11.3 you get a bunch of warnings
> > that don't show up on other platforms, seemingly indicating that it
> > interprets -Wimplicit-fallthrough=3 differently.  Huh?
>
> Evidently due to the same getopt issues.

Ahh, nice detective work.

> > XXX This should use a canned Docker image with all the right packages
> > installed
>
> Has anyone tried using non-canned images ?  It sounds like this could reduce
> the 4min startup time for windows.
>
> https://cirrus-ci.org/guide/docker-builder-vm/#dockerfile-as-a-ci-environment

Yeah, I had that working once.  Not sure what the pros and cons would be for us.

> > XXX configure is soooo slooow, can we cache it?!  Compiling is also
> > insanely slow, but ccache gets it down to a couple of minutes if you're
> > lucky
>
> One reason compiling was slow is because you ended up with -O2.

Ah, right.

> You can cache configure as long as you're willing to re-run it whenever options
> were changed.  That also applies to the existing headerscheck.
>
> > XXX I don't know how to put variables like BUILD_JOBS into the scripts
>
> WDYM ?  If it's outside of bash and in windows shell it's like %var%, right ?
> https://cirrus-ci.org/guide/writing-tasks/#environment-variables

Right.  I should have taken the clue from the %cd% (I got a few ideas
about how to do this from libarchive's CI scripting[1]).
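
On the configure-caching point quoted above (cache it, but re-run it
whenever the options change): one way to express that rule is to derive
the cache key from the options and the configure script itself.  This
is a rough Python sketch under that assumption.  The helper name
configure_cache_key is hypothetical and is not part of any existing CI
script.

```python
import hashlib


def configure_cache_key(options, configure_script):
    """Return a hex digest that changes when the options or script change.

    options: list of configure arguments, e.g. ["--enable-cassert"]
    configure_script: the bytes of the configure script
    """
    digest = hashlib.sha256()
    for opt in options:
        digest.update(opt.encode("utf-8"))
        digest.update(b"\0")  # separator so ["ab", "c"] differs from ["a", "bc"]
    digest.update(hashlib.sha256(configure_script).digest())
    return digest.hexdigest()
```

A cached configure result would then be reused only when the key
matches; any change to the option list or to the script produces a new
key and forces a fresh run.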

> I just noticed that cirrus is misbehaving: if there's a variable called CI
> (which there is), then it expands $CI_FOO like ${CI}_FOO rather than ${CI_FOO}.
> I've also seen weirdness when variable names or operators appear in the commit
> message...
>
> > XXX Needs some --with-X options
>
> Done

Neat.
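
To illustrate the $CI_FOO expansion oddity described in the quoted text,
here is a small Python sketch of the two interpretations.  This is not
Cirrus's actual implementation, which I have not looked at.  Both
function names are made up, and the second one is just one plausible
way to end up with the reported behaviour.

```python
import re


def expand_longest_name(text, env):
    """Shell-like rule: the name is the longest identifier after '$'."""
    return re.sub(
        r"\$([A-Za-z_][A-Za-z0-9_]*)",
        lambda m: env.get(m.group(1), ""),
        text,
    )


def expand_known_prefix(text, env):
    """Hypothetical buggy rule: substitute each known name wherever
    '$NAME' appears, even when it is only a prefix of a longer name."""
    for name, value in env.items():
        text = text.replace("$" + name, value)
    return text


env = {"CI": "true", "CI_FOO": "bar"}
print(expand_longest_name("$CI_FOO", env))   # bar
print(expand_known_prefix("$CI_FOO", env))   # true_FOO
```

With a variable named CI defined, the second rule turns $CI_FOO into the
value of CI followed by _FOO, which matches the ${CI}_FOO reading
described above.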

> > XXX We would never want this to run by default in CI, but it'd be nice
> > to be able to ask for it with ci-os-only!  (See commented out line)
> >  only_if: $CIRRUS_CHANGE_MESSAGE =~ '.*\nci-os-only:[^\n]*cygwin.*'
>
> Doesn't this already do what's needed?
> As long as it doesn't also check: CHANGE_MESSAGE !~ 'ci-os-only',
> the task will run only on request.

Yeah, I was just trying to say that I was sharing the script in a form
that always runs, but for commit we'd want that only_if line enabled.  This is all far too
slow for cfbot to have to deal with on every build.  Looks like we can
expect to be able to build and test fast on Windows soonish, though,
so maybe one day we'd just turn Cygwin and MSYS on?
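
For anyone who wants to check which commit messages the only_if pattern
quoted above would select, here is a small Python sketch.  It assumes
the expression has to match the whole commit message and that "." also
matches newlines.  That is my reading of how the pattern is meant to
work, not something verified against Cirrus itself, and the function
name would_run is made up.

```python
import re

# Pattern copied from the commented-out only_if line.
PATTERN = r".*\nci-os-only:[^\n]*cygwin.*"


def would_run(commit_message):
    # ASSUMPTION: whole-message match, with '.' allowed to match newlines.
    return re.fullmatch(PATTERN, commit_message, flags=re.DOTALL) is not None


print(would_run("Add Cygwin task\n\nci-os-only: cygwin\n"))  # True
print(would_run("Add Cygwin task\n"))                        # False
```

Under those assumptions the task is selected only when a ci-os-only:
line that mentions cygwin appears after the first line of the message.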

[1] https://github.com/libarchive/libarchive/blob/master/build/ci/cirrus_ci/ci.cmd