Re: [HACKERS] logical replication launcher crash on buildfarm - Mailing list pgsql-hackers

From Andres Freund
Subject Re: [HACKERS] logical replication launcher crash on buildfarm
Msg-id 20170316084423.whlkxjtm735tqjgu@alap3.anarazel.de
In response to Re: [HACKERS] logical replication launcher crash on buildfarm  (Petr Jelinek <petr.jelinek@2ndquadrant.com>)
Responses Re: [HACKERS] logical replication launcher crash on buildfarm  (Petr Jelinek <petr.jelinek@2ndquadrant.com>)
List pgsql-hackers
On 2017-03-16 09:40:48 +0100, Petr Jelinek wrote:
> On 16/03/17 04:42, Andres Freund wrote:
> > On 2017-03-15 20:28:33 -0700, Andres Freund wrote:
> >> Hi,
> >>
> >> I just unstuck a bunch of my buildfarm animals.  That triggered some
> >> spurious failures (on piculet, calliphoridae, mylodon), but also one
> >> that doesn't really look like that:
> >> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=culicidae&dt=2017-03-16%2002%3A40%3A03
> >>
> >> with the pertinent point being:
> >>
> >> ================== stack trace: pgsql.build/src/test/regress/tmp_check/data/core ==================
> >> [New LWP 1894]
> >> [Thread debugging using libthread_db enabled]
> >> Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
> >> Core was generated by `postgres: bgworker: logical replication launcher                '.
> >> Program terminated with signal SIGSEGV, Segmentation fault.
> >> #0  0x000055e265bff5e3 in ?? ()
> >> #0  0x000055e265bff5e3 in ?? ()
> >> #1  0x000055d3ccabed0d in StartBackgroundWorker () at /home/andres/build/buildfarm-culicidae/HEAD/pgsql.build/../pgsql/src/backend/postmaster/bgworker.c:792
> >> #2  0x000055d3ccacf4fc in SubPostmasterMain (argc=3, argv=0x55d3cdbb71c0) at /home/andres/build/buildfarm-culicidae/HEAD/pgsql.build/../pgsql/src/backend/postmaster/postmaster.c:4878
> >> #3  0x000055d3cca443ea in main (argc=3, argv=0x55d3cdbb71c0) at /home/andres/build/buildfarm-culicidae/HEAD/pgsql.build/../pgsql/src/backend/main/main.c:205
> >>
> >> it's possible that me killing things and upgrading caused this, but
> >> given this is a backend running EXEC_BACKEND, I'm a bit suspicious that
> >> it's more than that.  The machine is a bit backed up at the moment, so
> >> it'll probably be a while till it's at that animal/branch again,
> >> otherwise I'd not have mentioned this.
> > 
> > For some reason it ran again pretty soon. And I'm afraid it's indeed an
> > issue:
> > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=culicidae&dt=2017-03-16%2003%3A30%3A02
> > 
> 
> Hmm, I tried with EXEC_BACKEND (and with --disable-spinlocks) and it
> seems to work fine on my two machines. I don't see anything else
> different on culicidae though. Sadly the backtrace is not that
> informative either. I'll try to investigate more but it will take time...

I can give you a login to that machine, it doesn't do anything but run
buildfarm animals...  Will have to be my tomorrow however.

(Also need to fix config for older branches that don't work with
the upgraded ssl. This is a really bad situation :()

Greetings,

Andres Freund


