Re: Strange failure on mamba - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Strange failure on mamba
Msg-id 1473980.1669854969@sss.pgh.pa.us
In response to Re: Strange failure on mamba  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
Andres Freund <andres@anarazel.de> writes:
> On 2022-11-30 18:33:06 -0500, Tom Lane wrote:
>> Even if somebody comes up with a rewrite to avoid doing interesting stuff in
>> the postmaster's signal handlers, we surely wouldn't risk back-patching it.

> Would that actually fix anything, given NetBSD's brokenness? If we used a
> latch-like mechanism, the signal handler would still use functions in libc. So
> postmaster could deadlock, at least during the first execution of a signal
> handler? So I think 8acd8f869 continues to be important...

I agree that "-z now" is a good idea for performance reasons, but
what we're seeing is that it's only a partial fix for NetBSD's issue,
since it doesn't apply to shared libraries that the postmaster pulls
in.

I'm not sure about your thesis that things are fundamentally broken.
It does seem like if a signal handler does SetLatch then that could
require PLT resolution, and if it interrupts something else doing
PLT resolution then we have a problem.  But if it were a live
problem then we'd have seen instances outside of the postmaster's
select() wait, and we haven't.
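
To make the concern concrete, here is a minimal sketch of a latch-like
wakeup built on the self-pipe trick.  It is not PostgreSQL's latch code;
wakeup_pipe, wakeup_handler, wakeup_init and wakeup_wait are hypothetical
names used only for illustration.  The handler does nothing but set a flag
and call write(), yet with lazy binding the first call to write() still
goes through the PLT and into the dynamic linker, which is the case being
discussed.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical self-pipe: [0] is the read end, [1] is the write end. */
static int wakeup_pipe[2] = {-1, -1};
static volatile sig_atomic_t got_signal = 0;

/*
 * Signal handler: set a flag and poke the pipe.  write() is
 * async-signal-safe, but in a lazily bound executable its first call is
 * still resolved through the PLT by the dynamic linker.
 */
static void
wakeup_handler(int signo)
{
    int save_errno = errno;

    (void) signo;
    got_signal = 1;
    (void) write(wakeup_pipe[1], "x", 1);   /* EAGAIN on a full pipe is fine */
    errno = save_errno;
}

/* Create the non-blocking pipe and install the handler.  Returns 0 if OK. */
static int
wakeup_init(int signo)
{
    struct sigaction sa;
    int i;

    if (pipe(wakeup_pipe) != 0)
        return -1;
    for (i = 0; i < 2; i++)
    {
        int flags = fcntl(wakeup_pipe[i], F_GETFL);

        if (flags < 0 || fcntl(wakeup_pipe[i], F_SETFL, flags | O_NONBLOCK) < 0)
            return -1;
    }
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = wakeup_handler;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}

/*
 * Wait up to timeout_ms for the handler to fire.
 * Returns 1 if a signal was seen, 0 on timeout, -1 on error.
 */
static int
wakeup_wait(int timeout_ms)
{
    fd_set rfds;
    struct timeval tv;
    char buf[16];
    int rc;

    assert(wakeup_pipe[0] >= 0);            /* wakeup_init() must run first */
    FD_ZERO(&rfds);
    FD_SET(wakeup_pipe[0], &rfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    rc = select(wakeup_pipe[0] + 1, &rfds, NULL, NULL, &tv);
    if (rc < 0 && errno != EINTR)
        return -1;

    /* Drain whatever the handler wrote; the pipe is non-blocking. */
    while (read(wakeup_pipe[0], buf, sizeof(buf)) > 0)
        ;

    if (got_signal)
    {
        got_signal = 0;
        return 1;
    }
    return 0;
}
```

A caller would run wakeup_init(SIGUSR1) once at startup and then loop on
wakeup_wait().  Nothing in this sketch avoids the PLT issue by itself; it
only shows that even a handler this small still calls into libc.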

I'm kind of inclined to band-aid that select() call as previously
suggested, and see where we end up.

            regards, tom lane


