Re: Need help debugging SIGBUS crashes - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Need help debugging SIGBUS crashes
Msg-id 392255.1773756727@sss.pgh.pa.us
In response to Re: Need help debugging SIGBUS crashes  (Tomas Vondra <tomas@vondra.me>)
Responses Re: Need help debugging SIGBUS crashes
List pgsql-hackers
Tomas Vondra <tomas@vondra.me> writes:
> On 3/17/26 13:17, Peter 'PMc' Much wrote:
>> So I am now quite clueless on how to proceed further, and could
>> really use some educated inspiration. I can not even say if this is
>> a postgres issue or a FreeBSD issue (but it doesn't happen to any
>> other program).

> I agree it's hard to deduce anything from the backtraces with the
> interesting bits optimized out. Rebuilding the OS with -O0 might be an
> overkill, I'd probably start by building just Postgres. That'd at least
> give us some idea what happens there, you could inspect the memory
> context etc.

What I'm seeing is that malloc's internal data structures are already
corrupt during startup of an autovacuum worker.  I think the most
likely theory is that this somehow traces to our old habit of
launching postmaster child processes from a signal handler, something
that violates the spirit and probably the letter of POSIX, and which
we can clearly see was being done here.  But we got rid of that in PG
v16, so if I were Peter my first move would be to upgrade to something
later than 15.x.

Why it was okay in older FreeBSD and not so much in FreeBSD 14, who knows?
But the FreeBSD guys will almost certainly wash their hands of the
matter the moment they see this stack trace.  I don't think there's
a lot of point in digging deeper unless it still reproduces with
a newer Postgres.

            regards, tom lane
