Re: [HACKERS] Ack...major(?) bug just found in v6.3.1... - Mailing list pgsql-hackers

From dg@illustra.com (David Gould)
Subject Re: [HACKERS] Ack...major(?) bug just found in v6.3.1...
Date
Msg-id 9804070253.AA17366@hawk.illustra.com
In response to Re: [HACKERS] Ack...major(?) bug just found in v6.3.1...  ("Vadim B. Mikheev" <vadim@sable.krasnoyarsk.su>)
List pgsql-hackers
> The Hermit Hacker wrote:
> >
> > acctng=> vacuum radlog;
> > NOTICE:  BlowawayRelationBuffers(radlog, 3): block 786 is referenced
> > (private 0, last 0, global 53)
>                       ^^^^^^^^^
> I assume that you got some FATAL before vacuum.
>
> We have problems with elog(FATAL): backend just exits (with normal code)
> and postmaster doesn't re-initialize shmem etc though backend could
> have some spinlocks and pinned buffers. This leaves system in unpredictable
> state!
>
> IMO, in elog(FATAL) backend should abort() (just like in ASSERT).

The other way to do this, and it might be a good idea, is to review the code
for elog() calls made while holding a spinlock and make sure the lock is
released first!
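
A rough sketch of that pattern, not taken from the tree: record the error
while the lock is held, release the lock, and only then report. Every name
here (toy_lock, toy_spin_acquire, report_fatal, add_to_counter) is made up
for illustration, the spinlock is a C11 atomic_flag rather than the real
shared-memory one, and report_fatal only prints where elog(FATAL) would
exit.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy spinlock standing in for a shared-memory spinlock (assumption). */
static atomic_flag toy_lock = ATOMIC_FLAG_INIT;
static bool toy_lock_held = false;  /* local bookkeeping for this process */
static int shared_counter = 0;      /* data protected by toy_lock */

static void toy_spin_acquire(void)
{
    while (atomic_flag_test_and_set_explicit(&toy_lock, memory_order_acquire))
        ;                           /* busy-wait until the flag is clear */
    toy_lock_held = true;
}

static void toy_spin_release(void)
{
    toy_lock_held = false;
    atomic_flag_clear_explicit(&toy_lock, memory_order_release);
}

/* Hypothetical stand-in for elog(FATAL); must never run with the lock held. */
static void report_fatal(const char *msg)
{
    assert(!toy_lock_held);
    fprintf(stderr, "FATAL: %s\n", msg);
}

/* Returns 0 on success, -1 if the update was rejected. */
static int add_to_counter(int delta)
{
    bool failed = false;

    toy_spin_acquire();
    if (delta < 0)
        failed = true;              /* remember the error, do not report yet */
    else
        shared_counter += delta;
    toy_spin_release();             /* lock is gone before any error path */

    if (failed)
    {
        report_fatal("negative delta");
        return -1;
    }
    return 0;
}
```

The assert in report_fatal is the kind of check a code review could lean on:
any error path that still holds the lock trips it immediately in a debug
build.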

This problem will get worse once we start allowing query cancellation from
the client, and once the spinlock backoff code goes in.

In the long run you have to make all the signal handlers safe. Safe means
the handler only sets a flag, and the main code periodically polls that flag
to catch whatever the condition is. Obviously this doesn't work for SEGV
etc., but it does for IO and the timers.
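
A minimal sketch of the flag-and-poll idea, using only ISO C signal() and a
sig_atomic_t flag. The names (cancel_requested, on_cancel,
install_cancel_handler, run_steps) are hypothetical and the loop body is a
placeholder for real work; this is an illustration, not the backend's code.

```c
#include <signal.h>

/* Set by the handler, read by the main loop. */
static volatile sig_atomic_t cancel_requested = 0;

/* The handler does nothing except set the flag. */
static void on_cancel(int signo)
{
    (void) signo;
    cancel_requested = 1;
}

/* Returns 0 if the handler was installed, -1 otherwise. */
static int install_cancel_handler(int signo)
{
    return (signal(signo, on_cancel) == SIG_ERR) ? -1 : 0;
}

/*
 * Runs up to nsteps units of work, polling the flag at a point where no
 * locks are held. Returns the number of steps actually completed.
 */
static int run_steps(int nsteps)
{
    int done = 0;

    for (int i = 0; i < nsteps; i++)
    {
        if (cancel_requested)
        {
            cancel_requested = 0;   /* acknowledge and stop cleanly */
            break;
        }
        /* ... one unit of real work would go here ... */
        done++;
    }
    return done;
}
```

The point of the design is that the cancel takes effect only at the poll,
so the code chooses where it is safe to stop instead of being interrupted
in the middle of a critical section.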

-dg

David Gould            dg@illustra.com           510.628.3783 or 510.305.9468
Informix Software  (No, really)         300 Lakeside Drive  Oakland, CA 94612
 - Linux. Not because it is free. Because it is better.

