Re: Race-condition with failed block-write? - Mailing list pgsql-bugs

From Tom Lane
Subject Re: Race-condition with failed block-write?
Date
Msg-id 26701.1126645297@sss.pgh.pa.us
In response to Re: Race-condition with failed block-write?  (Arjen van der Meijden <acm@tweakers.net>)
Responses Re: Race-condition with failed block-write?
List pgsql-bugs
Arjen van der Meijden <acm@tweakers.net> writes:
> These are all the lines there are for Sep 1:

> [ - 2005-09-01 12:36:50 CEST @] LOG:  received fast shutdown request
> [ - 2005-09-01 12:36:50 CEST @] LOG:  shutting down
> [ - 2005-09-01 12:36:50 CEST @] LOG:  database system is shut down
> [ - 2005-09-01 12:37:01 CEST @] LOG:  received smart shutdown request
> [ - 2005-09-01 12:37:01 CEST @] LOG:  shutting down
> [ - 2005-09-01 12:37:01 CEST @] LOG:  database system is shut down

That's all?  There's something awfully suspicious about that.  You're
sure this is 8.0.3?  AFAICS it is absolutely impossible for the 8.0
postmaster.c code to emit "received smart shutdown request" after
emitting "received fast shutdown request".  The SIGINT code looks like

            if (Shutdown >= FastShutdown)
                break;
            Shutdown = FastShutdown;
            ereport(LOG,
                    (errmsg("received fast shutdown request")));

and the SIGTERM code looks like

            if (Shutdown >= SmartShutdown)
                break;
            Shutdown = SmartShutdown;
            ereport(LOG,
                    (errmsg("received smart shutdown request")));

and there are no other places that change the value of Shutdown, and
certainly FastShutdown > SmartShutdown.  So I wonder if something got
lost in the log entries.
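
To make that argument concrete, here is a minimal standalone sketch of the two guards. It is not the postmaster code: the numeric values of the shutdown states, the log_msg() stand-in for ereport(), and the request_*/demo_* function names are all assumptions made for illustration. The only property taken from the code above is that FastShutdown > SmartShutdown.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed values; the guards only rely on FastShutdown > SmartShutdown. */
enum { NoShutdown = 0, SmartShutdown = 1, FastShutdown = 2 };

static int Shutdown = NoShutdown;

/* Hypothetical stand-in for ereport(LOG, ...): prints and records the message. */
static char last_msg[64];
static int msg_count = 0;

static void log_msg(const char *msg)
{
    snprintf(last_msg, sizeof last_msg, "%s", msg);
    msg_count++;
    printf("LOG:  %s\n", msg);
}

/* Same guard as the SIGINT branch quoted above. */
static void request_fast(void)
{
    if (Shutdown >= FastShutdown)
        return;
    Shutdown = FastShutdown;
    log_msg("received fast shutdown request");
}

/* Same guard as the SIGTERM branch quoted above. */
static void request_smart(void)
{
    if (Shutdown >= SmartShutdown)
        return;
    Shutdown = SmartShutdown;
    log_msg("received smart shutdown request");
}

/* Replays SIGINT followed by SIGTERM; returns the number of messages logged. */
static int demo_fast_then_smart(void)
{
    Shutdown = NoShutdown;
    msg_count = 0;
    request_fast();
    request_smart();    /* guard fires (2 >= 1), so nothing is logged here */
    assert(Shutdown == FastShutdown);
    return msg_count;
}
```

In this sketch, fast followed by smart logs only the fast message, because Shutdown never moves downward. The reverse order (smart, then fast) would log both messages, which is the opposite of the order in the log excerpt.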

Another question is why the postmaster didn't exit at 12:36:50.  It was
not waiting on any backends, else it would not have launched the
shutdown process (which is what emits the other two messages).

[ thinks for a bit ... ]  I wonder if Shutdown ought to be marked
volatile, since it is after all changed by a signal handler.  But given
the way the postmaster is coded, this doesn't seem likely to be an issue.
Basically all of the code runs with signals blocked.
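
For reference, the portable C idiom for a flag that a signal handler writes and the main line reads is volatile sig_atomic_t. The sketch below uses only standard C (signal() and raise()); the names shutdown_state, on_sigint, and raise_and_read are hypothetical and are not taken from postmaster.c, and the value 2 merely stands in for FastShutdown.

```c
#include <assert.h>
#include <signal.h>

/* volatile: the compiler must reload the value on every read;
 * sig_atomic_t: reads and writes are not torn by an interrupting signal. */
static volatile sig_atomic_t shutdown_state = 0;

/* Handler only ever raises the state, never lowers it. */
static void on_sigint(int signo)
{
    (void) signo;
    if (shutdown_state < 2)
        shutdown_state = 2;
}

/* Installs the handler, delivers SIGINT to ourselves, returns the new state. */
static int raise_and_read(void)
{
    if (signal(SIGINT, on_sigint) == SIG_ERR)
        return -1;
    raise(SIGINT);
    /* raise() does not return until the handler has returned */
    assert(shutdown_state == 2);
    return (int) shutdown_state;
}
```

When the handler can only run at a few well-defined points because signals are blocked everywhere else, a stale cached read is much less likely, which is why the missing qualifier is probably not the explanation here.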

Can you try to reconstruct what you did on Sep 1, and see whether you
can reproduce the above behavior?

            regards, tom lane
