Thread: In 8.2, shutdown wrongly caused automatic restart

From: Fujii Masao

Hi,

On a customer's system running v8.2.5, a fast shutdown wrongly caused an
automatic restart of the database, even though no process had exited
abnormally. The server log is as follows:

Jul 18 16:21:51  postgres[26624]: [1-1] LOG:  received fast shutdown request
Jul 18 16:21:51  postgres[26624]: [2-1] LOG:  aborting any active transactions
Jul 18 16:21:51  postgres[3475]: [1-1] FATAL:  terminating connection due to administrator command
Jul 18 16:21:51  postgres[6978]: [1-1] FATAL:  terminating connection due to administrator command
Jul 18 16:21:51  postgres[30868]: [1-1] FATAL:  terminating connection due to administrator command
Jul 18 16:21:51  postgres[15980]: [1-1] FATAL:  terminating connection due to administrator command
Jul 18 16:21:51  postgres[27501]: [1-1] FATAL:  terminating connection due to administrator command
Jul 18 16:21:51  postgres[24341]: [1-1] FATAL:  terminating connection due to administrator command
Jul 18 16:21:51  postgres[28270]: [1-1] LOG:  shutting down
Jul 18 16:21:51  postgres[28270]: [2-1] LOG:  database system is shut down
Jul 18 16:21:53  postgres[11776]: [3-1] FATAL:  the database system is shutting down
Jul 18 16:21:53  postgres[11779]: [3-1] FATAL:  the database system is shutting down
Jul 18 16:21:53  postgres[11780]: [3-1] FATAL:  the database system is shutting down
Jul 18 16:21:53  postgres[11781]: [3-1] FATAL:  the database system is shutting down
Jul 18 16:21:53  postgres[26624]: [3-1] LOG:  background writer process (PID 28270) exited with exit code 0
Jul 18 16:22:01  postgres[26624]: [4-1] LOG:  terminating any other active server processes
Jul 18 16:22:04  postgres[26624]: [5-1] LOG:  all server processes terminated; reinitializing
Jul 18 16:22:07  postgres[11801]: [6-1] LOG:  database system was shut down at 2010-07-18 16:21:51 JST


As far as I can tell from reading the source code, the problem seems to
have happened as follows.

1. After the postmaster received SIGINT, it forcibly terminated all the
   backends.
2. After the postmaster confirmed that no backend was running, it sent
   SIGUSR2 to the bgwriter.
3. After the bgwriter received SIGUSR2, it started the shutdown checkpoint.
4. The postmaster accepted a new connection from a client and forked a new
   backend.
5. The bgwriter completed the checkpoint and exited normally before the new
   backend was rejected because of the database state.
6. Now the bgwriter had already exited even though a backend was still
   running. The postmaster regarded this situation as abnormal and started
   the recovery.

In 8.3 and later, the postmaster doesn't regard that situation as abnormal
and just waits for all backends to exit again, so the problem doesn't happen.
I think this is a bug in 8.2 and should be fixed in the same way as in 8.3.
Thoughts?
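
To make the difference concrete, here is a minimal sketch of the decision the
postmaster faces when the bgwriter exits with code 0 during a fast shutdown.
It is a toy model written for this discussion, not the actual postmaster code:
the names Outcome, on_bgwriter_exit, live_backends and waits_for_backends are
all made up for illustration, and the two branches only encode the behavior
described above for 8.2 and for 8.3 and later.

```python
from enum import Enum


class Outcome(Enum):
    CLEAN_SHUTDOWN = "clean shutdown"
    CRASH_RESTART = "terminate children and reinitialize"


def on_bgwriter_exit(live_backends: int, waits_for_backends: bool) -> Outcome:
    """Toy model of the postmaster's reaction to a normal bgwriter exit
    (exit code 0) after the shutdown checkpoint.

    live_backends: backends still alive at that moment, e.g. one forked in
        step 4 that has not yet been rejected (hypothetical parameter).
    waits_for_backends: False models the 8.2 behavior described in this
        thread, True models the 8.3-and-later behavior (hypothetical flag).
    """
    if live_backends == 0:
        # Nothing else is running, so the shutdown simply completes.
        return Outcome.CLEAN_SHUTDOWN
    if waits_for_backends:
        # 8.3-style: the late backends will fail with "the database system
        # is shutting down" and exit on their own; just wait for them.
        return Outcome.CLEAN_SHUTDOWN
    # 8.2-style: a live backend without a bgwriter is treated as abnormal.
    return Outcome.CRASH_RESTART


if __name__ == "__main__":
    for backends in (0, 1):
        for waits in (False, True):
            result = on_bgwriter_exit(backends, waits)
            print(f"live_backends={backends} waits={waits}: {result.value}")
```

Only the combination "a backend is still alive" and "do not wait" leads to the
restart, which matches the narrow timing window in steps 4 and 5.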

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Re: In 8.2, shutdown wrongly caused automatic restart

From: Tom Lane

Fujii Masao <masao.fujii@gmail.com> writes:
> 6. ... the bgwriter had already ended even though there was a backend in
>    progress. Postmaster regarded this situation as abnormal and caused the
>    recovery.

> In 8.3 or later, since postmaster doesn't regard that situation as abnormal
> and just waits for all backends to exit again, the problem doesn't happen.
> I think that this is a bug in 8.2 and should be fixed in the same way as 8.3
> does. Thought?

My recollection is that that change was associated with some pretty
significant revisions to the postmaster state machine.  I'm concerned
about the risks involved in back-patching that.  This seems to be a
corner case with pretty minimal consequences anyway, so I'm inclined
to leave 8.2 alone.

Now, if I'm wrong about that and you can produce a simple and obviously
correct patch for 8.2, go ahead.

            regards, tom lane

Re: In 8.2, shutdown wrongly caused automatic restart

From: Alvaro Herrera

Excerpts from Tom Lane's message of Wed Aug 04 12:37:23 -0400 2010:
> Fujii Masao <masao.fujii@gmail.com> writes:
> > 6. ... the bgwriter had already ended even though there was a backend in
> >    progress. Postmaster regarded this situation as abnormal and caused the
> >    recovery.
>
> > In 8.3 or later, since postmaster doesn't regard that situation as abnormal
> > and just waits for all backends to exit again, the problem doesn't happen.
> > I think that this is a bug in 8.2 and should be fixed in the same way as 8.3
> > does. Thought?
>
> My recollection is that that change was associated with some pretty
> significant revisions to the postmaster state machine.  I'm concerned
> about the risks involved in back-patching that.  This seems to be a
> corner case with pretty minimal consequences anyway, so I'm inclined
> to leave 8.2 alone.

IIRC this is the kind of thing that "dead-end backends" were invented
for.  It was too large a patch for backpatching, IMHO.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: In 8.2, shutdown wrongly caused automatic restart

From: Fujii Masao

On Thu, Aug 5, 2010 at 1:45 AM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:
> Excerpts from Tom Lane's message of Wed Aug 04 12:37:23 -0400 2010:
>> My recollection is that that change was associated with some pretty
>> significant revisions to the postmaster state machine.  I'm concerned
>> about the risks involved in back-patching that.  This seems to be a
>> corner case with pretty minimal consequences anyway, so I'm inclined
>> to leave 8.2 alone.
>
> IIRC this is the kind of thing that "dead-end backends" were invented
> for.  It was too large a patch for backpatching, IMHO.

Having thought about this issue for a while, I agree that back-patching
the change is risky.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center