Re: backend hangs at immediate shutdown - Mailing list pgsql-hackers

From MauMau
Subject Re: backend hangs at immediate shutdown
Msg-id 611D451C5FB14A86889EA631B1D5B885@maumau
In response to Re: backend hangs at immediate shutdown  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
From: "Tom Lane" <tgl@sss.pgh.pa.us>
> "MauMau" <maumau307@gmail.com> writes:
>> How about the case where some backend crashes due to a bug of PostgreSQL?
>> In this case, postmaster sends SIGQUIT to all backends, too.  The instance
>> is expected to disappear cleanly and quickly.  Doesn't the hanging backend
>> harm the restart of the instance?
>
> [ shrug... ]  That isn't guaranteed, and never has been --- for
> instance, the process might have SIGQUIT blocked, perhaps as a result
> of third-party code we have no control over.

Are you concerned about user-defined C functions?  I don't think they need 
to block signals, so I don't find it too restrictive to say "do not block 
or send signals in user-defined functions."  If it is a real concern, it 
should be noted in the manual, rather than writing "avoid pg_ctl stop -mi 
as much as you can, because it can leave hanging backends."
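
To illustrate the quoted point about a blocked SIGQUIT, here is a minimal 
sketch in Python rather than PostgreSQL code.  It assumes a POSIX system; 
the names CHILD and survives_sigquit are made up for this example.  The 
child blocks SIGQUIT the way third-party code might, so the signal stays 
pending and the process keeps running, while SIGKILL still terminates it.

```python
import signal
import subprocess
import sys
import time

# Hypothetical child program: block SIGQUIT (as third-party code might),
# report readiness, then sleep for a long time.
CHILD = (
    "import signal, time\n"
    "signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGQUIT})\n"
    "print('ready', flush=True)\n"
    "time.sleep(60)\n"
)


def survives_sigquit(wait_seconds=0.5):
    """Return (alive_after_sigquit, exit_code_after_sigkill)."""
    child = subprocess.Popen([sys.executable, "-c", CHILD],
                             stdout=subprocess.PIPE)
    child.stdout.readline()            # wait until the signal mask is set
    child.send_signal(signal.SIGQUIT)  # stays pending because it is blocked
    time.sleep(wait_seconds)
    alive = child.poll() is None       # True if SIGQUIT did not end the child
    child.kill()                       # SIGKILL cannot be blocked or caught
    code = child.wait()
    child.stdout.close()
    return alive, code
```

Popen reports a child killed by a signal as the negated signal number, so 
the second value shows which signal actually ended the process.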

>> How about using SIGKILL instead of SIGQUIT?
>
> Because then we couldn't notify clients at all.  One practical
> disadvantage of that is that it would become quite hard to tell from
> the outside which client session actually crashed, which is frequently
> useful to know.

How does the message below help determine which client session actually 
crashed?  It contains no information about the crashed session. 
Are you talking about log_line_prefix?

ERROR:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back the 
current transaction and exit, because another server process exited 
abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and 
repeat your command.

It is LogChildExit(), not quickdie(), that emits the information needed 
to tell which session crashed.  So I don't think quickdie()'s message is 
very helpful.


> I think if we want to make it bulletproof we'd have to do what the
> OP suggested and switch to SIGKILL.  I'm not enamored of that for the
> reasons I mentioned --- but one idea that might dodge the disadvantages
> is to have the postmaster wait a few seconds and then SIGKILL any
> backends that hadn't exited.

I believe that SIGKILL is the only option, and a simple one.  Consider 
again: the purpose of "pg_ctl stop -mi" is to shut down the instance 
immediately and reliably.  If it is not reliable, what can we do instead?
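
The wait-then-SIGKILL idea quoted above can be sketched roughly as follows. 
This is an illustration in Python under stated assumptions, not how the 
postmaster is implemented: stop_children, grace_seconds and poll_interval 
are hypothetical names, the children are plain subprocess.Popen objects, 
and a POSIX system is assumed.

```python
import signal
import subprocess
import time


def stop_children(children, grace_seconds=5.0, poll_interval=0.05):
    """Send SIGQUIT to every running child, wait up to grace_seconds,
    then SIGKILL any child that has still not exited.

    Returns the PIDs that had to be killed with SIGKILL.
    """
    # Step 1: polite request, so well-behaved children can notify clients.
    for child in children:
        if child.poll() is None:
            child.send_signal(signal.SIGQUIT)

    # Step 2: wait for the grace period, leaving early if everyone is gone.
    deadline = time.monotonic() + grace_seconds
    while time.monotonic() < deadline:
        if all(child.poll() is not None for child in children):
            break
        time.sleep(poll_interval)

    # Step 3: force out the stragglers; SIGKILL cannot be blocked.
    killed = []
    for child in children:
        if child.poll() is None:
            child.kill()
            killed.append(child.pid)
        child.wait()  # reap the process so no zombie is left behind
    return killed
```

The grace period preserves the client notification for backends that 
respond to SIGQUIT, while the final step bounds how long a hung process 
can delay shutdown.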


Regards
MauMau



