Back-branch update releases coming in a couple weeks

From: Tom Lane
Since we've fixed a couple of relatively nasty bugs recently, the core
committee has determined that it'd be a good idea to push out PG update
releases soon.  The current plan is to wrap on Monday Feb 4 for public
announcement Thursday Feb 7.  If you're aware of any bug fixes you think
ought to get included, now's the time to get them done ...
        regards, tom lane



Re: Back-branch update releases coming in a couple weeks

From: Stephen Frost
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Since we've fixed a couple of relatively nasty bugs recently, the core
> committee has determined that it'd be a good idea to push out PG update
> releases soon.  The current plan is to wrap on Monday Feb 4 for public
> announcement Thursday Feb 7.  If you're aware of any bug fixes you think
> ought to get included, now's the time to get them done ...

Thanks for that.

Should we consider including Simon's fix for an issue which Noah noted
in this thread?:

http://www.postgresql.org/message-id/CA+U5nMKBrqFxyohr=JSDpgxZ6y0nfAdmt=K3hK4Zzfqo1MHSJg@mail.gmail.com

Admittedly, it's been this way for 3+ years, but there's concurrence on
that thread that it's a bug, so I figured I'd mention it.
Thanks,
    Stephen

Re: Back-branch update releases coming in a couple weeks

From: Tom Lane
Stephen Frost <sfrost@snowman.net> writes:
> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
>> Since we've fixed a couple of relatively nasty bugs recently, the core
>> committee has determined that it'd be a good idea to push out PG update
>> releases soon.  The current plan is to wrap on Monday Feb 4 for public
>> announcement Thursday Feb 7.  If you're aware of any bug fixes you think
>> ought to get included, now's the time to get them done ...

> Should we consider including Simon's fix for an issue which Noah noted
> in this thread?:
> http://www.postgresql.org/message-id/CA+U5nMKBrqFxyohr=JSDpgxZ6y0nfAdmt=K3hK4Zzfqo1MHSJg@mail.gmail.com

It sounds like there's something to fix there, but AFAICS the thread is
still arguing about the best fix.  There's time to do it non-hastily.
        regards, tom lane



Re: Back-branch update releases coming in a couple weeks

From: "MauMau"
From: "Tom Lane" <tgl@sss.pgh.pa.us>
> Since we've fixed a couple of relatively nasty bugs recently, the core
> committee has determined that it'd be a good idea to push out PG update
> releases soon.  The current plan is to wrap on Monday Feb 4 for public
> announcement Thursday Feb 7.  If you're aware of any bug fixes you think
> ought to get included, now's the time to get them done ...

I've just encountered a serious problem, and I really wish it would be fixed 
in the upcoming minor release.  Could you tell me whether this is already 
fixed and will be included?

I'm using synchronous streaming replication with PostgreSQL 9.1.6 on Linux. 
There are two nodes: one is primary and the other is a standby.  When I 
performed failover tests by doing "pg_ctl stop -mi" against the primary 
while some applications are reading/writing the database, the standby 
crashed while it was performing recovery after receiving promote request:

...
LOG:  archive recovery complete
WARNING:  page 506747 of relation base/482272/482304 was uninitialized
PANIC:  WAL contains references to invalid pages
LOG:  startup process (PID 8938) was terminated by signal 6: Aborted
LOG:  terminating any other active server processes
(the log ends here)

The mentioned relation is an index.  The contents of the referred page were 
all zeros when I looked at it with "od -t x $PGDATA/base/482272/482304.3" 
after the crash.  The subsequent pages of the same file had valid-seeming 
contents.
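For anyone who wants to script the same check MauMau did with od, a minimal C sketch (my own illustration, not part of PostgreSQL) that reports whether a given page of a segment file is entirely zero bytes:

```c
#include <stdio.h>

#define PAGE_SIZE 8192          /* default PostgreSQL block size */

/* Return 1 if page `pageno` of the given segment file is entirely zero
   bytes, 0 if any byte is nonzero, -1 on I/O error (e.g. the page lies
   past the end of the file). */
int page_is_all_zeros(const char *path, long pageno)
{
    unsigned char buf[PAGE_SIZE];
    size_t n, i;
    FILE *f = fopen(path, "rb");

    if (f == NULL)
        return -1;
    if (fseek(f, pageno * (long) PAGE_SIZE, SEEK_SET) != 0)
    {
        fclose(f);
        return -1;
    }
    n = fread(buf, 1, PAGE_SIZE, f);
    fclose(f);
    if (n != PAGE_SIZE)
        return -1;
    for (i = 0; i < PAGE_SIZE; i++)
        if (buf[i] != 0)
            return 0;
    return 1;
}
```

Assuming a stock build (8 kB blocks, 1 GB segment files, i.e. 131072 pages per segment), relation-wide page 506747 would be segment-relative page 506747 - 3 * 131072 = 113531 in the ".3" file, which matches the file inspected above.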

I searched through the PostgreSQL mailing lists for "WAL contains references to 
invalid pages", and I found 19 messages.  Some people encountered a similar 
problem.  There were some discussions regarding those problems (Tom and 
Simon Riggs commented), but those discussions did not reach a solution.

I also found a discussion which might relate to this problem.  Does this fix 
the problem?

[BUG] lag of minRecoveryPont in archive recovery
http://www.postgresql.org/message-id/20121206.130458.170549097.horiguchi.kyotaro@lab.ntt.co.jp


Regards
MauMau




Re: Back-branch update releases coming in a couple weeks

From: Fujii Masao
On Thu, Jan 24, 2013 at 7:42 AM, MauMau <maumau307@gmail.com> wrote:
> From: "Tom Lane" <tgl@sss.pgh.pa.us>
>
>> Since we've fixed a couple of relatively nasty bugs recently, the core
>> committee has determined that it'd be a good idea to push out PG update
>> releases soon.  The current plan is to wrap on Monday Feb 4 for public
>> announcement Thursday Feb 7.  If you're aware of any bug fixes you think
>> ought to get included, now's the time to get them done ...
>
>
> I've just encountered a serious problem, and I really wish it would be fixed
> in the upcoming minor release.  Could you tell me whether this is already
> fixed and will be included?
>
> I'm using synchronous streaming replication with PostgreSQL 9.1.6 on Linux.
> There are two nodes: one is primary and the other is a standby.  When I
> performed failover tests by doing "pg_ctl stop -mi" against the primary
> while some applications are reading/writing the database, the standby
> crashed while it was performing recovery after receiving promote request:
>
> ...
> LOG:  archive recovery complete
> WARNING:  page 506747 of relation base/482272/482304 was uninitialized
> PANIC:  WAL contains references to invalid pages
> LOG:  startup process (PID 8938) was terminated by signal 6: Aborted
> LOG:  terminating any other active server processes
> (the log ends here)
>
> The mentioned relation is an index.  The contents of the referred page was
> all zeros when I looked at it with "od -t x $PGDATA/base/482272/482304.3"
> after the crash.  The subsequent pages of the same file had valid-seeming
> contents.
>
> I searched through PostgreSQL mailing lists with "WAL contains references to
> invalid pages", and i found 19 messages.  Some people encountered similar
> problem.  There were some discussions regarding those problems (Tom and
> Simon Riggs commented), but those discussions did not reach a solution.
>
> I also found a discussion which might relate to this problem.  Does this fix
> the problem?
>
> [BUG] lag of minRecoveryPont in archive recovery
> http://www.postgresql.org/message-id/20121206.130458.170549097.horiguchi.kyotaro@lab.ntt.co.jp

Yes. Could you check whether you can reproduce the problem on the
latest REL9_2_STABLE?

Regards,

-- 
Fujii Masao



Re: Back-branch update releases coming in a couple weeks

From: "MauMau"
From: "Fujii Masao" <masao.fujii@gmail.com>
> On Thu, Jan 24, 2013 at 7:42 AM, MauMau <maumau307@gmail.com> wrote:
>> I searched through PostgreSQL mailing lists with "WAL contains references 
>> to
>> invalid pages", and i found 19 messages.  Some people encountered similar
>> problem.  There were some discussions regarding those problems (Tom and
>> Simon Riggs commented), but those discussions did not reach a solution.
>>
>> I also found a discussion which might relate to this problem.  Does this 
>> fix
>> the problem?
>>
>> [BUG] lag of minRecoveryPont in archive recovery
>> http://www.postgresql.org/message-id/20121206.130458.170549097.horiguchi.kyotaro@lab.ntt.co.jp
>
> Yes. Could you check whether you can reproduce the problem on the
> latest REL9_2_STABLE?

I tried to reproduce the problem by doing "pg_ctl stop -mi" against the 
primary more than ten times on REL9_2_STABLE, but the problem did not 
appear.  However, I encountered the crash only once out of dozens of 
failovers, possibly more than a hundred, on PostgreSQL 9.1.6.  So, I'm 
not sure the problem is fixed in REL9_2_STABLE.

I'm wondering if the fix discussed in the above thread solves my problem.  I 
found the following differences between Horiguchi-san's case and my case:

(1)
Horiguchi-san says the bug outputs the message:

WARNING:  page 0 of relation base/16384/16385 does not exist

On the other hand, I got the message:

WARNING:  page 506747 of relation base/482272/482304 was uninitialized


(2)
Horiguchi-san produced the problem when he shut the standby down immediately 
and restarted it.  However, I saw the problem during failover.


(3)
Horiguchi-san did not use any index, but in my case the WARNING message 
refers to an index.


But there's a similar point.  Horiguchi-san says the problem occurs after 
DELETE+VACUUM.  In my case, I shut the primary down while the application 
was doing INSERT/UPDATE.  As the below messages show, some vacuuming was 
running just before the immediate shutdown:

...
LOG:  automatic vacuum of table "GOLD.scm1.tbl1": index scans: 0
pages: 0 removed, 36743 remain
tuples: 0 removed, 73764 remain
system usage: CPU 0.09s/0.11u sec elapsed 0.66 sec
LOG:  automatic analyze of table "GOLD.scm1.tbl1" system usage: CPU 0.00s/0.14u sec elapsed 0.32 sec
LOG:  automatic vacuum of table "GOLD.scm1.tbl2": index scans: 0
pages: 0 removed, 12101 remain
tuples: 40657 removed, 44142 remain
system usage: CPU 0.06s/0.06u sec elapsed 0.30 sec
LOG:  automatic analyze of table "GOLD.scm1.tbl2" system usage: CPU 0.00s/0.06u sec elapsed 0.14 sec
LOG:  received immediate shutdown request
...


Could you tell me the details of the problem discussed and fixed in the 
upcoming minor release?  I would like to know the phenomenon and its 
conditions, and whether it applies to my case.

Regards
MauMau





Re: Back-branch update releases coming in a couple weeks

From: Fujii Masao
On Thu, Jan 24, 2013 at 11:53 PM, MauMau <maumau307@gmail.com> wrote:
> From: "Fujii Masao" <masao.fujii@gmail.com>
>>
>> On Thu, Jan 24, 2013 at 7:42 AM, MauMau <maumau307@gmail.com> wrote:
>>>
>>> I searched through PostgreSQL mailing lists with "WAL contains references
>>> to
>>> invalid pages", and i found 19 messages.  Some people encountered similar
>>> problem.  There were some discussions regarding those problems (Tom and
>>> Simon Riggs commented), but those discussions did not reach a solution.
>>>
>>> I also found a discussion which might relate to this problem.  Does this
>>> fix
>>> the problem?
>>>
>>> [BUG] lag of minRecoveryPont in archive recovery
>>>
>>> http://www.postgresql.org/message-id/20121206.130458.170549097.horiguchi.kyotaro@lab.ntt.co.jp
>>
>>
>> Yes. Could you check whether you can reproduce the problem on the
>> latest REL9_2_STABLE?
>
>
> I tried to produce the problem by doing "pg_ctl stop -mi" against the
> primary more than ten times on REL9_2_STABLE, but the problem did not
> appear.  However, I encountered the crash only once out of dozens of
> failovers, possibly more than a hundred times, on PostgreSQL 9.1.6.  So, I'm
> not sure the problem is fixed in REL9_2_STABLE.

Can you reproduce the problem in REL9_1_STABLE?

Sorry, I pointed you at the wrong version, i.e., REL9_2_STABLE.  Since you 
encountered the problem in 9.1.6, you need to retry the test in REL9_1_STABLE.

>
> I'm wondering if the fix discussed in the above thread solves my problem.  I
> found the following differences between Horiguchi-san's case and my case:
>
> (1)
> Horiguchi-san says the bug outputs the message:
>
> WARNING:  page 0 of relation base/16384/16385 does not exist
>
> On the other hand, I got the message:
>
>
> WARNING:  page 506747 of relation base/482272/482304 was uninitialized
>
>
> (2)
> Horiguchi-san produced the problem when he shut the standby immediately and
> restarted it.  However, I saw the problem during failover.
>
>
> (3)
> Horiguchi-san did not use any index, but in my case the WARNING message
> refers to an index.
>
>
> But there's a similar point.  Horiguchi-san says the problem occurs after
> DELETE+VACUUM.  In my case, I shut the primary down while the application
> was doing INSERT/UPDATE.  As the below messages show, some vacuuming was
> running just before the immediate shutdown:
>
> ...
> LOG:  automatic vacuum of table "GOLD.scm1.tbl1": index scans: 0
> pages: 0 removed, 36743 remain
> tuples: 0 removed, 73764 remain
> system usage: CPU 0.09s/0.11u sec elapsed 0.66 sec
> LOG:  automatic analyze of table "GOLD.scm1.tbl1" system usage: CPU
> 0.00s/0.14u sec elapsed 0.32 sec
> LOG:  automatic vacuum of table "GOLD.scm1.tbl2": index scans: 0
> pages: 0 removed, 12101 remain
> tuples: 40657 removed, 44142 remain system usage: CPU 0.06s/0.06u sec
> elapsed 0.30 sec
> LOG:  automatic analyze of table "GOLD.scm1.tbl2" system usage: CPU
> 0.00s/0.06u sec elapsed 0.14 sec
> LOG:  received immediate shutdown request
> ...
>
>
> Could you tell me the details of the problem discussed and fixed in the
> upcoming minor release?  I would to like to know the phenomenon and its
> conditions, and whether it applies to my case.

http://www.postgresql.org/message-id/20121206.130458.170549097.horiguchi.kyotaro@lab.ntt.co.jp

Could you read the discussion in the above thread?

Regards,

-- 
Fujii Masao



Re: Back-branch update releases coming in a couple weeks

From: "MauMau"
From: "Fujii Masao" <masao.fujii@gmail.com>
> On Thu, Jan 24, 2013 at 11:53 PM, MauMau <maumau307@gmail.com> wrote:
>> I'm wondering if the fix discussed in the above thread solves my problem. 
>> I
>> found the following differences between Horiguchi-san's case and my case:
>>
>> (1)
>> Horiguchi-san says the bug outputs the message:
>>
>> WARNING:  page 0 of relation base/16384/16385 does not exist
>>
>> On the other hand, I got the message:
>>
>>
>> WARNING:  page 506747 of relation base/482272/482304 was uninitialized
>>
>>
>> (2)
>> Horiguchi-san produced the problem when he shut the standby immediately 
>> and
>> restarted it.  However, I saw the problem during failover.
>>
>>
>> (3)
>> Horiguchi-san did not use any index, but in my case the WARNING message
>> refers to an index.
>>
>>
>> But there's a similar point.  Horiguchi-san says the problem occurs after
>> DELETE+VACUUM.  In my case, I shut the primary down while the application
>> was doing INSERT/UPDATE.  As the below messages show, some vacuuming was
>> running just before the immediate shutdown:
>>
>> ...
>> LOG:  automatic vacuum of table "GOLD.scm1.tbl1": index scans: 0
>> pages: 0 removed, 36743 remain
>> tuples: 0 removed, 73764 remain
>> system usage: CPU 0.09s/0.11u sec elapsed 0.66 sec
>> LOG:  automatic analyze of table "GOLD.scm1.tbl1" system usage: CPU
>> 0.00s/0.14u sec elapsed 0.32 sec
>> LOG:  automatic vacuum of table "GOLD.scm1.tbl2": index scans: 0
>> pages: 0 removed, 12101 remain
>> tuples: 40657 removed, 44142 remain system usage: CPU 0.06s/0.06u sec
>> elapsed 0.30 sec
>> LOG:  automatic analyze of table "GOLD.scm1.tbl2" system usage: CPU
>> 0.00s/0.06u sec elapsed 0.14 sec
>> LOG:  received immediate shutdown request
>> ...
>>
>>
>> Could you tell me the details of the problem discussed and fixed in the
>> upcoming minor release?  I would to like to know the phenomenon and its
>> conditions, and whether it applies to my case.
>
> http://www.postgresql.org/message-id/20121206.130458.170549097.horiguchi.kyotaro@lab.ntt.co.jp
>
> Could you read the discussion in the above thread?

Yes, I've just read the discussion (it was difficult for me...)

Although you said the fix will solve my problem, I don't feel it will.  The 
discussion is about the crash when the standby "re"starts after the primary 
vacuums and truncates a table.  On the other hand, in my case, the standby 
crashed during failover (not at restart), emitting a message that some WAL 
record refers to an "uninitialized" page (not a non-existent page) of an 
"index" (not a table).

In addition, fujii_test.sh did not reproduce the mentioned crash on 
PostgreSQL 9.1.6.

I'm sorry to cause you trouble, but could you elaborate on how the fix 
relates to my case?

Regards
MauMau




Re: Back-branch update releases coming in a couple weeks

From: Fujii Masao
On Sun, Jan 27, 2013 at 12:17 AM, MauMau <maumau307@gmail.com> wrote:
> From: "Fujii Masao" <masao.fujii@gmail.com>
>>
>> On Thu, Jan 24, 2013 at 11:53 PM, MauMau <maumau307@gmail.com> wrote:
>>>
>>> I'm wondering if the fix discussed in the above thread solves my problem.
>>> I
>>> found the following differences between Horiguchi-san's case and my case:
>>>
>>> (1)
>>> Horiguchi-san says the bug outputs the message:
>>>
>>> WARNING:  page 0 of relation base/16384/16385 does not exist
>>>
>>> On the other hand, I got the message:
>>>
>>>
>>> WARNING:  page 506747 of relation base/482272/482304 was uninitialized
>>>
>>>
>>> (2)
>>> Horiguchi-san produced the problem when he shut the standby immediately
>>> and
>>> restarted it.  However, I saw the problem during failover.
>>>
>>>
>>> (3)
>>> Horiguchi-san did not use any index, but in my case the WARNING message
>>> refers to an index.
>>>
>>>
>>> But there's a similar point.  Horiguchi-san says the problem occurs after
>>> DELETE+VACUUM.  In my case, I shut the primary down while the application
>>> was doing INSERT/UPDATE.  As the below messages show, some vacuuming was
>>> running just before the immediate shutdown:
>>>
>>> ...
>>> LOG:  automatic vacuum of table "GOLD.scm1.tbl1": index scans: 0
>>> pages: 0 removed, 36743 remain
>>> tuples: 0 removed, 73764 remain
>>> system usage: CPU 0.09s/0.11u sec elapsed 0.66 sec
>>> LOG:  automatic analyze of table "GOLD.scm1.tbl1" system usage: CPU
>>> 0.00s/0.14u sec elapsed 0.32 sec
>>> LOG:  automatic vacuum of table "GOLD.scm1.tbl2": index scans: 0
>>> pages: 0 removed, 12101 remain
>>> tuples: 40657 removed, 44142 remain system usage: CPU 0.06s/0.06u sec
>>> elapsed 0.30 sec
>>> LOG:  automatic analyze of table "GOLD.scm1.tbl2" system usage: CPU
>>> 0.00s/0.06u sec elapsed 0.14 sec
>>> LOG:  received immediate shutdown request
>>> ...
>>>
>>>
>>> Could you tell me the details of the problem discussed and fixed in the
>>> upcoming minor release?  I would to like to know the phenomenon and its
>>> conditions, and whether it applies to my case.
>>
>>
>>
>> http://www.postgresql.org/message-id/20121206.130458.170549097.horiguchi.kyotaro@lab.ntt.co.jp
>>
>> Could you read the discussion in the above thread?
>
>
> Yes, I've just read the discussion (it was difficult for me...)
>
> Although you said the fix will solve my problem, I don't feel it will.  The
> discussion is about the crash when the standby "re"starts after the primary
> vacuums and truncates a table.  On the other hand, in my case, the standby
> crashed during failover (not at restart), emitting a message that some WAL
> record refers to an "uninitialized" page (not a non-existent page) of an
> "index" (not a table).
>
> In addition, fujii_test.sh did not reproduce the mentioned crash on
> PostgreSQL 9.1.6.
>
> I'm sorry to cause you trouble, but could you elaborate on how the fix
> relates to my case?

Maybe I hadn't understood your problem correctly.
Could you show the self-contained test case which reproduces the problem?
Is the problem still reproducible in REL9_1_STABLE?

Regards,

-- 
Fujii Masao



Re: Back-branch update releases coming in a couple weeks

From: "MauMau"
From: "Fujii Masao" <masao.fujii@gmail.com>
> On Sun, Jan 27, 2013 at 12:17 AM, MauMau <maumau307@gmail.com> wrote:
>> Although you said the fix will solve my problem, I don't feel it will. 
>> The
>> discussion is about the crash when the standby "re"starts after the 
>> primary
>> vacuums and truncates a table.  On the other hand, in my case, the 
>> standby
>> crashed during failover (not at restart), emitting a message that some 
>> WAL
>> record refers to an "uninitialized" page (not a non-existent page) of an
>> "index" (not a table).
>>
>> In addition, fujii_test.sh did not reproduce the mentioned crash on
>> PostgreSQL 9.1.6.
>>
>> I'm sorry to cause you trouble, but could you elaborate on how the fix
>> relates to my case?
>
> Maybe I had not been understanding your problem correctly.
> Could you show the self-contained test case which reproduces the problem?
> Is the problem still reproducible in REL9_1_STABLE?

As I said before, it's very hard to reproduce the problem.  All I did 
was repeat the following sequence:

1. run "pg_ctl stop -mi" against the primary while the applications were 
performing INSERT/UPDATE/SELECT.
2. run "pg_ctl promote" against the synchronous streaming replication 
standby.
3. run pg_basebackup on the stopped (original) primary to create a new 
standby, and start the new standby.

I did this failover test dozens of times, probably more than a hundred.  And 
I encountered the crash only once.

Although I saw the problem only once, the result is catastrophic.  So, I 
really wish Heikki's patch (in cooperation with Horiguchi-san and you) could 
fix the issue.

Can you think of anything?

Regards
MauMau





Re: Back-branch update releases coming in a couple weeks

From: Fujii Masao
On Sun, Jan 27, 2013 at 11:38 PM, MauMau <maumau307@gmail.com> wrote:
> From: "Fujii Masao" <masao.fujii@gmail.com>
>>
>> On Sun, Jan 27, 2013 at 12:17 AM, MauMau <maumau307@gmail.com> wrote:
>>>
>>> Although you said the fix will solve my problem, I don't feel it will.
>>> The
>>> discussion is about the crash when the standby "re"starts after the
>>> primary
>>> vacuums and truncates a table.  On the other hand, in my case, the
>>> standby
>>> crashed during failover (not at restart), emitting a message that some
>>> WAL
>>> record refers to an "uninitialized" page (not a non-existent page) of an
>>> "index" (not a table).
>>>
>>> In addition, fujii_test.sh did not reproduce the mentioned crash on
>>> PostgreSQL 9.1.6.
>>>
>>> I'm sorry to cause you trouble, but could you elaborate on how the fix
>>> relates to my case?
>>
>>
>> Maybe I had not been understanding your problem correctly.
>> Could you show the self-contained test case which reproduces the problem?
>> Is the problem still reproducible in REL9_1_STABLE?
>
>
> As I said before, it's very hard to reproduce the problem.  All what I did
> is to repeat the following sequence:
>
> 1. run "pg_ctl stop -mi" against the primary while the applications were
> performing INSERT/UPDATE/SELECT.
> 2. run "pg_ctl promote" against the standby of synchronous streaming
> standby.
> 3. run pg_basebackup on the stopped (original) primary to create a new
> standby, and start the new standby.
>
> I did this failover test dozens of times, probably more than a hundred.  And
> I encountered the crash only once.
>
> Although I saw the problem only once, the result is catastrophic.  So, I
> really wish Heiki's patch (in cooperation with Horiguchi-san and you) could
> fix the issue.
>
> Do you think of anything?

Umm... from that information, it's hard to tell whether your problem has been
fixed in the latest 9.1.  The bug fix which you mentioned consists of
two patches.

http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=7bffc9b7bf9e09ddeddc65117e49829f758e500d
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=970fb12de121941939e777764d6e0446c974bba3

The former seems not to be related to your problem because the problem
that patch fixed could basically happen only when restarting the standby.
The latter might be related to your problem....

Regards,

-- 
Fujii Masao



From: "Tom Lane" <tgl@sss.pgh.pa.us>
> Since we've fixed a couple of relatively nasty bugs recently, the core
> committee has determined that it'd be a good idea to push out PG update
> releases soon.  The current plan is to wrap on Monday Feb 4 for public
> announcement Thursday Feb 7.  If you're aware of any bug fixes you think
> ought to get included, now's the time to get them done ...

I've just encountered another serious bug, which I wish to be fixed in the 
upcoming minor release.

I'm using streaming replication with PostgreSQL 9.1.6 on Linux (RHEL6.2, 
kernel 2.6.32).  But this problem should happen regardless of the use of 
streaming replication.

When I ran "pg_ctl stop -mi" against the primary, some applications 
connected to the primary did not stop.  The cause was that the backends were 
deadlocked in quickdie() with a call stack like the following.  I'm sorry, 
I left the stack trace file on the testing machine, so I'll show you 
the precise stack trace tomorrow.

some lock function
malloc()
gettext()
errhint()
quickdie()
<signal handler called because of SIGQUIT>
free()
...
PostgresMain()
...

The root cause is that gettext() is called in the signal handler quickdie() 
via errhint().  As you know, malloc() cannot safely be called from a signal 
handler:

http://www.gnu.org/software/libc/manual/html_node/Nonreentrancy.html#Nonreentrancy

[Excerpt]
On most systems, malloc and free are not reentrant, because they use a 
static data structure which records what memory blocks are free. As a 
result, no library functions that allocate or free memory are reentrant. 
This includes functions that allocate space to store a result.


And gettext() calls malloc(), as reported below:

http://lists.gnu.org/archive/html/bug-coreutils/2005-04/msg00056.html

I think the solution is the typical one.  That is, to just remember the 
receipt of SIGQUIT by setting a global variable and call siglongjmp() in 
quickdie(), and perform tasks currently done in quickdie() when sigsetjmp() 
returns in PostgresMain().

What do you think about the solution?  Could you include the fix?  If it's 
okay and you want, I'll submit the patch.

Regards
MauMau




"MauMau" <maumau307@gmail.com> writes:
> When I ran "pg_ctl stop -mi" against the primary, some applications 
> connected to the primary did not stop. ...
> The root cause is that gettext() is called in the signal handler quickdie() 
> via errhint().

Yeah, it's a known hazard that quickdie() operates like that.

> I think the solution is the typical one.  That is, to just remember the 
> receipt of SIGQUIT by setting a global variable and call siglongjmp() in 
> quickdie(), and perform tasks currently done in quickdie() when sigsetjmp() 
> returns in PostgresMain().

I think this cure is considerably worse than the disease.  As stated,
it's not a fix at all: longjmp'ing out of a signal handler is no better
defined than what happens now, in fact it's probably even less safe.
We could just set a flag and wait for the mainline code to notice,
but that would make SIGQUIT hardly any stronger than SIGTERM --- in
particular it couldn't get you out of any loop that wasn't checking for
interrupts.

The long and the short of it is that SIGQUIT is the emergency-stop panic
button.  You don't use it for routine shutdowns --- you use it when
there is a damn good reason to and you're prepared to do some manual
cleanup if necessary.

http://en.wikipedia.org/wiki/Big_Red_Switch

> What do think about the solution?  Could you include the fix?

Even if we had an arguably-better solution, I'd be disinclined to
risk cramming it into stable branches on such short notice.

What might make sense on short notice is to strengthen the
documentation's cautions against using SIGQUIT unnecessarily.
        regards, tom lane



On 2013-01-30 10:23:09 -0500, Tom Lane wrote:
> "MauMau" <maumau307@gmail.com> writes:
> > When I ran "pg_ctl stop -mi" against the primary, some applications 
> > connected to the primary did not stop. ...
> > The root cause is that gettext() is called in the signal handler quickdie() 
> > via errhint().
> 
> Yeah, it's a known hazard that quickdie() operates like that.

What about not translating those? The messages are static and all memory
needed by postgres should be pre-allocated.

Greetings,

Andres Freund

--
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Andres Freund <andres@2ndquadrant.com> writes:
> On 2013-01-30 10:23:09 -0500, Tom Lane wrote:
>> Yeah, it's a known hazard that quickdie() operates like that.

> What about not translating those? The messages are static and all memory
> needed by postgres should be pre-allocated.

That would reduce our exposure slightly, but hardly to zero.  For
instance, if SIGQUIT happened in the midst of handling a regular error,
ErrorContext might be pretty full already, necessitating further malloc
requests.  I thought myself about suggesting that quickdie do something
to disable gettext(), but it doesn't seem like that would make it enough
safer to justify the loss of user-friendliness for non English speakers.

I think the conflict between "we don't want SIGQUIT to interrupt this"
and "we do want SIGQUIT to interrupt that" is pretty fundamental, and
there's probably not any bulletproof solution (or at least none that
would have reasonable development/maintenance cost).  If we had more
confidence that there were no major loops lacking CHECK_FOR_INTERRUPTS
calls, maybe the set-a-flag approach would be acceptable ... but I
sure don't have such confidence.
        regards, tom lane



From: "Tom Lane" <tgl@sss.pgh.pa.us>
> "MauMau" <maumau307@gmail.com> writes:
>> I think the solution is the typical one.  That is, to just remember the
>> receipt of SIGQUIT by setting a global variable and call siglongjmp() in
>> quickdie(), and perform tasks currently done in quickdie() when 
>> sigsetjmp()
>> returns in PostgresMain().
>
> I think this cure is considerably worse than the disease.  As stated,
> it's not a fix at all: longjmp'ing out of a signal handler is no better
> defined than what happens now, in fact it's probably even less safe.
> We could just set a flag and wait for the mainline code to notice,
> but that would make SIGQUIT hardly any stronger than SIGTERM --- in
> particular it couldn't get you out of any loop that wasn't checking for
> interrupts.

Oh, I was careless.  You are right, my suggestion is not a fix at all 
because free() would continue to hold some lock after siglongjmp(), which 
malloc() tries to acquire.

> The long and the short of it is that SIGQUIT is the emergency-stop panic
> button.  You don't use it for routine shutdowns --- you use it when
> there is a damn good reason to and you're prepared to do some manual
> cleanup if necessary.
>
> http://en.wikipedia.org/wiki/Big_Red_Switch

How about the case where some backend crashes due to a bug in PostgreSQL? 
In this case, the postmaster sends SIGQUIT to all backends, too.  The instance 
is expected to disappear cleanly and quickly.  Doesn't the hanging backend 
harm the restart of the instance?

How about using SIGKILL instead of SIGQUIT?  The purpose of SIGQUIT is to 
shut down the processes quickly, and SIGKILL is the best signal for that 
purpose.  The WARNING message would not be sent to clients, but that does 
not justify the inability to shut down immediately.

Regards
MauMau





"MauMau" <maumau307@gmail.com> writes:
> From: "Tom Lane" <tgl@sss.pgh.pa.us>
>> The long and the short of it is that SIGQUIT is the emergency-stop panic
>> button.  You don't use it for routine shutdowns --- you use it when
>> there is a damn good reason to and you're prepared to do some manual
>> cleanup if necessary.

> How about the case where some backend crashes due to a bug of PostgreSQL? 
> In this case, postmaster sends SIGQUIT to all backends, too.  The instance 
> is expected to disappear cleanly and quickly.  Doesn't the hanging backend 
> harm the restart of the instance?

[ shrug... ]  That isn't guaranteed, and never has been --- for
instance, the process might have SIGQUIT blocked, perhaps as a result
of third-party code we have no control over.

> How about using SIGKILL instead of SIGQUIT?

Because then we couldn't notify clients at all.  One practical
disadvantage of that is that it would become quite hard to tell from
the outside which client session actually crashed, which is frequently
useful to know.

This isn't an area that admits of quick-fix solutions --- everything
we might do has disadvantages.  Also, the lack of complaints to date
shows that the problem is not so large as to justify panic responses.
I'm not really inclined to mess around with a tradeoff that's been
working pretty well for a dozen years or more.
        regards, tom lane



Re: backend hangs at immediate shutdown

From
Tatsuo Ishii
Date:
> This isn't an area that admits of quick-fix solutions --- everything
> we might do has disadvantages.  Also, the lack of complaints to date
> shows that the problem is not so large as to justify panic responses.
> I'm not really inclined to mess around with a tradeoff that's been
> working pretty well for a dozen years or more.

What about adding a caution to the doc something like:
 "pg_ctl -m -i stop" may cause a PostgreSQL hang if native language support is enabled.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp



Re: backend hangs at immediate shutdown

From
Andres Freund
Date:
On 2013-01-31 08:27:13 +0900, Tatsuo Ishii wrote:
> > This isn't an area that admits of quick-fix solutions --- everything
> > we might do has disadvantages.  Also, the lack of complaints to date
> > shows that the problem is not so large as to justify panic responses.
> > I'm not really inclined to mess around with a tradeoff that's been
> > working pretty well for a dozen years or more.
> 
> What about adding a caution to the doc something like:
> 
>      "pg_ctl -m -i stop" may cause a PostgreSQL hang if native language support is enabled.

That doesn't entirely solve the problem, see quote and reply in
6845.1359561252@sss.pgh.pa.us

I think adding errmsg_raw() or somesuch that doesn't allocate any memory
and only accepts constant strings could solve the problem more
completely, at the obvious price of not allowing translated strings
directly.
Those could be pretranslated during startup, but that's mighty ugly.

Greetings,

Andres Freund

-- 
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: backend hangs at immediate shutdown

From
Tom Lane
Date:
Andres Freund <andres@2ndquadrant.com> writes:
> On 2013-01-31 08:27:13 +0900, Tatsuo Ishii wrote:
>> What about adding a caution to the doc something like:
>> "pg_ctl -m -i stop" may cause a PostgreSQL hang if native language support is enabled.

> That doesn't entirely solve the problem, see quote and reply in
> 6845.1359561252@sss.pgh.pa.us

> I think adding errmsg_raw() or somesuch that doesn't allocate any memory
> and only accepts constant strings could solve the problem more
> completely, at the obvious price of not allowing translated strings
> directly.

I really doubt that this would make a measurable difference in the
probability of failure.  The OP's case looks like it might not have
occurred if we weren't translating, but (a) that's not actually proven,
and (b) there are any number of other, equally low-probability, reasons
to have a problem here.  Please note for instance that elog.c would
still be doing a whole lot of palloc's even if the passed strings were
not copied.

I think if we want to make it bulletproof we'd have to do what the
OP suggested and switch to SIGKILL.  I'm not enamored of that for the
reasons I mentioned --- but one idea that might dodge the disadvantages
is to have the postmaster wait a few seconds and then SIGKILL any
backends that hadn't exited.
        regards, tom lane



Re: backend hangs at immediate shutdown

From
Tatsuo Ishii
Date:
>> What about adding a caution to the doc something like:
>> 
>>      "pg_ctl -m -i stop" may cause a PostgreSQL hang if native language support is enabled.
> 
> That doesn't entirely solve the problem, see quote and reply in
> 6845.1359561252@sss.pgh.pa.us

Oh, I see now.

> I think adding errmsg_raw() or somesuch that doesn't allocate any memory
> and only accepts constant strings could solve the problem more
> completely, at the obvious price of not allowing translated strings
> directly.
> Those could be pretranslated during startup, but that's mighty ugly.

Are you suggesting to call errmsg_raw() instead of errmsg() in quickdie()?

Tom said:
> That would reduce our exposure slightly, but hardly to zero.  For
> instance, if SIGQUIT happened in the midst of handling a regular error,
> ErrorContext might be pretty full already, necessitating further malloc
> requests.

If I understand this correctly, I don't think errmsg_raw()
solves the particular problem.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp



Re: backend hangs at immediate shutdown

From
"MauMau"
Date:
As I promised yesterday, I'll show you the precise call stack:

#0  0x0000003fa0cf542e in __lll_lock_wait_private () from /lib64/libc.so.6
#1  0x0000003fa0c7bed5 in _L_lock_9323 () from /lib64/libc.so.6
#2  0x0000003fa0c797c6 in malloc () from /lib64/libc.so.6
#3  0x0000003fa0c2fd99 in _nl_make_l10nflist () from /lib64/libc.so.6
#4  0x0000003fa0c2e0a5 in _nl_find_domain () from /lib64/libc.so.6
#5  0x0000003fa0c2d990 in __dcigettext () from /lib64/libc.so.6
#6  0x00000000006f2a71 in errhint ()
#7  0x0000000000634064 in quickdie ()
#8  <signal handler called>
#9  0x0000003fa0c77813 in _int_free () from /lib64/libc.so.6
#10 0x000000000070e329 in AllocSetDelete ()
#11 0x000000000070e8cb in MemoryContextDelete ()
#12 0x0000000000571723 in FreeExprContext ()
#13 0x0000000000571781 in FreeExecutorState ()
#14 0x00000000005dc883 in evaluate_expr ()
#15 0x00000000005ddca0 in simplify_function ()
#16 0x00000000005de69f in eval_const_expressions_mutator ()
#17 0x0000000000599143 in expression_tree_mutator ()
#18 0x00000000005de452 in eval_const_expressions_mutator ()
#19 0x0000000000599143 in expression_tree_mutator ()
#20 0x00000000005de452 in eval_const_expressions_mutator ()
#21 0x00000000005dfa2f in eval_const_expressions ()
#22 0x00000000005cf16d in preprocess_expression ()
#23 0x00000000005d2201 in subquery_planner ()
#24 0x00000000005d23cf in standard_planner ()
#25 0x000000000063426a in pg_plan_query ()
#26 0x0000000000634354 in pg_plan_queries ()
#27 0x0000000000635310 in exec_simple_query ()
#28 0x0000000000636333 in PostgresMain ()
#29 0x00000000005f64e9 in PostmasterMain ()
#30 0x0000000000596e20 in main ()





Re: backend hangs at immediate shutdown

From
"MauMau"
Date:
From: "Tom Lane" <tgl@sss.pgh.pa.us>
> "MauMau" <maumau307@gmail.com> writes:
>> How about the case where some backend crashes due to a bug of PostgreSQL?
>> In this case, postmaster sends SIGQUIT to all backends, too.  The 
>> instance
>> is expected to disappear cleanly and quickly.  Doesn't the hanging 
>> backend
>> harm the restart of the instance?
>
> [ shrug... ]  That isn't guaranteed, and never has been --- for
> instance, the process might have SIGQUIT blocked, perhaps as a result
> of third-party code we have no control over.

Are you concerned about user-defined C functions?  I don't think they need 
to block signals.  So I don't find it too restrictive to say "do not block 
or send signals in user-defined functions."  If it's a real concern, it 
should be noted in the manual, rather than writing "do not use pg_ctl 
stop -mi as much as you can, because it can leave hanging backends."

>> How about using SIGKILL instead of SIGQUIT?
>
> Because then we couldn't notify clients at all.  One practical
> disadvantage of that is that it would become quite hard to tell from
> the outside which client session actually crashed, which is frequently
> useful to know.

How is the message below useful to determine which client session actually 
crashed?  The message doesn't contain information about the crashed session. 
Are you talking about log_line_prefix?

ERROR:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back the 
current transaction and exit, because another server process exited 
abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and 
repeat your command.

However, it is not quickdie() but LogChildExit() that emits useful 
information to tell which session crashed.  So I don't think quickdie()'s 
message is very helpful.


> I think if we want to make it bulletproof we'd have to do what the
> OP suggested and switch to SIGKILL.  I'm not enamored of that for the
> reasons I mentioned --- but one idea that might dodge the disadvantages
> is to have the postmaster wait a few seconds and then SIGKILL any
> backends that hadn't exited.

I believe that SIGKILL is the only simple choice.  Consider 
again: the purpose of "pg_ctl stop -mi" is to immediately and reliably shut 
down the instance.  If it is not reliable, what can we do instead?


Regards
MauMau




On 1/30/13 9:11 AM, MauMau wrote:
> When I ran "pg_ctl stop -mi" against the primary, some applications
> connected to the primary did not stop.  The cause was that the backends
> were deadlocked in quickdie() with some call stack like the following. 
> I'm sorry to have left the stack trace file on the testing machine, so
> I'll show you the precise stack trace tomorrow.

I've had similar problems in the past:

http://www.postgresql.org/message-id/1253704891.20834.8.camel@fsopti579.F-Secure.com

The discussion there never quite concluded.  But yes, you need to be
prepared that in rare circumstances SIGQUIT won't succeed and you need
to use SIGKILL.




From: "Peter Eisentraut" <peter_e@gmx.net>
> On 1/30/13 9:11 AM, MauMau wrote:
>> When I ran "pg_ctl stop -mi" against the primary, some applications
>> connected to the primary did not stop.  The cause was that the backends
>> were deadlocked in quickdie() with some call stack like the following.
>> I'm sorry to have left the stack trace file on the testing machine, so
>> I'll show you the precise stack trace tomorrow.
>
> I've had similar problems in the past:
>
> http://www.postgresql.org/message-id/1253704891.20834.8.camel@fsopti579.F-Secure.com
>
> The discussion there never quite concluded.  But yes, you need to be
> prepared that in rare circumstances SIGQUIT won't succeed and you need
> to use SIGKILL.

Thank you for sharing your experience.  So you also considered making 
postmaster SIGKILL children like me, didn't you?  I bet most people who 
encounter this problem would feel like that.

It is definitely pg_ctl that needs to be prepared, not the users.  It may not 
be easy to find out postgres processes to SIGKILL if multiple instances are 
running on the same host.  Just doing "pkill postgres" will unexpectedly 
terminate postgres of other instances.

I would like to make a patch that changes SIGQUIT to SIGKILL when 
postmaster terminates children.  Any other better ideas?

Regards
MauMau




MauMau <maumau307@gmail.com> wrote:

> Just doing "pkill postgres" will unexpectedly terminate postgres
> of other instances.

Not if you run each instance under a different OS user, and execute
pkill with the right user.  (Never use root for that!)  This is
just one of the reasons that you should not run multiple clusters
on the same machine with the same OS user.

-Kevin



Re: Back-branch update releases coming in a couple weeks

From
Andres Freund
Date:
On 2013-01-22 22:19:25 -0500, Tom Lane wrote:
> Since we've fixed a couple of relatively nasty bugs recently, the core
> committee has determined that it'd be a good idea to push out PG update
> releases soon.  The current plan is to wrap on Monday Feb 4 for public
> announcement Thursday Feb 7.  If you're aware of any bug fixes you think
> ought to get included, now's the time to get them done ...

I think the patch in
http://www.postgresql.org/message-id/20130130145521.GB3355@awork2.anarazel.de
should get applied before the release if we decide it should be
backpatched.  And I think backpatching makes sense; the amount of useless
full-table scans will surely hurt more than some workload that's
accidentally fixed by this.

Greetings,

Andres Freund

-- 
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



On 1/31/13 5:42 PM, MauMau wrote:
> Thank you for sharing your experience.  So you also considered making
> postmaster SIGKILL children like me, didn't you?  I bet most people
> who encounter this problem would feel like that.
>
> It is definitely pg_ctl that needs to be prepared, not the users.  It may
> not be easy to find out postgres processes to SIGKILL if multiple
> instances are running on the same host.  Just doing "pkill postgres"
> will unexpectedly terminate postgres of other instances.

In my case, it was one backend process segfaulting, and then some other
backend processes didn't respond to the subsequent SIGQUIT sent out by
the postmaster.  So pg_ctl didn't have any part in it.

We ended up addressing that by installing a nagios event handler that
checked for this situation and cleaned it up.

> I would like to make a patch that changes SIGQUIT to SIGKILL when
> postmaster terminates children.  Any other better ideas?

That was my idea back then, but there were some concerns about it.

I found an old patch that I had prepared for this, which I have
attached.  YMMV.

Attachment
On 2013-02-01 08:55:24 -0500, Peter Eisentraut wrote:
> On 1/31/13 5:42 PM, MauMau wrote:
> > Thank you for sharing your experience.  So you also considered making
> > postmaster SIGKILL children like me, didn't you?  I bet most people
> > who encounter this problem would feel like that.
> > 
> > It is definitely pg_ctl that needs to be prepared, not the users.  It may
> > not be easy to find out postgres processes to SIGKILL if multiple
> > instances are running on the same host.  Just doing "pkill postgres"
> > will unexpectedly terminate postgres of other instances.
> 
> In my case, it was one backend process segfaulting, and then some other
> backend processes didn't respond to the subsequent SIGQUIT sent out by
> the postmaster.  So pg_ctl didn't have any part in it.
> 
> We ended up addressing that by installing a nagios event handler that
> checked for this situation and cleaned it up.
> 
> > I would like to make a patch that changes SIGQUIT to SIGKILL when
> > postmaster terminates children.  Any other better ideas?
> 
> That was my idea back then, but there were some concerns about it.
> 
> I found an old patch that I had prepared for this, which I have
> attached.  YMMV.

> +static void
> +quickdie_alarm_handler(SIGNAL_ARGS)
> +{
> +    /*
> +     * We got here if ereport() was blocking, so don't go there again
> +     * except when really asked for.
> +     */
> +    elog(DEBUG5, "quickdie aborted by alarm");
> +

It's probably not wise to enter elog.c again; that path might allocate
memory and we wouldn't be any wiser. Unfortunately there's not much
besides a write(2) to stderr that can safely be done...

Greetings,

Andres Freund

-- 
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Andres Freund <andres@2ndquadrant.com> writes:
> On 2013-02-01 08:55:24 -0500, Peter Eisentraut wrote:
>> I found an old patch that I had prepared for this, which I have
>> attached.  YMMV.

>> +static void
>> +quickdie_alarm_handler(SIGNAL_ARGS)
>> +{
>> +    /*
>> +     * We got here if ereport() was blocking, so don't go there again
>> +     * except when really asked for.
>> +     */
>> +    elog(DEBUG5, "quickdie aborted by alarm");
>> +

> It's probably not wise to enter elog.c again; that path might allocate
> memory and we wouldn't be any wiser. Unfortunately there's not much
> besides a write(2) to stderr that can safely be done...

Another objection to this is it still doesn't fix the case where the
process has blocked SIGQUIT.  My faith that libperl, libR, etc will
never do that is nil.

I think if we want something that's actually trustworthy, the
wait-and-then-kill logic has to be in the postmaster.
        regards, tom lane



Hello, Tom-san, folks,

From: "Tom Lane" <tgl@sss.pgh.pa.us>
> I think if we want to make it bulletproof we'd have to do what the
> OP suggested and switch to SIGKILL.  I'm not enamored of that for the
> reasons I mentioned --- but one idea that might dodge the disadvantages
> is to have the postmaster wait a few seconds and then SIGKILL any
> backends that hadn't exited.
>
> I think if we want something that's actually trustworthy, the
> wait-and-then-kill logic has to be in the postmaster.


I'm sorry to be late in submitting a patch.  I made a fix according to the above
comment.  Unfortunately, it was not in time for the 9.1.8 release...  I hope
this patch will be included in 9.1.9.

Could you review the patch?  The summary of the change is:
1. postmaster waits for children to terminate when it gets an immediate
shutdown request, instead of exiting.

2. postmaster sends SIGKILL to remaining children if all of the child
processes do not terminate within 10 seconds since the start of immediate
shutdown or FatalError condition.

3. On Windows, kill(SIGKILL) calls TerminateProcess().

4. Documentation change to describe the behavior of immediate shutdown.


Regards
MauMau

Attachment
MauMau escribió:

> Could you review the patch?  The summary of the change is:
> 1. postmaster waits for children to terminate when it gets an
> immediate shutdown request, instead of exiting.
> 
> 2. postmaster sends SIGKILL to remaining children if all of the
> child processes do not terminate within 10 seconds since the start
> of immediate shutdown or FatalError condition.

This seems reasonable.  Why 10 seconds?  We could wait 5 seconds, or 15.
Is there a rationale behind the 10?  If we said 60, that would fit
perfectly well within the already existing 60-second loop in postmaster,
but that seems way too long.

I have only one concern about this patch, which is visible in the
documentation proposed change:
       <para>
       This is the <firstterm>Immediate Shutdown</firstterm> mode.
       The master <command>postgres</command> process will send a
-      <systemitem>SIGQUIT</systemitem> to all child processes and exit
-      immediately, without properly shutting itself down. The child processes
-      likewise exit immediately upon receiving
-      <systemitem>SIGQUIT</systemitem>. This will lead to recovery (by
+      <systemitem>SIGQUIT</systemitem> to all child processes, wait for
+      them to terminate, and exit. The child processes
+      exit immediately upon receiving
+      <systemitem>SIGQUIT</systemitem>. If any of the child processes
+      does not terminate within 10 seconds for some unexpected reason,
+      the master postgres process will send a <systemitem>SIGKILL</systemitem>
+      to all remaining ones, wait for their termination
+      again, and exit. This will lead to recovery (by
       replaying the WAL log) upon next start-up. This is recommended
       only in emergencies.
       </para>
 

Note that the previous text said that postmaster will send SIGQUIT, then
terminate without checking anything.  In the new code, postmaster sends
SIGQUIT, then waits, then SIGKILL, then waits again.  If there's an
unkillable process (say because it's stuck in a noninterruptible sleep)
postmaster might never exit.  I think it should send SIGQUIT, then wait,
then SIGKILL, then exit without checking.

I have tweaked the patch a bit and I'm about to commit as soon as we
resolve the above two items.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



First, thank you for the review.

From: "Alvaro Herrera" <alvherre@2ndquadrant.com>
> This seems reasonable.  Why 10 seconds?  We could wait 5 seconds, or 15.
> Is there a rationale behind the 10?  If we said 60, that would fit
> perfectly well within the already existing 60-second loop in postmaster,
> but that seems way too long.

There is no good rationale.  I arbitrarily chose a short period because this 
is "immediate" shutdown.  I felt more than 10 seconds was too long.  I think 5 
seconds may be better.  Although not directly related to this fix, these 
influenced my choice:

1. According to the man page of init, init sends SIGKILL to all remaining 
processes 5 seconds after it sends SIGTERM to them.

2. At computer shutdown, Windows proceeds with a forced shutdown after waiting 
20 seconds for services to terminate.


> I have only one concern about this patch, which is visible in the
> documentation proposed change:
>
>       <para>
>       This is the <firstterm>Immediate Shutdown</firstterm> mode.
>       The master <command>postgres</command> process will send a
> -      <systemitem>SIGQUIT</systemitem> to all child processes and exit
> -      immediately, without properly shutting itself down. The child 
> processes
> -      likewise exit immediately upon receiving
> -      <systemitem>SIGQUIT</systemitem>. This will lead to recovery (by
> +      <systemitem>SIGQUIT</systemitem> to all child processes, wait for
> +      them to terminate, and exit. The child processes
> +      exit immediately upon receiving
> +      <systemitem>SIGQUIT</systemitem>. If any of the child processes
> +      does not terminate within 10 seconds for some unexpected reason,
> +      the master postgres process will send a 
> <systemitem>SIGKILL</systemitem>
> +      to all remaining ones, wait for their termination
> +      again, and exit. This will lead to recovery (by
>       replaying the WAL log) upon next start-up. This is recommended
>       only in emergencies.
>       </para>
>
> Note that the previous text said that postmaster will send SIGQUIT, then
> terminate without checking anything.  In the new code, postmaster sends
> SIGQUIT, then waits, then SIGKILL, then waits again.  If there's an
> unkillable process (say because it's stuck in a noninterruptible sleep)
> postmaster might never exit.  I think it should send SIGQUIT, then wait,
> then SIGKILL, then exit without checking.

At first I thought the same, but decided not to do that.  The purpose of 
this patch is to make the immediate shutdown "reliable".  Here, "reliable" 
means that the database server is certainly shut down when pg_ctl returns, 
not telling a lie that "I shut down the server processes for you, so you do 
not have to be worried that some postgres process might still remain and 
write to disk".  I suppose reliable shutdown is crucial especially in HA 
cluster.  If pg_ctl stop -mi gets stuck forever when there is an unkillable 
process (in what situations does this happen? OS bug, or NFS hard mount?), I 
think the DBA has to notice this situation from the unfinished pg_ctl, 
investigate the cause, and take corrective action.  Anyway, in HA cluster, 
the clusterware will terminate the node with STONITH, not leaving pg_ctl 
running forever.


> I have tweaked the patch a bit and I'm about to commit as soon as we
> resolve the above two items.

I appreciate your tweaking, especially in the documentation and source code 
comments, as English is not my mother tongue.

Regards
MauMau




MauMau escribió:
> First, thank you for the review.
> 
> From: "Alvaro Herrera" <alvherre@2ndquadrant.com>
> >This seems reasonable.  Why 10 seconds?  We could wait 5 seconds, or 15.
> >Is there a rationale behind the 10?  If we said 60, that would fit
> >perfectly well within the already existing 60-second loop in postmaster,
> >but that seems way too long.
> 
> There is no good rationale.  I arbitrarily chose a short period
> because this is "immediate" shutdown.  I felt more than 10 seconds
> was too long.  I think 5 seconds may be better.  Although not directly
> related to this fix, these influenced my choice:
> 
> 1. According to the man page of init, init sends SIGKILL to all
> remaining processes 5 seconds after it sends SIGTERM to them.
> 
> 2. At computer shutdown, Windows proceeds with a forced shutdown after
> waiting 20 seconds for services to terminate.

Well, as for the first of these, I don't think it matters whether
processes get their SIGKILL from postmaster or from init, when a
shutdown or runlevel is changing.  Now, one would think that this sets a
precedent.  I think 20 seconds might be too long; perhaps that's to be
expected in an operating system that necessarily has a GUI.  I feel safe
ignoring that.

I will go with 5 seconds, then.


> >Note that the previous text said that postmaster will send SIGQUIT, then
> >terminate without checking anything.  In the new code, postmaster sends
> >SIGQUIT, then waits, then SIGKILL, then waits again.  If there's an
> >unkillable process (say because it's stuck in a noninterruptible sleep)
> >postmaster might never exit.  I think it should send SIGQUIT, then wait,
> >then SIGKILL, then exit without checking.
> 
> At first I thought the same, but decided not to do that.  The
> purpose of this patch is to make the immediate shutdown "reliable".

My point is that there is no difference.  For one thing, once we enter
immediate shutdown state, and sigkill has been sent, no further action
is taken.  Postmaster will just sit there indefinitely until processes
are gone.  If we were to make it repeat SIGKILL until they die, that
would be different.  However, repeating SIGKILL is pointless, because if 
they didn't die when they first received it, they will still not die 
when they receive it a second time.  Also, if they're in uninterruptible sleep 
and don't die, then they will die as soon as they get out of that state;
no further queries will get processed, no further memory access will be
done.  So there's no harm in their remaining there until the underlying 
storage returns to life, ISTM.

> Here, "reliable" means that the database server is certainly shut
> down when pg_ctl returns, not telling a lie that "I shut down the
> server processes for you, so you do not have to be worried that some
> postgres process might still remain and write to disk".  I suppose
> reliable shutdown is crucial especially in HA cluster.  If pg_ctl
> stop -mi gets stuck forever when there is an unkillable process (in
> what situations does this happen? OS bug, or NFS hard mount?), I
> think the DBA has to notice this situation from the unfinished
> pg_ctl, investigate the cause, and take corrective action.

So you're suggesting that keeping postmaster up is a useful sign that
the shutdown is not going well?  I'm not really sure about this.  What
do others think?

> >I have tweaked the patch a bit and I'm about to commit as soon as we
> >resolve the above two items.
> 
> I appreciate your tweaking, especially in the documentation and
> source code comments, as English is not my mother tongue.

IIRC the only other interesting tweak I did was rename the
SignalAllChildren() function to TerminateChildren().  I did this because
it doesn't really signal all children; syslogger and dead_end backends
are kept around.  So the original name was a bit misleading.  And we
couldn't really name it SignalAlmostAllChildren(), could we ..

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



From: "Alvaro Herrera" <alvherre@2ndquadrant.com>
> I will go with 5 seconds, then.

OK, I agree.


> My point is that there is no difference.  For one thing, once we enter
> immediate shutdown state, and sigkill has been sent, no further action
> is taken.  Postmaster will just sit there indefinitely until processes
> are gone.  If we were to make it repeat SIGKILL until they die, that
> would be different.  However, repeating SIGKILL is pointless, because if
> they didn't die when they first received it, they will still not die
> when they receive it a second time.  Also, if they're in uninterruptible sleep
> and don't die, then they will die as soon as they get out of that state;
> no further queries will get processed, no further memory access will be
> done.  So there's no harm in their remaining there until the underlying
> storage returns to life, ISTM.
>
>> Here, "reliable" means that the database server is certainly shut
>> down when pg_ctl returns, not telling a lie that "I shut down the
>> server processes for you, so you do not have to be worried that some
>> postgres process might still remain and write to disk".  I suppose
>> reliable shutdown is crucial especially in HA cluster.  If pg_ctl
>> stop -mi gets stuck forever when there is an unkillable process (in
>> what situations does this happen? OS bug, or NFS hard mount?), I
>> think the DBA has to notice this situation from the unfinished
>> pg_ctl, investigate the cause, and take corrective action.
>
> So you're suggesting that keeping postmaster up is a useful sign that
> the shutdown is not going well?  I'm not really sure about this.  What
> do others think?

I think you are right, and there is no harm in leaving postgres processes in 
an unkillable state.  I'd like to leave the decision to you and/or others.

One concern is that umount would fail in such a situation because postgres 
has some open files on the filesystem, which is on the shared disk in case 
of traditional HA cluster.  However, STONITH should resolve the problem by 
terminating the stuck node...  I just feel it is strange for umount to fail 
due to remaining postgres, because pg_ctl stop -mi reported success.

> IIRC the only other interesting tweak I did was rename the
> SignalAllChildren() function to TerminateChildren().  I did this because
> it doesn't really signal all children; syslogger and dead_end backends
> are kept around.  So the original name was a bit misleading.  And we
> couldn't really name it SignalAlmostAllChildren(), could we ..

I see.  Thank you.

Regards
MauMau




On Thu, Jun 20, 2013 at 12:33:25PM -0400, Alvaro Herrera wrote:
> MauMau escribió:
> > Here, "reliable" means that the database server is certainly shut
> > down when pg_ctl returns, not telling a lie that "I shut down the
> > server processes for you, so you do not have to be worried that some
> > postgres process might still remain and write to disk".  I suppose
> > reliable shutdown is crucial especially in HA cluster.  If pg_ctl
> > stop -mi gets stuck forever when there is an unkillable process (in
> > what situations does this happen? OS bug, or NFS hard mount?), I
> > think the DBA has to notice this situation from the unfinished
> > pg_ctl, investigate the cause, and take corrective action.
> 
> So you're suggesting that keeping postmaster up is a useful sign that
> the shutdown is not going well?  I'm not really sure about this.  What
> do others think?

It would be valuable for "pg_ctl -w -m immediate stop" to have the property
that a subsequent start attempt will not fail due to the presence of some
backend still attached to shared memory.  (Maybe that's true anyway or can be
achieved a better way; I have not investigated.)

-- 
Noah Misch
EnterpriseDB                                 http://www.enterprisedb.com



Noah Misch escribió:
> On Thu, Jun 20, 2013 at 12:33:25PM -0400, Alvaro Herrera wrote:
> > MauMau escribió:
> > > Here, "reliable" means that the database server is certainly shut
> > > down when pg_ctl returns, not telling a lie that "I shut down the
> > > server processes for you, so you do not have to be worried that some
> > > postgres process might still remain and write to disk".  I suppose
> > > reliable shutdown is crucial especially in HA cluster.  If pg_ctl
> > > stop -mi gets stuck forever when there is an unkillable process (in
> > > what situations does this happen? OS bug, or NFS hard mount?), I
> > > think the DBA has to notice this situation from the unfinished
> > > pg_ctl, investigate the cause, and take corrective action.
> > 
> > So you're suggesting that keeping postmaster up is a useful sign that
> > the shutdown is not going well?  I'm not really sure about this.  What
> > do others think?
> 
> It would be valuable for "pg_ctl -w -m immediate stop" to have the property
> that a subsequent start attempt will not fail due to the presence of some
> backend still attached to shared memory.  (Maybe that's true anyway or can be
> achieved a better way; I have not investigated.)

Well, the only case where a process that's been SIGKILLed does not go
away, as far as I know, is when it is in some uninterruptible sleep due
to in-kernel operations that get stuck.  Personally I have never seen
this happen in any other case than some network filesystem getting
disconnected, or a disk that doesn't respond.  And whenever the
filesystem starts to respond again, the process gets out of its sleep
only to die due to the signal.

So a subsequent start attempt will either find that the filesystem is
not responding, in which case it'll probably fail to work properly
anyway (presumably the filesystem corresponds to part of the data
directory), or that it has revived in which case the old backends have
already gone away.

If we leave postmaster running after SIGKILLing its children, the only
thing we can do is have it continue to SIGKILL processes continuously
every few seconds until they die (or just sit around doing nothing until
they all die).  I don't think this will have a different effect than
postmaster going away trusting the first SIGKILL to do its job
eventually.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



MauMau escribió:
> From: "Alvaro Herrera" <alvherre@2ndquadrant.com>

> One concern is that umount would fail in such a situation because
> postgres has some open files on the filesystem, which is on the
> shared disk in case of traditional HA cluster.

See my reply to Noah.  If postmaster stays around, would this be any
different?  I don't think so.

> >IIRC the only other interesting tweak I did was rename the
> >SignalAllChildren() function to TerminateChildren().  I did this because
> >it doesn't really signal all children; syslogger and dead_end backends
> >are kept around.  So the original name was a bit misleading.  And we
> >couldn't really name it SignalAlmostAllChildren(), could we ..
> 
> I see.  thank you.

Actually, in further testing I noticed that the fast-path you introduced
in BackendCleanup (or was it HandleChildCrash?) in the immediate
shutdown case caused postmaster to fail to clean up properly after
sending the SIGKILL signal, so I had to remove that as well.  Was there
any reason for that?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Actually, I think it would be cleaner to have a new state in pmState,
namely PM_IMMED_SHUTDOWN which is entered when we send SIGQUIT.  When
we're in this state, postmaster is only waiting for the timeout to
expire; and when it does, it sends SIGKILL and exits.  Pretty much the
same you have, except that instead of checking AbortStartTime we check
the pmState variable.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services






On Thu, Jun 20, 2013 at 3:40 PM, MauMau <maumau307@gmail.com> wrote: 
Here, "reliable" means that the database server is certainly shut
down when pg_ctl returns, not telling a lie that "I shut down the
server processes for you, so you do not have to be worried that some
postgres process might still remain and write to disk".  I suppose
reliable shutdown is crucial especially in HA cluster.  If pg_ctl
stop -mi gets stuck forever when there is an unkillable process (in
what situations does this happen? OS bug, or NFS hard mount?), I
think the DBA has to notice this situation from the unfinished
pg_ctl, investigate the cause, and take corrective action.

So you're suggesting that keeping postmaster up is a useful sign that
the shutdown is not going well?  I'm not really sure about this.  What
do others think?

I think you are right, and there is no harm in leaving postgres processes in unkillable state.  I'd like to leave the decision to you and/or others.


+1 for leaving processes rather than waiting for long (or possibly forever; remember, not all people are running postgres on such cluster ware).  I'm sure some users rely on the current behavior of immediate shutdown.  If someone wants to ensure processes are finished when pg_ctl returns, isn't that fast shutdown, not immediate shutdown?  To me the current immediate shutdown is reliable, in that it without doubt returns control back to the terminal, after killing postmaster at least.
 
Thanks,
--
Hitoshi Harada
On 2013-06-20 22:36:45 -0400, Alvaro Herrera wrote:
> Noah Misch escribió:
> > On Thu, Jun 20, 2013 at 12:33:25PM -0400, Alvaro Herrera wrote:
> > > MauMau escribió:
> > > > Here, "reliable" means that the database server is certainly shut
> > > > down when pg_ctl returns, not telling a lie that "I shut down the
> > > > server processes for you, so you do not have to be worried that some
> > > > postgres process might still remain and write to disk".  I suppose
> > > > reliable shutdown is crucial especially in HA cluster.  If pg_ctl
> > > > stop -mi gets stuck forever when there is an unkillable process (in
> > > > what situations does this happen? OS bug, or NFS hard mount?), I
> > > > think the DBA has to notice this situation from the unfinished
> > > > pg_ctl, investigate the cause, and take corrective action.
> > > 
> > > So you're suggesting that keeping postmaster up is a useful sign that
> > > the shutdown is not going well?  I'm not really sure about this.  What
> > > do others think?
> > 
> > It would be valuable for "pg_ctl -w -m immediate stop" to have the property
> > that an subsequent start attempt will not fail due to the presence of some
> > backend still attached to shared memory.  (Maybe that's true anyway or can be
> > achieved a better way; I have not investigated.)
> 
> Well, the only case where a process that's been SIGKILLed does not go
> away, as far as I know, is when it is in some uninterruptible sleep due
> to in-kernel operations that get stuck.  Personally I have never seen
> this happen in any other case than some network filesystem getting
> disconnected, or a disk that doesn't respond.  And whenever the
> filesystem starts to respond again, the process gets out of its sleep
> only to die due to the signal.

Those are the situation in which it takes a really long time, yes. But
there can be timing issues involved. Consider a backend that's currently
stuck in a write() because it hit the dirtying limit.  Say you have a
postgres cluster that's currently slowing down to a crawl because it's
overloaded and hitting the dirty limit. Somebody very well might just
want to restart it with -m immediate. In that case a delay of a second
or two till enough dirty memory has been written that write() can
continue is enough for the postmaster to start up again and try to
attach to shared memory where it will find the shared memory to be still
in use.
I don't really see any argument for *not* waiting. Sure it might take a
bit longer till it's shut down, but if it didn't wait that will cause
problems down the road.

> If we leave postmaster running after SIGKILLing its children, the only
> thing we can do is have it continue to SIGKILL processes continuously
> every few seconds until they die (or just sit around doing nothing until
> they all die).  I don't think this will have a different effect than
> postmaster going away trusting the first SIGKILL to do its job
> eventually.

I think it should just wait till all its child processes are dead. No
need to repeat sending the signals - as you say, that won't help.



What we could do to improve the robustness a bit - at least on linux -
is to prctl(PR_SET_PDEATHSIG, SIGKILL) which would cause children to be
killed if the postmaster goes away...

Greetings,

Andres Freund

-- 
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



From: "Alvaro Herrera" <alvherre@2ndquadrant.com>
> MauMau escribió:
>
>> One concern is that umount would fail in such a situation because
>> postgres has some open files on the filesystem, which is on the
>> shared disk in case of traditional HA cluster.
>
> See my reply to Noah.  If postmaster stays around, would this be any
> different?  I don't think so.

I'll consider this a bit and respond as a reply to Andres-san's mail.


> Actually, in further testing I noticed that the fast-path you introduced
> in BackendCleanup (or was it HandleChildCrash?) in the immediate
> shutdown case caused postmaster to fail to clean up properly after
> sending the SIGKILL signal, so I had to remove that as well.  Was there
> any reason for that?

You are talking about the code below at the beginning of HandleChildCrash(), 
aren't you?
	/* Do nothing if the child terminated due to immediate shutdown */
	if (Shutdown == ImmediateShutdown)
		return;

If my memory is correct, this was necessary; without this, 
HandleChildCrash() or LogChildExit() left extra messages in the server log.

LOG:  %s (PID %d) exited with exit code %d
LOG:  terminating any other active server processes

These messages are output because postmaster sends SIGQUIT to its children 
and wait for them to terminate.  The children exit with non-zero status when 
they receive SIGQUIT.


Regards
MauMau




From: "Alvaro Herrera" <alvherre@2ndquadrant.com>
> Actually, I think it would be cleaner to have a new state in pmState,
> namely PM_IMMED_SHUTDOWN which is entered when we send SIGQUIT.  When
> we're in this state, postmaster is only waiting for the timeout to
> expire; and when it does, it sends SIGKILL and exits.  Pretty much the
> same you have, except that instead of checking AbortStartTime we check
> the pmState variable.

Are you suggesting simplifying the following part in ServerLoop()?  I 
welcome the idea if this condition becomes simpler.  However, I cannot 
imagine how.
 if (AbortStartTime > 0 &&  /* SIGKILL only once */
  (Shutdown == ImmediateShutdown || (FatalError && !SendStop)) &&
  now - AbortStartTime >= 10)
 {
  SignalAllChildren(SIGKILL);
  AbortStartTime = 0;
 }
 

I thought of adding some new state of pmState for some reason (that might be 
the same as your idea).
But I refrained from doing that, because pmState has already many states.  I 
was afraid adding a new pmState value for this bug fix would complicate the 
state management (e.g. state transition in PostmasterStateMachine()).  In 
addition, I felt PM_WAIT_BACKENDS was appropriate because postmaster is 
actually waiting for backends to terminate after sending SIGQUIT.  The state 
name is natural.

I don't have strong objection to your idea if it makes the code cleaner and 
more understandable.  Thank you very much.

Regards
MauMau




On Thu, Jun 20, 2013 at 12:33 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
> I will go with 5 seconds, then.

I'm uncomfortable with this whole concept, and particularly with such
a short timeout.  On a very busy system, things can take a LOT longer
than we think they should; it can take 30 seconds or more just to get
a prompt back from a shell command.  5 seconds is the blink of an eye.

More generally, what do we think the point is of sending SIGQUIT
rather than SIGKILL in the first place, and why does that point cease
to be valid after 5 seconds?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Robert Haas <robertmhaas@gmail.com> writes:
> More generally, what do we think the point is of sending SIGQUIT
> rather than SIGKILL in the first place, and why does that point cease
> to be valid after 5 seconds?

Well, mostly it's about telling the client we're committing hara-kiri.
Without that, there's no very good reason to run quickdie() at all.

A practical issue with starting to send SIGKILL ourselves is that we
will no longer be able to reflexively diagnose "server process died
on signal 9" as "the linux OOM killer got you".  I'm not at all
convinced that the cases where SIGQUIT doesn't work are sufficiently
common to justify losing that property.
        regards, tom lane



On Fri, Jun 21, 2013 at 2:55 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> More generally, what do we think the point is of sending SIGQUIT
>> rather than SIGKILL in the first place, and why does that point cease
>> to be valid after 5 seconds?
>
> Well, mostly it's about telling the client we're committing hara-kiri.
> Without that, there's no very good reason to run quickdie() at all.

That's what I thought, too.  It seems to me that if we think that's
important, then it's important even if it takes more than 5 seconds
for some reason.

> A practical issue with starting to send SIGKILL ourselves is that we
> will no longer be able to reflexively diagnose "server process died
> on signal 9" as "the linux OOM killer got you".  I'm not at all
> convinced that the cases where SIGQUIT doesn't work are sufficiently
> common to justify losing that property.

I'm not, either.  Maybe this question will provoke many indignant
responses, but who in their right mind even uses immediate shutdown on
a production server with any regularity?  The shutdown checkpoint is
sometimes painfully long, but do you really want to run recovery just
to avoid it?  And in the rare case where an immediate shutdown fails
to work, what's wrong with "killall -9 postgres"?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Andres Freund escribió:
> On 2013-06-20 22:36:45 -0400, Alvaro Herrera wrote:

> > If we leave postmaster running after SIGKILLing its children, the only
> > thing we can do is have it continue to SIGKILL processes continuously
> > every few seconds until they die (or just sit around doing nothing until
> > they all die).  I don't think this will have a different effect than
> > postmaster going away trusting the first SIGKILL to do its job
> > eventually.
> 
> I think it should just wait till all its child processes are dead. No
> need to repeat sending the signals - as you say, that won't help.

OK, I can buy that.  So postmaster stays around waiting in ServerLoop
until all children are gone; and if any persists for whatever reason,
well, tough.

> What we could do to improve the robustness a bit - at least on linux -
> is to prctl(PR_SET_PDEATHSIG, SIGKILL) which would cause children to be
> killed if the postmaster goes away...

This is an interesting idea (even if it has no relationship to the patch
at hand).

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



The case where I wanted "routine" shutdown immediate (and I'm not sure I ever actually got it) was when we were using IBM HA/CMP, where I wanted a "terminate with a fair bit of prejudice".

If we know we want to "switch right away now", immediate seemed pretty much right.  I was fine with interrupting user sessions, and there wasn't as much going on in the way of system background stuff back then.

I wasn't keen on waiting on much of anything.  The background writer ought to be keeping things from being too desperately out of date.

If there's stuff worth waiting a few seconds for, I'm all ears.

But if I have to wait arbitrarily long, colour me unhappy.

If I have to distinguish, myself, between a checkpoint nearly done flushing and a backend that's stuck waiting forlornly for filesystem access, I'm inclined to "kill -9" and hope recovery doesn't take *too* long on the next node...

If shutting a server down in an emergency situation requires a DBA to look in, as opposed to init.d doing its thing, I think that's pretty much the same problem too.


Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> Andres Freund escribió:
>> What we could do to improve the robustness a bit - at least on linux -
>> is to prctl(PR_SET_PDEATHSIG, SIGKILL) which would cause children to be
>> killed if the postmaster goes away...

> This is an interesting idea (even if it has no relationship to the patch
> at hand).

The traditional theory has been that that would be less robust, not
more so.  Child backends are (mostly) able to carry out queries whether
or not the postmaster is around.  True, you can't make new connections,
but how does killing the existing children make that better?
        regards, tom lane



On Fri, Jun 21, 2013 at 5:44 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
>> Andres Freund escribió:
>>> What we could do to improve the robustness a bit - at least on linux -
>>> is to prctl(PR_SET_PDEATHSIG, SIGKILL) which would cause children to be
>>> killed if the postmaster goes away...
>
>> This is an interesting idea (even if it has no relationship to the patch
>> at hand).
>
> The traditional theory has been that that would be less robust, not
> more so.  Child backends are (mostly) able to carry out queries whether
> or not the postmaster is around.

I think that's the Tom Lane theory.  The Robert Haas theory is that if
the postmaster has died, there's no reason to suppose that it hasn't
corrupted shared memory on the way down, or that the system isn't
otherwise heavily fuxxored in some way.

> True, you can't make new connections,
> but how does killing the existing children make that better?

It allows you to start a new postmaster in a timely fashion, instead
of waiting for an idle connection that may not ever terminate without
operator intervention.

Even if it were possible to start a new postmaster that attached to
the existing shared memory segment and began spawning new children, I
think I'd be heavily in favor of killing the old ones off first and
doing a full system reset just for safety.  But it isn't, so what you
get is a crippled system that never recovers without operator
intervention.  And note that I'm not talking about "pg_ctl restart";
that fails because the children have the shmem segment still attached
and the postmaster, which is the only thing listed in the PID file, is
already dead.  I'm talking about "killall -9 postgres", at least.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



From: "Robert Haas" <robertmhaas@gmail.com>
> On Thu, Jun 20, 2013 at 12:33 PM, Alvaro Herrera
> <alvherre@2ndquadrant.com> wrote:
>> I will go with 5 seconds, then.
>
> I'm uncomfortable with this whole concept, and particularly with such
> a short timeout.  On a very busy system, things can take a LOT longer
> than we think they should; it can take 30 seconds or more just to get
> a prompt back from a shell command.  5 seconds is the blink of an eye.

I'm comfortable with 5 seconds.  We are talking about the interval between 
sending SIGQUIT to the children and then sending SIGKILL to them.  In most 
situations, the backends should terminate immediately.  However, as I said a 
few months ago, the ereport() call in quickdie() can deadlock indefinitely. 
This is a PostgreSQL bug.  In addition, Tom san was concerned that some 
PLs (PL/Perl or PL/Python?) block SIGQUIT while executing the UDF, so they 
may not be able to respond to the immediate shutdown request.

What DBAs want from "pg_ctl stop -mi" is to shutdown the database server as 
immediately as possible.  So I think 5 second is reasonable.

Regards
MauMau




From: "Robert Haas" <robertmhaas@gmail.com>
> On Fri, Jun 21, 2013 at 2:55 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Robert Haas <robertmhaas@gmail.com> writes:
>>> More generally, what do we think the point is of sending SIGQUIT
>>> rather than SIGKILL in the first place, and why does that point cease
>>> to be valid after 5 seconds?
>>
>> Well, mostly it's about telling the client we're committing hara-kiri.
>> Without that, there's no very good reason to run quickdie() at all.
>
> That's what I thought, too.  It seems to me that if we think that's
> important, then it's important even if it takes more than 5 seconds
> for some reason.

I guess Tom san is saying "we should be as kind as possible to the client, 
so try to notify the client of the shutdown".  Not complete kindness. 
Because the DBA requested immediate shutdown by running "pg_ctl stop -mi", 
the top priority is to shutdown the database server as immediately as 
possible.  The idea here is to try to be friendly to the client as long as 
the DBA can stand.  That's the 5 seconds.


>> A practical issue with starting to send SIGKILL ourselves is that we
>> will no longer be able to reflexively diagnose "server process died
>> on signal 9" as "the linux OOM killer got you".  I'm not at all
>> convinced that the cases where SIGQUIT doesn't work are sufficiently
>> common to justify losing that property.
>
> I'm not, either.  Maybe this question will provoke many indignant
> responses, but who in their right mind even uses immediate shutdown on
> a production server with any regularity?  The shutdown checkpoint is
> sometimes painfully long, but do you really want to run recovery just
> to avoid it?  And in the rare case where an immediate shutdown fails
> to work, what's wrong with "killall -9 postgres"?

Checkpoint is irrelevant here because we are discussing immediate shutdown. 
Some problems with "killall -9 postgres" are:

1. It's not available on Windows.
2. It may kill other database server instances running on the same machine.

Regards
MauMau





Robert Haas <robertmhaas@gmail.com> writes:
> On Fri, Jun 21, 2013 at 5:44 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> The traditional theory has been that that would be less robust, not
>> more so.  Child backends are (mostly) able to carry out queries whether
>> or not the postmaster is around.

> I think that's the Tom Lane theory.  The Robert Haas theory is that if
> the postmaster has died, there's no reason to suppose that it hasn't
> corrupted shared memory on the way down, or that the system isn't
> otherwise heavily fuxxored in some way.

Eh?  The postmaster does its level best never to touch shared memory
(after initialization anyway).

>> True, you can't make new connections,
>> but how does killing the existing children make that better?

> It allows you to start a new postmaster in a timely fashion, instead
> of waiting for an idle connection that may not ever terminate without
> operator intervention.

There may be something in that argument, but I find the other one
completely bogus.
        regards, tom lane



On 2013-06-21 23:19:27 -0400, Tom Lane wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> > On Fri, Jun 21, 2013 at 5:44 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> The traditional theory has been that that would be less robust, not
> >> more so.  Child backends are (mostly) able to carry out queries whether
> >> or not the postmaster is around.
> 
> > I think that's the Tom Lane theory.  The Robert Haas theory is that if
> > the postmaster has died, there's no reason to suppose that it hasn't
> > corrupted shared memory on the way down, or that the system isn't
> > otherwise heavily fuxxored in some way.
> 
> Eh?  The postmaster does its level best never to touch shared memory
> (after initialization anyway).

I am not sure that will never happen - but I think the chain of argument
misses the main point. Normally we rely on the postmaster to kill off
all other backends if a backend PANICs or segfaults for all the known
reasons. As soon as there's no postmaster anymore we lose that
capability.
And *that* is scary imo. Especially as I would say the chance of getting
PANICs or segfaults increases if there's no postmaster anymore since we
might reach code branches we otherwise won't.

> >> True, you can't make new connections,
> >> but how does killing the existing children make that better?
> 
> > It allows you to start a new postmaster in a timely fashion, instead
> > of waiting for an idle connection that may not ever terminate without
> > operator intervention.

And it's not always easy to figure out which cluster those backends
belong to if there are multiple postgres instances running as the same user.

Greetings,

Andres Freund

-- 
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



MauMau escribió:

> Are you suggesting simplifying the following part in ServerLoop()?
> I welcome the idea if this condition becomes simpler.  However, I
> cannot imagine how.

>  if (AbortStartTime > 0 &&  /* SIGKILL only once */
>   (Shutdown == ImmediateShutdown || (FatalError && !SendStop)) &&
>   now - AbortStartTime >= 10)
>  {
>   SignalAllChildren(SIGKILL);
>   AbortStartTime = 0;
>  }

Yes, that's what I'm suggesting.  I haven't tried coding it yet.

> I thought of adding some new state of pmState for some reason (that
> might be the same as your idea).
> But I refrained from doing that, because pmState has already many
> states.  I was afraid adding a new pmState value for this bug fix
> would complicate the state management (e.g. state transition in
> PostmasterStateMachine()).  In addition, I felt PM_WAIT_BACKENDS was
> appropriate because postmaster is actually waiting for backends to
> terminate after sending SIGQUIT.  The state name is natural.

Well, a natural state name is of no use if we're using it to indicate
two different states, which I think is what would be happening here.
Basically, with your patch, PM_WAIT_BACKENDS means two different things
depending on AbortStartTime.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



On 2013-06-21 16:56:57 -0400, Alvaro Herrera wrote:
> > What we could do to improve the robustness a bit - at least on linux -
> > is to prctl(PR_SET_PDEATHSIG, SIGKILL) which would cause children to be
> > killed if the postmaster goes away...
> 
> This is an interesting idea (even if it has no relationship to the patch
> at hand).

Well, we could just set the deathsig to SIGKILL and exit normally -
which would be a two-liner or so.
Admittedly the cross-platform issue makes this approach not so great...

Greetings,

Andres Freund

-- 
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



On Fri, Jun 21, 2013 at 11:19 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I think that's the Tom Lane theory.  The Robert Haas theory is that if
>> the postmaster has died, there's no reason to suppose that it hasn't
>> corrupted shared memory on the way down, or that the system isn't
>> otherwise heavily fuxxored in some way.
>
> Eh?  The postmaster does its level best never to touch shared memory
> (after initialization anyway).

And yet it certainly does - see pmsignal.c, for example.  Besides
which, as Andres points out, if the postmaster is dead, there is zip
for a guarantee that some OTHER backend hasn't panicked.  I think it's
just ridiculous to suppose that the system can run in any sort of
reasonable way without the postmaster.  The whole reason why we work
so hard to make sure that the postmaster doesn't die in the first
place is because we need it to clean up when things go horribly wrong.
If that cleanup function is important, then we need a living
postmaster at all times.  If it's not important, then our extreme
paranoia about what operations the postmaster is permitted to engage
in is overblown.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



On Fri, Jun 21, 2013 at 10:02 PM, MauMau <maumau307@gmail.com> wrote:
> I'm comfortable with 5 seconds.  We are talking about the interval between
> sending SIGQUIT to the children and then sending SIGKILL to them.  In most
> situations, the backends should terminate immediately.  However, as I said a
> few months ago, ereport() call in quickdie() can deadlock indefinitely. This
> is a PostgreSQL's bug.

So let's fix that bug.  Then we don't need this.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



From: "Alvaro Herrera" <alvherre@2ndquadrant.com>
> MauMau escribió:
>> I thought of adding some new state of pmState for some reason (that
>> might be the same as your idea).
>> But I refrained from doing that, because pmState has already many
>> states.  I was afraid adding a new pmState value for this bug fix
>> would complicate the state management (e.g. state transition in
>> PostmasterStateMachine()).  In addition, I felt PM_WAIT_BACKENDS was
>> appropriate because postmaster is actually waiting for backends to
>> terminate after sending SIGQUIT.  The state name is natural.
>
> Well, a natural state name is of no use if we're using it to indicate
> two different states, which I think is what would be happening here.
> Basically, with your patch, PM_WAIT_BACKENDS means two different things
> depending on AbortStartTime.

I probably got your feeling.  You aren't feeling comfortable about using the 
time variable AbortStartTime as a state variable to change postmaster's 
behavior, are you?

That may be right, but I'm not sure well... because in PM_WAIT_BACKENDS, as 
the name indicates, postmaster is indeed waiting for the backends to 
terminate regardless of AbortStartTime.  Apart from this, postmaster seems 
to change its behavior in the same pmState depending on other variables such 
as Shutdown and FatalError.  I'm not confident in which is better, so I 
won't object if the reviewer or committer modifies the code.

Regards
MauMau




From: "Robert Haas" <robertmhaas@gmail.com>
> On Fri, Jun 21, 2013 at 10:02 PM, MauMau <maumau307@gmail.com> wrote:
>> I'm comfortable with 5 seconds.  We are talking about the interval 
>> between
>> sending SIGQUIT to the children and then sending SIGKILL to them.  In 
>> most
>> situations, the backends should terminate immediately.  However, as I 
>> said a
>> few months ago, ereport() call in quickdie() can deadlock indefinitely. 
>> This
>> is a PostgreSQL's bug.
>
> So let's fix that bug.  Then we don't need this.

There are two ways to fix that bug.  You are suggesting 1 of the following 
two methods, aren't you?

1. (Robert-san's idea)
Upon receipt of immediate shutdown request, postmaster sends SIGKILL to its 
children.

2. (Tom-san's idea)
Upon receipt of immediate shutdown request, postmaster first sends SIGQUIT 
to its children, wait a while for them to terminate, then sends SIGKILL to 
them.


At first I proposed 1.  Then Tom san suggested 2 so that the server is as 
friendly to the clients as now by notifying them of the server shutdown.  I 
was convinced by that idea and chose 2.

Basically, I think both ideas are right.  They can solve the original 
problem.

However, re-considering the meaning of "immediate" shutdown, I feel 1 is a 
bit better, because trying to do something in quickdie() or somewhere does 
not match the idea of "immediate". We may not have to be friendly to the 
clients at the immediate shutdown.  The code gets much simpler.  In 
addition, it may be better that we similarly send SIGKILL in backend crash 
(FatalError) case, eliminate the use of SIGQUIT and remove quickdie() and 
other SIGQUIT handlers.

What do you think?  How should we make consensus and proceed?

Regards
MauMau





On Sun, Jun 23, 2013 at 01:55:19PM +0900, MauMau wrote:
> From: "Robert Haas" <robertmhaas@gmail.com>
>> On Fri, Jun 21, 2013 at 10:02 PM, MauMau <maumau307@gmail.com> wrote:
>>> I'm comfortable with 5 seconds.  We are talking about the interval  
>>> between
>>> sending SIGQUIT to the children and then sending SIGKILL to them.  In 
>>> most
>>> situations, the backends should terminate immediately.  However, as I 
>>> said a
>>> few months ago, ereport() call in quickdie() can deadlock 
>>> indefinitely. This
>>> is a PostgreSQL's bug.
>>
>> So let's fix that bug.  Then we don't need this.

[quoted part filtered to undo caps lock]
> There are two ways to fix that bug.  You are suggesting 1 of the 
> following two methods, aren't you?

I suspect Robert meant to prevent the deadlock by limiting quickdie() to
calling only async-signal-safe functions.  Implementing that looked grotty
when we last discussed it, though, and it would not help against a library or
trusted PL function suspending SIGQUIT.

> 1. (Robert-san's idea)
> Upon receipt of immediate shutdown request, postmaster sends SIGKILL to 
> its children.
>
> 2. (Tom-san's idea)
> Upon receipt of immediate shutdown request, postmaster first sends 
> SIGQUIT to its children, wait a while for them to terminate, then sends 
> SIGKILL to them.
>
>
> At first I proposed 1.  Then Tom san suggested 2 so that the server is as 
> friendly to the clients as now by notifying them of the server shutdown.  
> I was convinced by that idea and chose 2.
>
> Basically, I think both ideas are right.  They can solve the original  
> problem.

Agreed.

> However, re-considering the meaning of "immediate" shutdown, I feel 1 is 
> a bit better, because trying to do something in quickdie() or somewhere 
> does not match the idea of "immediate". We may not have to be friendly to 

Perhaps so, but more important than the definition of the word "immediate" is
what an immediate shutdown has long meant in PostgreSQL.  All this time it has
made some effort to send a message to the client.  Note also that the patch
under discussion affects both immediate shutdowns and the automatic reset in
response to a backend crash.  I think the latter is the more-important use
case, and the message is nice to have there.

> the clients at the immediate shutdown.  The code gets much simpler.  In  
> addition, it may be better that we similarly send SIGKILL in backend 
> crash (FatalError) case, eliminate the use of SIGQUIT and remove 
> quickdie() and other SIGQUIT handlers.

My take is that the client message has some value, and we shouldn't just
discard it to simplify the code slightly.  Finishing the shutdown quickly has
value, of course.  The relative importance of those things should guide the
choice of a timeout under method #2, but I don't see a rigorous way to draw
the line.  I feel five seconds is, if anything, too high.

What about using deadlock_timeout?  It constitutes a rough barometer on the
system's tolerance for failure detection latency, and we already overload it
by having it guide log_lock_waits.  The default of 1s makes sense to me for
this purpose, too.  We can always add a separate immediate_shutdown_timeout if
there's demand, but I don't expect such demand.  (If we did add such a GUC,
setting it to zero could be the way to request method 1.  If some folks
strongly desire method 1, that's probably the way to offer it.)
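In configuration terms, the proposal reuses settings that already exist (both GUCs below are real; their overloaded third meaning as a SIGQUIT-to-SIGKILL grace period is only the proposal here):

```
# postgresql.conf
# deadlock_timeout already does double duty: it sets the deadlock-check
# delay and, with log_lock_waits on, the lock-wait logging threshold.
# The proposal would overload it a third time as the grace period before
# the postmaster escalates from SIGQUIT to SIGKILL.
deadlock_timeout = 1s
log_lock_waits = on
```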

Thanks,
nm

-- 
Noah Misch
EnterpriseDB                                 http://www.enterprisedb.com



MauMau wrote:
> From: "Alvaro Herrera" <alvherre@2ndquadrant.com>

> >Actually, in further testing I noticed that the fast-path you introduced
> >in BackendCleanup (or was it HandleChildCrash?) in the immediate
> >shutdown case caused postmaster to fail to clean up properly after
> >sending the SIGKILL signal, so I had to remove that as well.  Was there
> >any reason for that?
>
> You are talking about the code below at the beginning of
> HandleChildCrash(), aren't you?
>
> /* Do nothing if the child terminated due to immediate shutdown */
> if (Shutdown == ImmediateShutdown)
>  return;
>
> If my memory is correct, this was necessary; without this,
> HandleChildCrash() or LogChildExit() left extra messages in the
> server log.
>
> LOG:  %s (PID %d) exited with exit code %d
> LOG:  terminating any other active server processes
>
> These messages are output because postmaster sends SIGQUIT to its
> children and wait for them to terminate.  The children exit with
> non-zero status when they receive SIGQUIT.

Yeah, I see that --- after removing that early exit, there are unwanted
messages.  And in fact there are some signals sent that weren't
previously sent.  Clearly we need something here: if we're in immediate
shutdown handler, don't signal anyone (because they have already been
signalled) and don't log any more messages; but the cleaning up of
postmaster's process list must still be carried out.

Would you please add that on top of the attached cleaned up version of
your patch?


Noah Misch wrote:
> On Sun, Jun 23, 2013 at 01:55:19PM +0900, MauMau wrote:

> > the clients at the immediate shutdown.  The code gets much simpler.  In
> > addition, it may be better that we similarly send SIGKILL in backend
> > crash (FatalError) case, eliminate the use of SIGQUIT and remove
> > quickdie() and other SIGQUIT handlers.
>
> My take is that the client message has some value, and we shouldn't just
> discard it to simplify the code slightly.  Finishing the shutdown quickly has
> value, of course.  The relative importance of those things should guide the
> choice of a timeout under method #2, but I don't see a rigorous way to draw
> the line.  I feel five seconds is, if anything, too high.

There's obviously a lot of disagreement on 5 seconds being too high or
too low.  We have just followed SysV init's precedent of waiting 5
seconds by default between sending signals TERM and QUIT during a
shutdown.  I will note that during a normal shutdown, services are
entitled to do much more work than just signal all their children to
exit immediately; and yet I don't find much evidence that this period is
inordinately short.  I don't feel strongly that it couldn't be shorter,
though.

> What about using deadlock_timeout?  It constitutes a rough barometer on the
> system's tolerance for failure detection latency, and we already overload it
> by having it guide log_lock_waits.  The default of 1s makes sense to me for
> this purpose, too.  We can always add a separate immediate_shutdown_timeout if
> there's demand, but I don't expect such demand.  (If we did add such a GUC,
> setting it to zero could be the way to request method 1.  If some folks
> strongly desire method 1, that's probably the way to offer it.)

I dunno.  Having this be configurable seems overkill to me.  But perhaps
that's the way to satisfy most people: some people could set it very
high so that they could have postmaster wait longer if they believe
their server is going to be overloaded; people wishing immediate SIGKILL
could get that too, as you describe.

I think this should be a separate patch, however.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

From: "Alvaro Herrera" <alvherre@2ndquadrant.com>
> Yeah, I see that --- after removing that early exit, there are unwanted
> messages.  And in fact there are some signals sent that weren't
> previously sent.  Clearly we need something here: if we're in immediate
> shutdown handler, don't signal anyone (because they have already been
> signalled) and don't log any more messages; but the cleaning up of
> postmaster's process list must still be carried out.
>
> Would you please add that on top of the attached cleaned up version of
> your patch?

Thanks.  I'll do that tomorrow at the earliest.

> Noah Misch wrote:
>> On Sun, Jun 23, 2013 at 01:55:19PM +0900, MauMau wrote:
>
>> > the clients at the immediate shutdown.  The code gets much simpler.  In
>> > addition, it may be better that we similarly send SIGKILL in backend
>> > crash (FatalError) case, eliminate the use of SIGQUIT and remove
>> > quickdie() and other SIGQUIT handlers.
>>
>> My take is that the client message has some value, and we shouldn't just
>> discard it to simplify the code slightly.  Finishing the shutdown quickly 
>> has
>> value, of course.  The relative importance of those things should guide 
>> the
>> choice of a timeout under method #2, but I don't see a rigorous way to 
>> draw
>> the line.  I feel five seconds is, if anything, too high.
>
> There's obviously a lot of disagreement on 5 seconds being too high or
> too low.  We have just followed SysV init's precedent of waiting 5
> seconds by default between sending signals TERM and QUIT during a
> shutdown.  I will note that during a normal shutdown, services are
> entitled to do much more work than just signal all their children to
> exit immediately; and yet I don't find much evidence that this period is
> inordinately short.  I don't feel strongly that it couldn't be shorter,
> though.
>
>> What about using deadlock_timeout?  It constitutes a rough barometer on 
>> the
>> system's tolerance for failure detection latency, and we already overload 
>> it
>> by having it guide log_lock_waits.  The default of 1s makes sense to me 
>> for
>> this purpose, too.  We can always add a separate 
>> immediate_shutdown_timeout if
>> there's demand, but I don't expect such demand.  (If we did add such a 
>> GUC,
>> setting it to zero could be the way to request method 1.  If some folks
>> strongly desire method 1, that's probably the way to offer it.)
>
> I dunno.  Having this be configurable seems overkill to me.  But perhaps
> that's the way to satisfy most people: some people could set it very
> high so that they could have postmaster wait longer if they believe
> their server is going to be overloaded; people wishing immediate SIGKILL
> could get that too, as you describe.
>
> I think this should be a separate patch, however.

I think so, too.  We can add a parameter later if we find it highly 
necessary after some experience in the field.


Regards
MauMau





Hi, Alvaro san,

From: "Alvaro Herrera" <alvherre@2ndquadrant.com>
> MauMau wrote:
> Yeah, I see that --- after removing that early exit, there are unwanted
> messages.  And in fact there are some signals sent that weren't
> previously sent.  Clearly we need something here: if we're in immediate
> shutdown handler, don't signal anyone (because they have already been
> signalled) and don't log any more messages; but the cleaning up of
> postmaster's process list must still be carried out.
>
> Would you please add that on top of the attached cleaned up version of
> your patch?

I did this.  Please find attached the revised patch.  I modified
HandleChildCrash().  I tested the immediate shutdown, and the child cleanup
succeeded.

In addition, I added an if condition at the end of the function.  This is to
avoid resetting AbortStartTime every time a child terminates.

Regards
MauMau


MauMau wrote:

Hi,

> I did this.  Please find attached the revised patch.  I modified
> HandleChildCrash().  I tested the immediate shutdown, and the child
> cleanup succeeded.

Thanks, committed.

There are two matters pending here:

1. do we want postmaster to exit immediately after sending the SIGKILL,
or hang around until all children are gone?

2. how far do we want to backpatch this, if at all?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



On Fri, Jun 28, 2013 at 6:00 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
> MauMau wrote:
>
> Hi,
>
>> I did this.  Please find attached the revised patch.  I modified
>> HandleChildCrash().  I tested the immediate shutdown, and the child
>> cleanup succeeded.
>
> Thanks, committed.
>
> There are two matters pending here:
>
> 1. do we want postmaster to exit immediately after sending the SIGKILL,
> or hang around until all children are gone?
>
> 2. how far do we want to backpatch this, if at all?

Considering that neither Tom nor I was convinced that this was a good
idea AT ALL, I'm surprised you committed it in the first place.  I'd
be more inclined to revert it than back-patch it, but at least if we
only change it in HEAD we have some chance of finding out whether or
not it is in fact a bad idea before it's too late to change our mind.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company