Re: bgworker crashed or not? - Mailing list pgsql-hackers

From Petr Jelinek
Subject Re: bgworker crashed or not?
Date
Msg-id 536A9DCE.7060304@2ndquadrant.com
In response to Re: bgworker crashed or not?  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: bgworker crashed or not?
List pgsql-hackers
On 07/05/14 22:32, Robert Haas wrote:
> On Tue, May 6, 2014 at 8:25 PM, Petr Jelinek <petr@2ndquadrant.com> wrote:
> I've committed the portion of your patch which does this, with pretty
> extensive changes.  I moved the function which resets the crash times
> to bgworker.c, renamed it, and made it just reset all of the crash
> times unconditionally; there didn't seem to be any point in skipping
> the irrelevant ones, so it seemed best to keep things simple.  I also
> moved the call from reaper() where you had it to
> PostmasterStateMachine(), because the old placement did not work for
> me when I carried out the following steps:
>
> 1. Kill a background worker with a 60-second restart time using
> SIGTERM, so that it does exit(1).
> 2. Before it restarts, kill a regular backend with SIGQUIT.
>
> Let me know if you think I've got it wrong.
>

No, I think it's fine. I didn't try that combination; I just wanted to
put the call as deep in the call chain as possible.
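
For readers following along, here is a rough sketch of the "reset all crash
times unconditionally" idea described above. It is a simplified model, not
the committed code: SketchWorker, crashed_at, restart_secs and both function
names are hypothetical, and it assumes that a cleared crash time means the
worker no longer has to wait out its restart interval.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for the postmaster's per-worker bookkeeping. */
typedef struct SketchWorker
{
    time_t crashed_at;    /* 0 means "no crash recorded" */
    int    restart_secs;  /* restart interval in seconds */
} SketchWorker;

/* Clear every recorded crash time, without checking which ones matter. */
static void
sketch_reset_crash_times(SketchWorker *workers, size_t n)
{
    assert(workers != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        workers[i].crashed_at = 0;
}

/* Assumed rule: a worker may start once its interval has elapsed,
 * or right away if no crash time is recorded. */
static int
sketch_can_start(const SketchWorker *w, time_t now)
{
    return w->crashed_at == 0 || now - w->crashed_at >= w->restart_secs;
}
```

In this model, a worker with a 60-second interval that crashed 10 seconds
ago is not startable until the reset runs, after which it is.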


>>> (2) If a shmem-connected backend fails to release the deadman switch
>>> or exits with an exit code other than 0 or 1, we crash-and-restart.  A
>>> non-shmem-connected backend never causes a crash-and-restart.
>>
>> +1
>
> I did this as a separate commit,
> e2ce9aa27bf20eff2d991d0267a15ea5f7024cd7, just moving the check for
> ReleasePostmasterChildSlot inside the if statement.   Then I realized
> that was bogus, so I just pushed
> eee6cf1f337aa488a20e9111df446cdad770e645 to fix that.  Hopefully it's
> OK now.

The fixed one looks ok to me.
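
To spell out rule (2) as I read it, a minimal sketch of the decision follows.
The function name and its three parameters are hypothetical and only model
the rule as quoted; this is not the code in either commit.

```c
#include <stdbool.h>

/*
 * Hypothetical model of rule (2): decide whether a child's exit should
 * trigger a crash-and-restart cycle.
 *
 *   shmem_connected - the child was attached to shared memory
 *   slot_released   - the child released its "deadman switch" slot
 *   exit_code       - the child's exit status
 */
static bool
sketch_needs_crash_restart(bool shmem_connected, bool slot_released,
                           int exit_code)
{
    /* A non-shmem-connected child never causes a crash-and-restart. */
    if (!shmem_connected)
        return false;

    /* Failing to release the deadman switch always counts as a crash. */
    if (!slot_released)
        return true;

    /* Otherwise only exit codes other than 0 and 1 count as a crash. */
    return exit_code != 0 && exit_code != 1;
}
```

Note that the slot check only matters inside the shmem-connected branch,
which is the ordering the quoted rule implies.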

>
>>> (3) When a background worker exits without triggering a
>>> crash-and-restart, an exit code of precisely 0 causes the worker to be
>>> unregistered; any other exit code has no special effect, so
>>> bgw_restart_time controls.
>>
>> +1
>
> This isn't done yet.
>

Unless I am missing something, this change was included in every patch I
sent: setting rw->rw_terminate = true; in CleanupBackgroundWorker for a
zero exit code, plus the comment changes. Or do you have objections to
this approach?
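
In case it helps the discussion, this is roughly the behaviour I mean for
rule (3), reduced to a standalone sketch. The struct and function here are
hypothetical simplifications (the real struct has more fields, and this is
not the attached patch); only the idea of setting a terminate flag on exit
code 0 comes from the text above.

```c
#include <stdbool.h>

/* Hypothetical, simplified worker record; not the real postmaster struct. */
typedef struct SketchRegisteredWorker
{
    bool terminate;      /* modeled on the rw_terminate flag */
    int  restart_secs;   /* modeled on bgw_restart_time */
} SketchRegisteredWorker;

/* Handle a worker exit that did not trigger a crash-and-restart. */
static void
sketch_cleanup_worker(SketchRegisteredWorker *rw, int exit_code)
{
    /* Exit code of precisely 0: mark the worker for unregistration. */
    if (exit_code == 0)
        rw->terminate = true;

    /* Any other exit code has no special effect here, so the restart
     * interval alone decides when the worker runs again. */
}
```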

Anyway, the missing parts are attached.

--
  Petr Jelinek                  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services

