Thread: Re: [HACKERS] strange parallel query behavior after OOM crashes

Re: [HACKERS] strange parallel query behavior after OOM crashes

From
Robert Haas
Date:
[ Adding Julien, whose patch this was. ]

On Thu, Apr 6, 2017 at 5:34 AM, Kuntal Ghosh <kuntalghosh.2007@gmail.com> wrote:
> While performing StartupDatabase, both master and standby server
> behave in similar way till postmaster spawns startup process.
> In master, startup process completes its job and dies. As a result,
> reaper is called which in turn calls maybe_start_bgworker().
> In standby, after getting a valid snapshot, startup process sends
> postmaster a signal to enable connections. Signal handler in
> postmaster calls maybe_start_bgworker().
> In maybe_start_bgworker(), if we find a crashed bgworker(crashed_at !=
> 0) with a NEVER RESTART flag, we call ForgetBackgroundWorker().to
> forget the bgworker process.
>
> I've attached the patch for adding an argument in
> ForgetBackgroundWorker() to indicate a crashed situation. Based on
> that, we can take the necessary actions. I've not included the Assert
> statement in this patch.

This doesn't seem right, because this will potentially pass wasCrashed
= true even if there has been no crash-and-restart cycle.  The place
where you're passing "true" only knows that rw->rw_crashed_at != 0 &&
rw->rw_worker.bgw_restart_time == BGW_NEVER_RESTART, but that doesn't
prove much of anything.  Some *other* background worker could have
crashed, rather than the one at issue.  Even there's only one worker
involved, the test for whether to call HandleChildCrash() involves
other factors:

        if ((rw->rw_worker.bgw_flags & BGWORKER_SHMEM_ACCESS) != 0)
        {
            if (!EXIT_STATUS_0(exitstatus) && !EXIT_STATUS_1(exitstatus))
            {
                HandleChildCrash(pid, exitstatus, namebuf);
                return true;
            }
        }

        if (!ReleasePostmasterChildSlot(rw->rw_child_slot) &&
            (rw->rw_worker.bgw_flags & BGWORKER_SHMEM_ACCESS) != 0)
        {
            HandleChildCrash(pid, exitstatus, namebuf);
            return true;
        }

After some study, I think the solution here is as follows:

1. Forget BGW_NEVER_RESTART workers in
ResetBackgroundWorkerCrashTimes() rather than leaving them around to
be cleaned up after the conclusion of the restart, so that they go
away before rather than after shared memory is reset.

2. Don't allow BGWORKER_CLASS_PARALLEL workers to be anything other
than BGW_NEVER_RESTART, so that (1) suffices to guarantee that, after
recovering from a crash, 0 is the correct starting value for
parallel_register_count.  (Alternatively, we could iterate through the
worker list and compute the correct starting value for
parallel_register_count, but that seems silly since we only ever
expect BGW_NEVER_RESTART here anyway.)

The attached patch seems to fix the problem for me.  I'll commit it
tomorrow unless somebody sees a problem with it; if that happens, I'll
provide some other kind of update tomorrow.  [For clarity, for
official status update purposes, I am setting a next-update date of
tomorrow, 2017-04-11.]

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

Re: [HACKERS] strange parallel query behavior after OOM crashes

From
Neha Khatri
Date:
On Tue, Apr 11, 2017 at 1:16 AM, Robert Haas <robertmhaas@gmail.com> wrote:

1. Forget BGW_NEVER_RESTART workers in
ResetBackgroundWorkerCrashTimes() rather than leaving them around to
be cleaned up after the conclusion of the restart, so that they go
away before rather than after shared memory is reset.

Now with this, would it still be required to forget BGW_NEVER_RESTART workers in maybe_start_bgworker():

                if (rw->rw_crashed_at != 0)
{
if (rw->rw_worker.bgw_restart_time == BGW_NEVER_RESTART)
{
ForgetBackgroundWorker(&iter);
continue;
}
                     ......
                }

Regards,
Neha

Re: [HACKERS] strange parallel query behavior after OOM crashes

From
Robert Haas
Date:
On Mon, Apr 10, 2017 at 2:32 PM, Neha Khatri <nehakhatri5@gmail.com> wrote:
> On Tue, Apr 11, 2017 at 1:16 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> 1. Forget BGW_NEVER_RESTART workers in
>> ResetBackgroundWorkerCrashTimes() rather than leaving them around to
>> be cleaned up after the conclusion of the restart, so that they go
>> away before rather than after shared memory is reset.
>
> Now with this, would it still be required to forget BGW_NEVER_RESTART
> workers in maybe_start_bgworker():
>
>                 if (rw->rw_crashed_at != 0)
> {
> if (rw->rw_worker.bgw_restart_time == BGW_NEVER_RESTART)
> {
> ForgetBackgroundWorker(&iter);
> continue;
> }
>                      ......
>                 }

Well, as noted before, not every background worker failure leads to a
crash-and-restart; for example, a non-shmem-connected worker or one
that exits with status 1 will set rw_crashed_at but will not cause a
crash-and-restart cycle.   It's not 100% clear to me whether the code
you're talking about can be reached in those cases, but I wouldn't bet
against it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] strange parallel query behavior after OOM crashes

From
Kuntal Ghosh
Date:
On Tue, Apr 11, 2017 at 2:36 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Apr 10, 2017 at 2:32 PM, Neha Khatri <nehakhatri5@gmail.com> wrote:
>> On Tue, Apr 11, 2017 at 1:16 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>>> 1. Forget BGW_NEVER_RESTART workers in
>>> ResetBackgroundWorkerCrashTimes() rather than leaving them around to
>>> be cleaned up after the conclusion of the restart, so that they go
>>> away before rather than after shared memory is reset.
>>
+1 for the solution.

>> Now with this, would it still be required to forget BGW_NEVER_RESTART
>> workers in maybe_start_bgworker():
>>
>>                 if (rw->rw_crashed_at != 0)
>> {
>> if (rw->rw_worker.bgw_restart_time == BGW_NEVER_RESTART)
>> {
>> ForgetBackgroundWorker(&iter);
>> continue;
>> }
>>                      ......
>>                 }
>
> Well, as noted before, not every background worker failure leads to a
> crash-and-restart; for example, a non-shmem-connected worker or one
> that exits with status 1 will set rw_crashed_at but will not cause a
> crash-and-restart cycle.   It's not 100% clear to me whether the code
> you're talking about can be reached in those cases, but I wouldn't bet
> against it.
I think that for above-mentioned background workers, we follow a
different path to call ForgetBackgroundWorker().
CleanupBackgroundWorker()
- ReportBackgroundWorkerExit()- ForgetBackgroundWorker()
But, I'm not sure for any other cases.

Should we include the assert statement as well?

-- 
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com