Re: Logical replication launcher did not automatically restart when got SIGKILL - Mailing list pgsql-hackers

From	Fujii Masao
Subject	Re: Logical replication launcher did not automatically restart when got SIGKILL
Date	July 15, 2025 18:08:13
Msg-id	1e1746d4-6273-4eb7-a870-912bf39662ee@oss.nttdata.com
In response to	Re: Logical replication launcher did not automatically restart when got SIGKILL (shveta malik <shveta.malik@gmail.com>)
Responses	Re: Logical replication launcher did not automatically restart when got SIGKILL
List	pgsql-hackers

On 2025/07/15 19:34, shveta malik wrote:
> On Tue, Jul 15, 2025 at 2:56 PM cca5507 <cca5507@qq.com> wrote:
>>
>> Hi, hackers
>>
>> I found the $SUBJECT, the main reason is that RegisteredBgWorker::rw_pid has not been cleaned.
>>
>> Attach a patch to fix it.

Thanks for the report!

This issue appears to have been introduced by commit 28a520c0b77. As a result,
not only the logical replication launcher but also other background workers
(like autoprewarm) may fail to restart after a server crash.

> Thank You for reporting this. The problem exists and the patch works
> as expected.
> 
> In the patch, we are resetting the PID during shared memory
> initialization. Is there a better place to handle PID reset in the
> case of a SIGKILL, possibly within a cleanup flow? For example, during
> a regular shutdown, we reset the launcher (background worker) PID in
> CleanupBackend(). Or is this the only possibility?

From a quick look at the code, it seems that the second half of CleanupBackend()
is responsible for cleaning up background workers and resetting rw_pid to 0.
However, in the crash case, the function exits immediately after calling
HandleChildCrash(), skipping that cleanup:

    if (crashed)
    {
        HandleChildCrash(bp_pid, exitstatus, procname);
        return;
    }

Could this be the problem? Shouldn't the background worker cleanup still
happen even in the crash case?
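
To make the question concrete, here is a minimal standalone sketch of the
control flow, not the actual postmaster code. Only rw_pid is a name taken
from this thread; ToyWorker and the toy_* functions are hypothetical, and
the sketch assumes that a worker is considered for restart only when its
recorded pid is 0.

```c
#include <stdbool.h>
#include <sys/types.h>

/*
 * Toy stand-in for a registered background worker. Only the field name
 * rw_pid comes from the thread; the struct itself is hypothetical.
 */
typedef struct ToyWorker
{
    pid_t       rw_pid;     /* 0 means "not running" */
} ToyWorker;

/*
 * Models the early return discussed above: in the crash case the function
 * returns before the pid is reset.
 */
static void
toy_cleanup_early_return(ToyWorker *w, bool crashed)
{
    if (crashed)
        return;             /* stands in for HandleChildCrash() + return */

    w->rw_pid = 0;
}

/*
 * Models the alternative raised in the question: reset the pid on both
 * the normal and the crash path.
 */
static void
toy_cleanup_always_reset(ToyWorker *w, bool crashed)
{
    (void) crashed;         /* the reset no longer depends on this flag */
    w->rw_pid = 0;
}

/*
 * Assumed restart rule: only a worker with no recorded pid is eligible
 * to be started again.
 */
static bool
toy_can_restart(const ToyWorker *w)
{
    return w->rw_pid == 0;
}
```

With toy_cleanup_early_return() and crashed = true, rw_pid keeps its old
value and toy_can_restart() returns false. With toy_cleanup_always_reset()
it returns true. Whether the real fix belongs in CleanupBackend() or in
shared memory initialization is exactly the open question.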

Regards,

-- 
Fujii Masao
NTT DATA Japan Corporation
