Re: Refactoring postmaster's code to cleanup after child exit - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Refactoring postmaster's code to cleanup after child exit
Date
Msg-id e9e58efa-a364-4f4f-b112-a11d62af952a@iki.fi
Whole thread Raw
In response to Re: Refactoring postmaster's code to cleanup after child exit  (Thomas Munro <thomas.munro@gmail.com>)
Responses Re: Refactoring postmaster's code to cleanup after child exit
List pgsql-hackers
On 08/08/2024 13:47, Thomas Munro wrote:
>      On Windows, if a child process exits with ERROR_WAIT_NO_CHILDREN, it's
>      now logged with that exit code, instead of 0. Also, if a bgworker
>      exits with ERROR_WAIT_NO_CHILDREN, it's now treated as crashed and is
>      restarted. Previously it was treated as a normal exit.
> 
> Interesting.  So when that error was first specially handled in this thread:
> 
>
https://www.postgresql.org/message-id/flat/AANLkTimCTkNKKrHCd3Ot6kAsrSS7SeDpOTcaLsEP7i%2BM%40mail.gmail.com#41f60947571b75377f04af67ba6baf40
> 
> ... it went from being considered a crash, to being considered like
> exit(0).  It's true that shared memory can't be corrupted by a process
> that never enters main(), but it's better not to hide the true reason
> for the failure (if it is still possible -- I don't find many
> references to that phenomenon in recent times).  Clobbering exitstatus
> with 0 doesn't seem right at all, now that we have background workers
> whose restart behaviour is affected by that.

I adjusted this ERROR_WAIT_NO_CHILDREN a little more, to avoid logging 
the death of the child twice in some cases.

>> * v3-0003-Fix-comment-on-processes-being-kept-over-a-restar.patch
> 
> +1

Committed the patches up to and including this one, with tiny comment 
changes.

>> * v3-0004-Consolidate-postmaster-code-to-launch-background-.patch
> 
>      Much of the code in process_pm_child_exit() to launch replacement
>      processes when one exits or when progressing to next postmaster state
>      was unnecessary, because the ServerLoop will launch any missing
>      background processes anyway. Remove the redundant code and let
>      ServerLoop handle it.

I'm going to work a little more on the comments on this one before 
committing; I had just moved all the "If we have lost the XXX, try to 
start a new one" comments as is, but they look pretty repetitive now.

Thanks for the review!

-- 
Heikki Linnakangas
Neon (https://neon.tech)




pgsql-hackers by date:

Previous
From: Peter Geoghegan
Date:
Subject: Re: Adding skip scan (including MDAM style range skip scan) to nbtree
Next
From: Thomas Munro
Date:
Subject: Re: Remaining dependency on setlocale()