On 2023-03-13 09:21:18 -0800, Israel Brewster wrote:
> I’m running a postgresql 13 database on an Ubuntu 20.04 VM that is a bit more
> memory constrained than I would like, such that every week or so the various
> processes running on the machine will align badly and the OOM killer will kick
> in, killing off postgresql, as per the following journalctl output:
>
> Mar 12 04:04:23 novarupta systemd[1]: postgresql@13-main.service: A process of
> this unit has been killed by the OOM killer.
> Mar 12 04:04:32 novarupta systemd[1]: postgresql@13-main.service: Failed with
> result 'oom-kill'.
> Mar 12 04:04:32 novarupta systemd[1]: postgresql@13-main.service: Consumed 5d
> 17h 48min 24.509s CPU time.
>
> And the service is no longer running.

I might be misreading this, but it looks to me as if systemd detects that
*some* process in the unit's process group was killed by the OOM killer
and then stops the whole service.

Can you check which process was actually killed? If it's not the
postmaster, setting OOMScoreAdjust is probably useless.

(I tried searching the web for the error messages and didn't find
anything useful.)

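
One way to check, sketched under a few assumptions: the kernel logs a line
containing "Killed process <pid> (<name>)" when the OOM killer fires, and
the kernel messages from that night are still in the journal. The helper
name oom_victims is made up for illustration, and the exact wording of the
kernel message can differ between kernel versions.

```shell
# Hypothetical helper: read kernel log lines on stdin and print
# "<pid> <command name>" for every process the OOM killer reports killing.
# Assumes the kernel's "Killed process <pid> (<name>)" wording.
oom_victims() {
    sed -n 's/.*[Kk]illed process \([0-9][0-9]*\) (\([^)]*\)).*/\1 \2/p'
}

# Usage, with the time window taken from the journalctl output quoted above:
#   journalctl -k --since "2023-03-12 04:00" --until "2023-03-12 04:10" | oom_victims
```

If the victim was a backend rather than the postmaster, the PostgreSQL log
should normally contain a matching "terminated by signal 9" entry for that
PID, written by the postmaster.
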
> 2) My first thought was to simply have systemd restart postgresql whenever it
> is killed like this, which is easy enough. Then I looked at the default unit
> file, and found these lines:
>
> # prevent OOM killer from choosing the postmaster (individual backends will
> # reset the score to 0)
> OOMScoreAdjust=-900
> # restarting automatically will prevent "pg_ctlcluster ... stop" from working,
> # so we disable it here.

I never call pg_ctlcluster directly, so for me that probably wouldn't be
a good reason to leave automatic restarts disabled.

> Also, the postmaster will restart by itself on most
> # problems anyway, so it is questionable if one wants to enable external
> # automatic restarts.
> #Restart=on-failure

So I'd try enabling Restart=on-failure despite the comment.

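
A minimal sketch of how that could be done without editing the packaged
unit file, assuming a systemd drop-in directory for the unit named in the
journal output above. The helper name write_restart_dropin and the file
name restart.conf are made up for illustration, and I haven't verified
how this interacts with pg_ctlcluster on this setup.

```shell
# Hypothetical helper: write a drop-in that enables automatic restarts.
# $1 is the drop-in directory of the unit to override.
write_restart_dropin() {
    dir=$1
    mkdir -p "$dir" &&
    printf '[Service]\nRestart=on-failure\n' > "$dir/restart.conf"
}

# Usage (as root), followed by a reload so systemd picks up the change:
#   write_restart_dropin /etc/systemd/system/postgresql@13-main.service.d
#   systemctl daemon-reload
```

Keeping the change in a drop-in leaves the distribution's unit file
untouched, so a package upgrade won't silently revert it.
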
hp
--
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp@hjp.at         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"