Re: /proc/self/oom_adj is deprecated in newer Linux kernels - Mailing list pgsql-hackers

From Gurjeet Singh
Subject Re: /proc/self/oom_adj is deprecated in newer Linux kernels
Msg-id CABwTF4W8YBEfgHm64CcZLvYHBj7yRYKM9ZcThW2aNsdDPeKixw@mail.gmail.com
In response to Re: /proc/self/oom_adj is deprecated in newer Linux kernels  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: /proc/self/oom_adj is deprecated in newer Linux kernels  (David G Johnston <david.g.johnston@gmail.com>)
List pgsql-hackers
On Tue, Jun 10, 2014 at 11:14 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> In my view, the root-owned startup script grants OOM exemption to
> the postmaster because it *knows* that the postmaster's children
> will drop the exemption.  If that trust can be violated because some
> clueless DBA decided to frob a GUC setting, I'd be a lot more hesitant
> about giving the postmaster the exemption in the first place.

Even if the clueless DBA tries to set a backend's oom_score_adj below
the postmaster's, the Linux kernel prevents it. I demonstrate that in
the session transcript below. I used the GUC as well as a plain command
line to try to set a child's badness (oom_score_adj) lower than the
postmaster's, and Linux disallows that unless I use root's powers.

So the argument that this GUC is a security concern can be ignored.
The root user (the one with control of the start script) still controls
the lowest badness setting of all Postgres processes. If the change is
attempted at fork_process time and the kernel refuses it, the child
process simply keeps the badness setting it inherited from its parent.

Best regards,

# Set postmaster badness: -1000
$ sudo echo -1000 > /proc/$$/oom_score_adj ; cat /proc/self/oom_score_adj
-1000

# Set backend badness the same as postmaster: -1000
$ perl -pi -e 's/linux_oom_score_adj = .*/linux_oom_score_adj = -1000/' $PGDATA/postgresql.conf ; grep oom $PGDATA/postgresql.conf
linux_oom_score_adj = -1000

$ pgstop; pgstart
pg_ctl: server is running (PID: 65355)
/home/gurjeet/dev/pgdbuilds/oom_guc/db/bin/postgres "-D" "/home/gurjeet/dev/pgdbuilds/oom_guc/db/data"
waiting for server to shut down.... done
server stopped
pg_ctl: no server running
waiting for server to start.... done
server started
Database cluster state:               in production

# Backends and the postmaster have the same badness; lower than default 0
$ for p in $(echo $(pgserverPIDList)| cut -d , --output-delimiter=' ' -f 1-); do echo $p $(cat /proc/$p/oom_score_adj) $(cat /proc/$p/cmdline); done
537 -1000 /home/gurjeet/dev/pgdbuilds/oom_guc/db/bin/postgres-D/home/gurjeet/dev/pgdbuilds/oom_guc/db/data
539 -1000 postgres: checkpointer process
540 -1000 postgres: writer process
541 -1000 postgres: wal writer process
542 -1000 postgres: autovacuum launcher process
543 -1000 postgres: stats collector process
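
For anyone who wants to repeat the listing without the pgserverPIDList helper (which is local to this environment), here is a rough equivalent. show_badness is a hypothetical function name, and it takes the PIDs as arguments instead of discovering them. It also converts the NUL separators in /proc/PID/cmdline to spaces, which is why the postmaster line above shows "postgres-D/home/..." run together.

```bash
# show_badness is a hypothetical helper: print PID, oom_score_adj, and the
# command line for each PID given as an argument.
show_badness() {
    for pid in "$@"; do
        adj=$(cat "/proc/$pid/oom_score_adj")
        # cmdline separates arguments with NUL bytes; turn them into spaces
        # and drop the trailing one.
        cmd=$(tr '\0' ' ' < "/proc/$pid/cmdline" | sed 's/ *$//')
        printf '%s %s %s\n' "$pid" "$adj" "$cmd"
    done
}

# Example: inspect the current shell.
show_badness "$$"
```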

# Set postmaster badness: -100
$ sudo echo -100 > /proc/$$/oom_score_adj ; cat /proc/self/oom_score_adj
-100

# Set backend badness lower than postmaster's: -500 vs. -100
$ perl -pi -e 's/linux_oom_score_adj = .*/linux_oom_score_adj = -500...
linux_oom_score_adj = -500

$ pgstop; pgstart
...

# Backends are capped to parent's badness.
$ for p in $(echo $(pgserverPIDList)| cut -d , ...
716 -100 /home/gurjeet/dev/pgdbuilds/oom_guc/db/bin/postgres-D/home/gurjeet/dev/pgdbuilds/oom_guc/db/data
718 -100 postgres: checkpointer process
719 -100 postgres: writer process
720 -100 postgres: wal writer process
721 -100 postgres: autovacuum launcher process
722 -100 postgres: stats collector process

# Set postmaster badness: -100
$ sudo echo -100 > /proc/$$/oom_score_adj ; cat /proc/self/oom_score_adj
-100

# Set backend badness higher than postmaster's: +500 vs. -100
$ perl -pi -e 's/linux_oom_score_adj = .*/linux_oom_score_adj = 500/' $PGDATA/postgresql.conf ; grep oom $PGDATA/postgresql.conf
linux_oom_score_adj = 500

$ pgstop; pgstart
...

# Backends have higher badness, hence a higher likelihood of being killed than the postmaster.
$ for p in $(echo $(pgserverPIDList)| cut -d , ...
1602 -100 /home/gurjeet/dev/pgdbuilds/oom_guc/db/bin/postgres-D/home/gurjeet/dev/pgdbuilds/oom_guc/db/data
1604 500 postgres: checkpointer process
1605 500 postgres: writer process
1606 500 postgres: wal writer process
1607 500 postgres: autovacuum launcher process
1608 500 postgres: stats collector process

# Set postmaster badness: -100
$ sudo echo -100 > /proc/$$/oom_score_adj ; cat /proc/self/oom_score_adj
-100

# Reset backend badness to default: 0
$ perl -pi -e 's/linux_oom_score_adj = .*/linux_oom_score_adj = 0/' $PGDATA/postgresql.conf ; grep oom $PGDATA/postgresql.conf
linux_oom_score_adj = 0

$ pgstop; pgstart
...

# Backends have higher badness, hence a higher likelihood of being killed than the postmaster.
$ for p in $(echo $(pgserverPIDList)| cut -d , ...
1835 -100 /home/gurjeet/dev/pgdbuilds/oom_guc/db/bin/postgres-D/home/gurjeet/dev/pgdbuilds/oom_guc/db/data
1837 0 postgres: checkpointer process
1838 0 postgres: writer process
1839 0 postgres: wal writer process
1840 0 postgres: autovacuum launcher process
1841 0 postgres: stats collector process

# Lower checkpointer's badness, but keep it higher than postmaster's
$ echo -1 > /proc/1837/oom_score_adj

$ for p in $(echo $(pgserverPIDList)| cut -d , ...
1835 -100 /home/gurjeet/dev/pgdbuilds/oom_guc/db/bin/postgres-D/home/gurjeet/dev/pgdbuilds/oom_guc/db/data
1837 -1 postgres: checkpointer process
...

# Lower checkpointer's badness, but keep it the same as postmaster's
$ echo -100 > /proc/1837/oom_score_adj

$ for p in $(echo $(pgserverPIDList)| cut -d , ...
1835 -100 /home/gurjeet/dev/pgdbuilds/oom_guc/db/bin/postgres-D/home/gurjeet/dev/pgdbuilds/oom_guc/db/data
1837 -100 postgres: checkpointer process
...

# Lower checkpointer's badness, and try to lower it below postmaster's
$ echo -101 > /proc/1837/oom_score_adj
bash: echo: write error: Permission denied

$ for p in $(echo $(pgserverPIDList)| cut -d , ...
1835 -100 /home/gurjeet/dev/pgdbuilds/oom_guc/db/bin/postgres-D/home/gurjeet/dev/pgdbuilds/oom_guc/db/data
1837 -100 postgres: checkpointer process
...
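
The same kind of probe can be scripted. This is a sketch only: try_set_adj is a hypothetical helper, and the outcome depends on the privileges of whoever runs it (root, or a process with CAP_SYS_RESOURCE, will not be refused). The example call just re-writes the current value, so it does not change anything.

```bash
# try_set_adj is a hypothetical helper: attempt to write VALUE to a
# process's oom_score_adj and report whether the kernel took it.
try_set_adj() {
    pid=$1
    value=$2
    # The braces let us silence both a failed open and a failed write.
    if { printf '%s\n' "$value" > "/proc/$pid/oom_score_adj"; } 2>/dev/null; then
        echo accepted
    else
        echo refused
    fi
}

# Harmless example: write back the value the shell already has.
current=$(cat "/proc/$$/oom_score_adj")
try_set_adj "$$" "$current"
```

Calling it with a value below the one set by the start script is the case the "Permission denied" line above corresponds to.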

# As root, force child process's badness to be lower than postmaster's
$ sudo echo -101 > /proc/1837/oom_score_adj
[sudo] password for gurjeet:

$ for p in $(echo $(pgserverPIDList)| cut -d , ...
1835 -100 /home/gurjeet/dev/pgdbuilds/oom_guc/db/bin/postgres-D/home/gurjeet/dev/pgdbuilds/oom_guc/db/data
1837 -101 postgres: checkpointer process
...
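
A caveat for anyone replaying the "sudo echo ... > file" lines: in that form the redirection is opened by the invoking shell, not by the command sudo runs, so whether the write succeeds depends on that shell's own privileges. A common way to have root perform the write is to pipe through tee. The sketch below assumes sudo and tee are available; write_as_root is a hypothetical name and was not used in the session above.

```bash
# write_as_root is a hypothetical helper. The write is done by tee, which
# sudo runs as root, rather than by a redirection in the calling shell.
write_as_root() {
    path=$1
    value=$2
    printf '%s\n' "$value" | sudo tee "$path" > /dev/null
}

# Example (not executed here):
#   write_as_root /proc/1837/oom_score_adj -101
```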

-- 
Gurjeet Singh http://gurjeet.singh.im/

EDB www.EnterpriseDB.com


