Re: [Bus error] huge_pages default value (try) not fall back - Mailing list pgsql-bugs

From: Odin Ugedal
Subject: Re: [Bus error] huge_pages default value (try) not fall back
Msg-id: CAFpoUr1ggmGs8qpoKvYxNBO3h-T-n+MNh+JnLRYsYhHurVOiGQ@mail.gmail.com
In response to: RE: [Bus error] huge_pages default value (try) not fall back (Fan Liu <fan.liu@ericsson.com>)
Responses: RE: [Bus error] huge_pages default value (try) not fall back (Fan Liu <fan.liu@ericsson.com>)
List: pgsql-bugs

Hi,

I stumbled upon this issue while working on the related Kubernetes issue
that was referenced a few mails back in this thread. From what I
understand, this issue is (or may be) a result of how the hugetlb
cgroup enforces its "limit_in_bytes" limit for huge pages. In theory, a
process should not crash like this under normal circumstances when
using memory obtained from a successful mmap. The value set in
"limit_in_bytes" is only enforced during page allocation, and _not_
when mapping pages with mmap. As a result, an mmap of n huge pages
succeeds as long as the system has n free huge pages, even if the
requested size is bigger than "limit_in_bytes". The process then
reserves that huge page memory, making it inaccessible to other
processes.
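
To make the map-time vs. fault-time distinction concrete, here is a
minimal standalone C sketch (not PostgreSQL code; the mapping size and
the 2 MB huge page size are just example assumptions) that reproduces
the behaviour when run inside a hugetlb cgroup whose "limit_in_bytes"
is smaller than the mapping while the system still has enough free
huge pages:

/* Hypothetical standalone demo (not PostgreSQL code). Run inside a
 * hugetlb cgroup with "limit_in_bytes" smaller than the mapping while
 * the system still has enough free huge pages. */
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

int
main(void)
{
    size_t      size = 64UL * 1024 * 1024;  /* 32 x 2 MB huge pages */
    void       *p;

    /* Succeeds as long as enough free huge pages exist on the system;
     * the cgroup "limit_in_bytes" is not consulted here. */
    p = mmap(NULL, size, PROT_READ | PROT_WRITE,
             MAP_SHARED | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    if (p == MAP_FAILED)
    {
        perror("mmap");
        return 1;
    }
    printf("mmap of %zu bytes of huge pages succeeded\n", size);

    /* The limit is only enforced when the pages are actually allocated,
     * i.e. on first touch: this is where the process is killed with
     * SIGBUS if the cgroup refuses the allocation. */
    memset(p, 0, size);
    printf("pages touched successfully\n");

    munmap(p, size);
    return 0;
}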

The real problem shows up when postgres tries to write to the memory it
received from mmap and the kernel has to actually allocate the reserved
huge pages. Since the cgroup does not allow the allocation, the process
is killed with a bus error (SIGBUS).
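
Because the failure only surfaces at fault time, one way to detect it
early would be to touch the freshly mapped pages under a SIGBUS handler
right after mmap, so the caller could fall back to normal pages instead
of crashing much later. This is purely a sketch of the idea, not
something postgres does today, and the 2 MB huge page size is an
assumption:

/* Sketch only: probe a huge-page mapping right after mmap() by touching
 * every page under a SIGBUS handler, so a cgroup refusal is detected at
 * startup instead of crashing later. Not PostgreSQL code. */
#define _GNU_SOURCE
#include <setjmp.h>
#include <signal.h>
#include <stdbool.h>
#include <string.h>
#include <sys/mman.h>

static sigjmp_buf probe_env;

static void
probe_sigbus(int signo)
{
    (void) signo;
    siglongjmp(probe_env, 1);
}

/* Returns true if every page in [base, base + size) could be allocated,
 * false if touching the region raised SIGBUS (e.g. because the hugetlb
 * cgroup limit was exceeded). page_size is the huge page size. */
static bool
probe_huge_pages(volatile char *base, size_t size, size_t page_size)
{
    struct sigaction sa, old_sa;
    bool        ok = true;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = probe_sigbus;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGBUS, &sa, &old_sa);

    if (sigsetjmp(probe_env, 1) == 0)
    {
        for (size_t off = 0; off < size; off += page_size)
            base[off] = 0;      /* first touch allocates the huge page */
    }
    else
        ok = false;             /* SIGBUS: the allocation was refused */

    sigaction(SIGBUS, &old_sa, NULL);
    return ok;
}

int
main(void)
{
    size_t      size = 64UL * 1024 * 1024;
    size_t      huge_page_size = 2UL * 1024 * 1024;
    char       *p;

    p = mmap(NULL, size, PROT_READ | PROT_WRITE,
             MAP_SHARED | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    if (p == MAP_FAILED)
        return 1;               /* huge_pages=try already handles this case */

    if (!probe_huge_pages(p, size, huge_page_size))
    {
        /* this is the case that currently crashes: the mapping exists
         * but cannot be backed; a server could munmap() here and retry
         * with normal pages */
        munmap(p, size);
        return 2;
    }

    munmap(p, size);
    return 0;
}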

This issue has been fixed in Linux by this patch:
https://lkml.org/lkml/2020/2/3/1153, which adds a new control to the
hugetlb cgroup that addresses the problem. However, no container
runtimes use it yet, and only 5.7+ kernels support it (afaik); progress
can be tracked here:
https://github.com/opencontainers/runtime-spec/issues/1050. The fix
for the upstream Kubernetes issue
(https://github.com/opencontainers/runtime-spec/issues/1050), which
made kubernetes set the wrong value for the top-level "limit_in_bytes"
when the pre-allocated page count increased after kubernetes (kubelet)
startup, will hopefully land in Kubernetes 1.19 (or 1.20). Fingers crossed!
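
For completeness, a tiny sketch for checking whether a kernel/cgroup
setup already exposes the new reservation-time controls; the file name
follows the hugetlb.<size>.rsvd.* pattern described in the patch series
above, and the cgroup v1 mount path is only an example that depends on
the container setup:

/* Sketch: detect whether the hugetlb cgroup exposes the reservation-time
 * controls added by the patch above. The file name follows the patch
 * series (hugetlb.<size>.rsvd.limit_in_bytes); the mount path is only an
 * example and depends on the cgroup/container setup. */
#include <stdio.h>
#include <unistd.h>

int
main(void)
{
    const char *rsvd_limit =
        "/sys/fs/cgroup/hugetlb/hugetlb.2MB.rsvd.limit_in_bytes";

    if (access(rsvd_limit, F_OK) == 0)
        printf("reservation-time enforcement available: %s\n", rsvd_limit);
    else
        printf("no rsvd controls found; the limit is only enforced at "
               "fault time\n");
    return 0;
}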

Hopefully this makes some sense, and gives some insights into the issue...

Best regards,
Odin Ugedal


