Auto-tune shared_buffers to use available huge pages - Mailing list pgsql-hackers

From Anthonin Bonnefoy
Subject Auto-tune shared_buffers to use available huge pages
Msg-id CAO6_Xqq6w5hTY_W+gJWp29t15NRtNLSTzD6khDC=Xy2P0BWPTQ@mail.gmail.com
List pgsql-hackers
Hi,

In a typical environment, the number of huge pages on the instance can
be adjusted to the value reported by shared_memory_size_in_huge_pages.
Postgres can then be started, and the requested shared memory fits in
the available huge pages.
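
To make the arithmetic concrete, here is a rough sketch of how a shared
memory request maps to a number of huge pages: the request is rounded
up to whole pages. The helper name and the sample sizes are made up for
illustration and are not taken from Postgres.

```python
def huge_pages_needed(shared_memory_bytes: int, huge_page_bytes: int) -> int:
    """Round a shared memory request up to a whole number of huge pages."""
    if huge_page_bytes <= 0:
        raise ValueError("huge page size must be positive")
    # Ceiling division without going through floating point.
    return -(-shared_memory_bytes // huge_page_bytes)


# Hypothetical sizes: a request of 1 GiB plus one byte, with 2 MiB huge
# pages, spills into one extra page.
pages = huge_pages_needed(1024**3 + 1, 2 * 1024**2)
```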

A similar approach is harder to implement in environments like
Kubernetes. If I want to modify the huge pages on a pod, I need to:
- Modify the host's huge pages
- Restart the host's kubelet so it detects the new amount of huge pages
- Modify the pod's huge page request

Most of those steps are far from practical. An alternative would be to
reserve a fixed number of huge pages (for example, 25% of the node's
memory) and adjust the configuration, such as shared_buffers, to fit.
However, fitting the configuration into a fixed amount of memory is
tricky:
- shared_buffers is used to auto-tune multiple parameters, so there's
no easy formula to get the correct amount. The only way I've found is
to keep increasing shared_buffers until
shared_memory_size_in_huge_pages matches the desired number of huge
pages
- changing other parameters like max_connections means shared_buffers
has to be adjusted again
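
The manual search described above can be sketched as a binary search.
This is only an illustration: probe() is a hypothetical callable that
returns the huge pages required for a given shared_buffers (in
practice it would wrap something like
"postgres -C shared_memory_size_in_huge_pages" with the candidate
setting), and toy_probe is a made-up stand-in model, not real server
behaviour. The sketch assumes the requirement never decreases as
shared_buffers grows.

```python
from typing import Callable


def largest_shared_buffers(
    probe: Callable[[int], int],
    target_huge_pages: int,
    lo: int,
    hi: int,
) -> int:
    """Return the largest shared_buffers (in blocks) within [lo, hi]
    whose probed huge page requirement stays within target_huge_pages.
    """
    if probe(lo) > target_huge_pages:
        raise ValueError("even the smallest setting does not fit")
    while lo < hi:
        mid = (lo + hi + 1) // 2  # round up so the loop always makes progress
        if probe(mid) <= target_huge_pages:
            lo = mid
        else:
            hi = mid - 1
    return lo


def toy_probe(nbuffers: int) -> int:
    # Stand-in model (assumption): 8 kB per buffer plus 64 MiB of other
    # shared memory, rounded up to 2 MiB huge pages.
    total = nbuffers * 8192 + 64 * 1024**2
    return -(-total // (2 * 1024**2))


best = largest_shared_buffers(toy_probe, target_huge_pages=1024, lo=16, hi=1_000_000)
```

The search has to be repeated whenever another parameter that affects
shared memory, such as max_connections, changes.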

To help with that, the attached patch provides a new option,
huge_pages_autotune_buffers, to automatically use leftover huge pages
as shared_buffers. This requires some changes in the auto-tune logic:
- Subsystems that use shared_buffers for auto-tuning will rely on the
configured shared_buffers, not the auto-tuned shared_buffers, and they
should save their auto-tuned value in a GUC. This will be done in
dedicated auto-tune functions.
- Once the auto-tune functions are called, modifying NBuffers won't
change the requested memory except for the shared buffer pool in
BufferManagerShmemSize
- We can get the leftover memory (free huge pages - requested memory),
and estimate how many shared buffers we can add
- Increasing shared_buffers will also increase the freelist hashmap,
so the auto-tuned shared_buffers needs to be reduced to leave room
for it
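
As a back-of-the-envelope illustration of the leftover computation in
the list above (not the patch's code): the function name, the
per_buffer_overhead parameter and the sample numbers are all
assumptions. The overhead parameter stands in for whatever grows
alongside each extra buffer.

```python
BLCKSZ = 8192  # default Postgres block size, in bytes


def extra_buffers(
    free_huge_pages: int,
    huge_page_bytes: int,
    requested_bytes: int,
    per_buffer_overhead: int = 0,
) -> int:
    """Estimate how many buffers fit in the memory left over once the
    already requested shared memory is taken out of the free huge pages.
    """
    leftover = free_huge_pages * huge_page_bytes - requested_bytes
    if leftover <= 0:
        return 0
    # Each extra buffer costs one block plus its assumed bookkeeping overhead.
    return leftover // (BLCKSZ + per_buffer_overhead)


# Hypothetical numbers: 1024 free 2 MiB pages, 256 MiB already requested.
naive = extra_buffers(1024, 2 * 1024**2, 256 * 1024**2)
adjusted = extra_buffers(1024, 2 * 1024**2, 256 * 1024**2, per_buffer_overhead=100)
```

The adjusted estimate is smaller than the naive one, which is why the
auto-tuned value has to be reduced.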

The patch is split into the following sub-patches:

0001: Extract the current auto-tune logic into dedicated functions,
making the behaviour more consistent across subsystems.
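
As an example of what a dedicated auto-tune function looks like in
spirit, here is a small sketch modelled on the documented wal_buffers
rule (1/32 of shared_buffers, at least 64 kB and at most one WAL
segment). The function name and constants are illustrative, assume the
default 8 kB block size and 16 MB segments, and are not taken from the
patch.

```python
XLOG_BLCKSZ = 8192                 # assumed default WAL block size
WAL_SEGMENT_SIZE = 16 * 1024**2    # assumed default WAL segment size


def autotune_wal_buffers(configured_nbuffers: int) -> int:
    """Derive wal_buffers (in WAL blocks) from the configured
    shared_buffers (in blocks), not from an auto-tuned value.
    """
    xbuffers = configured_nbuffers // 32
    # Never more than one WAL segment.
    xbuffers = min(xbuffers, WAL_SEGMENT_SIZE // XLOG_BLCKSZ)
    # Never less than 8 blocks (64 kB with 8 kB blocks).
    return max(xbuffers, 8)


# shared_buffers = 128MB is 16384 blocks of 8 kB, giving 512 WAL blocks (4 MB).
wal_blocks = autotune_wal_buffers(16384)
```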

0002: The checkpointer auto-tunes the request size using NBuffers, but
doesn't save the result in a GUC. This adds a new
checkpoint_request_size GUC with the same auto-tune logic.

0003: Extract HugePages_Free value when /proc/meminfo is parsed in
GetHugePageSize.
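
For readers who want to check the same values by hand, here is a small
sketch that pulls the huge page fields out of /proc/meminfo-style text.
It is not the C code from the patch; the function name and the sample
text are made up, only the field names follow the kernel's format.

```python
def parse_huge_page_info(meminfo_text: str) -> dict:
    """Extract huge page counters from /proc/meminfo-style text.

    Sizes reported in kB are converted to bytes; counters are kept as-is.
    """
    wanted = {"HugePages_Total", "HugePages_Free", "Hugepagesize"}
    info = {}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep or key.strip() not in wanted:
            continue
        fields = rest.split()
        if not fields:
            continue
        value = int(fields[0])
        if len(fields) > 1 and fields[1] == "kB":
            value *= 1024
        info[key.strip()] = value
    return info


# Made-up sample in the same layout as /proc/meminfo.
SAMPLE = """\
MemTotal:       16384000 kB
HugePages_Total:    1024
HugePages_Free:      900
Hugepagesize:       2048 kB
"""

info = parse_huge_page_info(SAMPLE)
free_bytes = info["HugePages_Free"] * info["Hugepagesize"]
```

On a real host the text would come from reading /proc/meminfo.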

0004: Pass NBuffers as a parameter to StrategyShmemSize. This is
necessary to compute how much additional memory will be used by the
freelist, using
'StrategyShmemSize(candidate_nbuffers) - StrategyShmemSize(NBuffers)'.
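
To illustrate how that difference can be used, here is a sketch that
starts from an optimistic candidate and shrinks it until the buffer
pool growth plus the size difference fits in the leftover memory. It is
a simplified model, not the patch's algorithm: strategy_size is a
hypothetical stand-in for StrategyShmemSize, toy_strategy_size is
invented, and each buffer is assumed to cost exactly one 8 kB block.
The result is a candidate that fits, not necessarily the largest one.

```python
from typing import Callable

BLCKSZ = 8192  # default Postgres block size, in bytes


def fit_candidate(
    nbuffers: int,
    leftover_bytes: int,
    strategy_size: Callable[[int], int],
) -> int:
    """Return a buffer count >= nbuffers whose extra memory (blocks plus
    growth of strategy_size) does not exceed leftover_bytes.
    """
    # Optimistic start: pretend the size difference is zero.
    candidate = nbuffers + max(0, leftover_bytes) // BLCKSZ
    while candidate > nbuffers:
        growth = (candidate - nbuffers) * BLCKSZ
        overhead = strategy_size(candidate) - strategy_size(nbuffers)
        excess = growth + overhead - leftover_bytes
        if excess <= 0:
            break
        # Give back enough buffers to cover the excess (at least one).
        candidate -= max(1, -(-excess // BLCKSZ))
    return max(candidate, nbuffers)


def toy_strategy_size(n: int) -> int:
    # Invented model: 48 bytes per entry plus some fixed slack.
    return 48 * (n + 128)


candidate = fit_candidate(16384, 1024**3, toy_strategy_size)
```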

0005: Add BufferManagerAutotune to auto-tune the amount of shared_buffers.

Regards,
Anthonin Bonnefoy
