Re: Adding basic NUMA awareness - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Adding basic NUMA awareness
Msg-id fqnplzfzkxfdgrfrpg2ckaz22jyidahqczhk36ptf2stdzhafw@lid5xloaylxr
In response to Re: Adding basic NUMA awareness  (Jakub Wartak <jakub.wartak@enterprisedb.com>)
Responses Re: Adding basic NUMA awareness
List pgsql-hackers
Hi,

On 2025-07-04 13:05:05 +0200, Jakub Wartak wrote:
> On Tue, Jul 1, 2025 at 9:07 PM Tomas Vondra <tomas@vondra.me> wrote:
> > I do think the splitting would actually make some things simpler, or
> > maybe more flexible - in particular, it'd allow us to enable huge pages
> > only for some regions (like shared buffers), and keep the small pages
> > e.g. for PGPROC. So that'd be good.
> 
> You have made the assumption that this is good, but small pages (4KB) are
> not hugetlb, and are *swappable* (Transparent HP are swappable too,
> manually allocated ones as with mmap(MAP_HUGETLB) are not)[1]. The
> most frequent problem I see these days is OOMs, and it makes me
> believe that making certain critical parts of shared memory
> swappable just to make the page size granular is possibly throwing the baby
> out with the bathwater. I'm thinking about bad situations like: some
> wrong settings of vm.swappiness that people keep (or distros keep?) and
> the general inability of PG to refrain from allocating more memory in
> some cases.

The reason it would be advantageous to put something like the procarray onto
smaller pages is that otherwise the entire procarray (unless particularly
large) ends up on a single NUMA node, increasing the latency for backends on
every other NUMA node and increasing memory traffic on that node.
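
To make the granularity argument concrete, here is a rough back-of-the-envelope
sketch. It relies only on the fact that a page resides on exactly one NUMA
node, so a structure can be spread over at most as many nodes as it has pages.
The entry size, entry count and node count are made-up illustrative numbers,
not the real sizes of PGPROC or the procarray, and the helper names are
hypothetical.

```python
# Illustrative arithmetic only: how page size bounds NUMA interleaving.
# All sizes below are hypothetical, not taken from PostgreSQL.

def pages_spanned(size_bytes: int, page_size: int) -> int:
    """Pages needed to hold size_bytes, assuming a page-aligned start."""
    return (size_bytes + page_size - 1) // page_size


def max_nodes_usable(size_bytes: int, page_size: int, numa_nodes: int) -> int:
    """Upper bound on the nodes a structure can be spread across.

    Each page lives on exactly one node, so the bound is one node per page,
    capped by the number of nodes in the machine.
    """
    return min(pages_spanned(size_bytes, page_size), numa_nodes)


ENTRY_SIZE = 1024          # hypothetical per-backend entry size in bytes
N_ENTRIES = 1000           # hypothetical number of entries
NUMA_NODES = 4             # hypothetical machine
ARRAY_SIZE = ENTRY_SIZE * N_ENTRIES

SMALL_PAGE = 4 * 1024            # 4 KiB
HUGE_PAGE = 2 * 1024 * 1024      # 2 MiB

if __name__ == "__main__":
    for label, page_size in (("4 KiB", SMALL_PAGE), ("2 MiB", HUGE_PAGE)):
        print(
            f"{label} pages: {pages_spanned(ARRAY_SIZE, page_size)} page(s), "
            f"at most {max_nodes_usable(ARRAY_SIZE, page_size, NUMA_NODES)} node(s)"
        )
```

With these made-up numbers the array occupies 250 small pages, so every one of
the 4 nodes can hold a share, whereas it fits inside a single 2 MiB huge page
and therefore lands entirely on one node.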

Greetings,

Andres Freund


