Re: pgsql: Introduce pg_shmem_allocations_numa view - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: pgsql: Introduce pg_shmem_allocations_numa view |
Date | |
Msg-id | d0949d7e-dcf2-4650-8a6e-027eb9e17837@vondra.me |
In response to | Re: pgsql: Introduce pg_shmem_allocations_numa view (Jakub Wartak <jakub.wartak@enterprisedb.com>) |
List | pgsql-hackers |
On 6/25/25 09:15, Jakub Wartak wrote:
> On Tue, Jun 24, 2025 at 5:30 PM Christoph Berg <myon@debian.org> wrote:
>>
>> Re: Tomas Vondra
>>> If it's a reliable fix, then I guess we can do it like this. But won't
>>> that be a performance penalty on everyone? Or does the system split the
>>> array into 16-element chunks anyway, so this makes no difference?
>>
>> There's still the overhead of the syscall itself. But no idea how
>> costly it is to have this 16-step loop in user or kernel space.
>>
>> We could claim that on 32-bit systems, shared_buffers would be smaller
>> anyway, so there the overhead isn't that big. And the step size should
>> be larger (if at all) on 64-bit.
>>
>>> Anyway, maybe we should start by reporting this to the kernel people. Do
>>> you want me to do that, or shall one of you take care of that? I suppose
>>> that'd be better, as you already wrote a fix / know the code better.
>>
>> Submitted: https://marc.info/?l=linux-mm&m=175077821909222&w=2
>>
>
> Hi all, I'm quite late to the party (just noticed the thread), but
> here's some addition context: it technically didn't make any sense to
> me to have NUMA on 32-bit due too small amount of addressable memory
> (after all, NUMA is about big iron, probably not even VMs), so in the
> first versions of the patchset I've excluded 32-bit (and back then for
> some reason I couldn't even find libnuma i386, but Andres pointed to
> me that it exists, so we re-added it probably just to stay
> consistent). The thread has kind of snowballed since then, but I still
> believe that NUMA on 32-bit does not make a lot of sense.
>
> Even assuming future shm interleaving one day in future version,
> allocation of small s_b sizes will usually fit a single NUMA node.
>

Not sure. I thought NUMA doesn't matter very much on 32-bit systems too,
exactly because those systems tend to use small amounts of memory.

But then while investigating this issue I realized even rpi5 has NUMA,
in fact it has a whopping 8 nodes:

debian@raspberry-32:~ $ numactl --hardware
available: 8 nodes (0-7)
node 0 cpus: 0 1 2 3
node 0 size: 981 MB
node 0 free: 882 MB
node 1 cpus: 0 1 2 3
node 1 size: 1007 MB
node 1 free: 936 MB
node 2 cpus: 0 1 2 3
node 2 size: 1007 MB
node 2 free: 936 MB
node 3 cpus: 0 1 2 3
node 3 size: 943 MB
node 3 free: 873 MB
node 4 cpus: 0 1 2 3
node 4 size: 1007 MB
node 4 free: 936 MB
node 5 cpus: 0 1 2 3
node 5 size: 1007 MB
node 5 free: 935 MB
node 6 cpus: 0 1 2 3
node 6 size: 1007 MB
node 6 free: 936 MB
node 7 cpus: 0 1 2 3
node 7 size: 990 MB
node 7 free: 918 MB
node distances:
node   0   1   2   3   4   5   6   7
  0:  10  10  10  10  10  10  10  10
  1:  10  10  10  10  10  10  10  10
  2:  10  10  10  10  10  10  10  10
  3:  10  10  10  10  10  10  10  10
  4:  10  10  10  10  10  10  10  10
  5:  10  10  10  10  10  10  10  10
  6:  10  10  10  10  10  10  10  10
  7:  10  10  10  10  10  10  10  10

This is with the 32-bit system (which AFAICS means 64-bit kernel and
32-bit user space).

I'm not saying it's a particularly interesting NUMA system, considering
all the costs are 10, and it's not like it's critical to get the best
performance on rpi5. But it's NUMA, and maybe there are some other (more
practical) systems. I find it interesting mostly for testing purposes.

regards

--
Tomas Vondra
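
[Editor's sketch, not part of the original message.] The observation that
"all the costs are 10" can be checked mechanically. Below is a minimal
Python sketch, assuming the distance table has the layout shown in the
numactl output above. The names parse_distances, is_uniform and SAMPLE are
hypothetical helpers invented for this illustration; they are not part of
numactl, libnuma or PostgreSQL.

```python
import re

# Distance section copied from the numactl output in the message above.
SAMPLE = """\
node distances:
node   0   1   2   3   4   5   6   7
  0:  10  10  10  10  10  10  10  10
  1:  10  10  10  10  10  10  10  10
  2:  10  10  10  10  10  10  10  10
  3:  10  10  10  10  10  10  10  10
  4:  10  10  10  10  10  10  10  10
  5:  10  10  10  10  10  10  10  10
  6:  10  10  10  10  10  10  10  10
  7:  10  10  10  10  10  10  10  10
"""


def parse_distances(text):
    """Return {from_node: {to_node: distance}} from `numactl --hardware` text.

    Assumes the layout shown above: a "node distances:" line, a header
    line listing the node ids, then one "N: d d d ..." row per node.
    """
    lines = [ln.strip() for ln in text.splitlines()]
    start = lines.index("node distances:")
    # Header looks like "node 0 1 2 ..."; drop the leading word "node".
    columns = [int(tok) for tok in lines[start + 1].split()[1:]]
    matrix = {}
    for ln in lines[start + 2:]:
        m = re.match(r"(\d+):\s+(.*)", ln)
        if not m:
            break  # end of the distance table
        row = [int(tok) for tok in m.group(2).split()]
        matrix[int(m.group(1))] = dict(zip(columns, row))
    return matrix


def is_uniform(matrix):
    """True if every node-to-node distance in the matrix is the same."""
    values = {d for row in matrix.values() for d in row.values()}
    return len(values) == 1


if __name__ == "__main__":
    distances = parse_distances(SAMPLE)
    print(len(distances), "nodes; uniform distances:", is_uniform(distances))
```

A uniform matrix like this one means the kernel reports no difference in
access cost between nodes, which matches the point above that the system
is interesting mostly for testing rather than for performance.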