Here are three small patches that should handle the issue.
0001 - Adds batching to pg_numa_query_pages, so that the callers
don't need to do anything.
The batching doesn't seem to cause any performance regression. 32-bit
systems can't use that much memory anyway, and on 64-bit systems the
batch is sufficiently large (1024).
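To illustrate the idea (this is a standalone sketch, not the patch
itself): the query is split into chunks of at most 1024 pages, and the
caller still makes a single call. QUERY_CHUNK_SIZE, query_fn,
query_pages_batched and fake_query are names made up for this example;
the real code calls into the kernel instead of a callback.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed chunk size; 1024 is the value mentioned for 64-bit systems. */
#define QUERY_CHUNK_SIZE 1024

/* Hypothetical callback standing in for the kernel query. */
typedef int (*query_fn) (size_t count, void **pages, int *status);

/*
 * Query "count" pages in chunks of at most QUERY_CHUNK_SIZE, so the
 * caller does not have to batch. Returns 0 on success, or the first
 * non-zero result reported by the callback.
 */
static int
query_pages_batched(size_t count, void **pages, int *status, query_fn query)
{
	size_t		next = 0;

	assert(query != NULL);

	while (next < count)
	{
		size_t		chunk = count - next;
		int			rc;

		if (chunk > QUERY_CHUNK_SIZE)
			chunk = QUERY_CHUNK_SIZE;

		rc = query(chunk, &pages[next], &status[next]);
		if (rc != 0)
			return rc;

		next += chunk;
	}

	return 0;
}

/* Fake query for demonstration: counts calls, reports node 7 everywhere. */
static int	fake_calls = 0;

static int
fake_query(size_t count, void **pages, int *status)
{
	(void) pages;
	fake_calls++;
	for (size_t i = 0; i < count; i++)
		status[i] = 7;
	return 0;
}
```

With 2500 pages this makes three calls (1024 + 1024 + 452), and the
caller only sees one status array filled in.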
0002 - Silences the valgrind complaints about the memory touching. It
replaces the macro with a static inline function, and adds suppressions
for both 32-bit and 64-bit systems. The 32-bit one may be a bit
pointless, because on my rpi5 valgrind produces a bunch of other stuff
anyway. But it doesn't hurt.
The function now looks like this:
static inline void
pg_numa_touch_mem_if_required(void *ptr)
{
	volatile uint64 touch pg_attribute_unused();

	touch = *(volatile uint64 *) ptr;
}
I did a lot of testing on multiple systems to check that replacing the
macro with a static inline function still works - and it seems it does.
But if someone thinks the function won't work, I'd like to know.
0003 - While working on these patches, it occurred to me we could/should
add CHECK_FOR_INTERRUPTS() to the batch loop. This querying can take
quite a bit of time, so letting people interrupt it seems reasonable.
It wasn't possible with just one call into the kernel, but with the
batching we can add a CFI.
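A rough standalone sketch of why the batching makes this possible: the
loop gets a chance to notice a cancel request between chunks. The names
here (cancel_requested, process_chunk, process_all) are hypothetical,
and the flag check only stands in for CHECK_FOR_INTERRUPTS(), which in
the backend raises an error rather than returning a code.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>

#define CHUNK_SIZE 1024

/* Hypothetical stand-in for the backend's pending-interrupt state. */
static volatile sig_atomic_t cancel_requested = 0;

/* Number of items processed so far, for demonstration only. */
static size_t items_done = 0;

/* Hypothetical stand-in for one call into the kernel. */
static void
process_chunk(size_t count)
{
	/* each call covers at most one chunk */
	assert(count <= CHUNK_SIZE);

	items_done += count;

	/* pretend a cancel request arrives during the second chunk */
	if (items_done >= 2 * CHUNK_SIZE)
		cancel_requested = 1;
}

/* Returns 0 when everything was processed, -1 when interrupted. */
static int
process_all(size_t count)
{
	size_t		next = 0;

	while (next < count)
	{
		size_t		chunk = count - next;

		if (chunk > CHUNK_SIZE)
			chunk = CHUNK_SIZE;

		/* stand-in for CHECK_FOR_INTERRUPTS(): checked once per chunk */
		if (cancel_requested)
			return -1;

		process_chunk(chunk);
		next += chunk;
	}

	return 0;
}
```

With a single call covering all items there is no point at which the
check could run; with chunks, the work stops at the next chunk boundary
after the request arrives.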
Please, take a look.
regards
--
Tomas Vondra