Re: To Tomas Vondra
> This is acting up on Debian's 32-bit architectures, namely i386, armel
> and armhf:
... and x32 (x86_64 instruction set with 32-bit pointers).
> SELECT COUNT(*) >= 0 AS ok FROM pg_shmem_allocations_numa;
> +ERROR: invalid NUMA node id outside of allowed range [0, 0]: -14
>
> -14 seems to be -EFAULT, and move_pages(2) says:
> -EFAULT
> This is a zero page or the memory area is not mapped by the process.
I did some debugging on i386 and made it print the page numbers:
SELECT COUNT(*) >= 0 AS ok FROM pg_shmem_allocations_numa;
+WARNING: invalid NUMA node id outside of allowed range [0, 0]: -14 for page 35
+WARNING: invalid NUMA node id outside of allowed range [0, 0]: -14 for page 36
...
+WARNING: invalid NUMA node id outside of allowed range [0, 0]: -14 for page 32768
+WARNING: invalid NUMA node id outside of allowed range [0, 0]: -14 for page 32769
So it works for the first few pages (the warnings start at page 35),
and the rest come back as -EFAULT.
I think the pg_numa_touch_mem_if_required() hack might not be enough
to force the pages to be allocated. Changing that to a memcpy() didn't
help. Is there some optimization where zero pages aren't allocated
until they are written to?
Why do we try to force the pages to be allocated at all? This is just
a monitoring function; it should not change the actual system state.
Why not just skip any page where the status is < 0?
The attached patch removes that logic. Regression tests pass, but we
probably have to think about whether to report these negative numbers
as-is or perhaps convert them to NULL.
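
To make the NULL option concrete, here is a rough standalone sketch (not
the attached patch) of how a status array as filled in by move_pages(2)
could be post-processed without raising an error. PageNode and
classify_page_status() are hypothetical names made up for this
illustration; nothing here is existing PostgreSQL code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result slot: a NUMA node id, or "unknown" (SQL NULL). */
typedef struct PageNode
{
	bool		isnull;
	int			node;
} PageNode;

/*
 * Translate move_pages(2)-style status values into per-page results.
 *
 * Non-negative values are NUMA node ids.  Negative values are -errno
 * (e.g. -14 is -EFAULT) and are reported as NULL instead of an ERROR.
 * Returns the number of pages that had a valid node id.
 */
static size_t
classify_page_status(const int *status, size_t npages, PageNode *out)
{
	size_t		nvalid = 0;

	assert(status != NULL && out != NULL);

	for (size_t i = 0; i < npages; i++)
	{
		if (status[i] < 0)
		{
			/* page not mapped/allocated yet: leave it alone */
			out[i].isnull = true;
			out[i].node = -1;
		}
		else
		{
			out[i].isnull = false;
			out[i].node = status[i];
			nvalid++;
		}
	}

	return nvalid;
}
```

Reporting the raw negative value instead would just mean copying
status[i] into the node field unchanged; the point either way is that
the function only observes the pages and never touches them.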
Christoph