Re: pgsql: Introduce pg_shmem_allocations_numa view - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: pgsql: Introduce pg_shmem_allocations_numa view |
Date | |
Msg-id | b7c96f9b-e347-4900-b861-457140754394@vondra.me |
In response to | Re: pgsql: Introduce pg_shmem_allocations_numa view (Andres Freund <andres@anarazel.de>) |
List | pgsql-hackers |
On 6/24/25 13:10, Andres Freund wrote:
> Hi,
>
> On 2025-06-24 03:43:19 +0200, Tomas Vondra wrote:
>> FWIW while looking into this, I tried running this under valgrind (on a
>> regular 64-bit system, not in the chroot), and I get this report:
>>
>> ==65065== Invalid read of size 8
>> ==65065==    at 0x113B0EBE: pg_buffercache_numa_pages (pg_buffercache_pages.c:380)
>> ==65065==    by 0x6B539D: ExecMakeTableFunctionResult (execSRF.c:234)
>> ==65065==    by 0x6CEB7E: FunctionNext (nodeFunctionscan.c:94)
>> ==65065==    by 0x6B6ACA: ExecScanFetch (execScan.h:126)
>> ==65065==    by 0x6B6B31: ExecScanExtended (execScan.h:170)
>> ==65065==    by 0x6B6C9D: ExecScan (execScan.c:59)
>> ==65065==    by 0x6CEF0F: ExecFunctionScan (nodeFunctionscan.c:269)
>> ==65065==    by 0x6B29FA: ExecProcNodeFirst (execProcnode.c:469)
>> ==65065==    by 0x6A6F56: ExecProcNode (executor.h:313)
>> ==65065==    by 0x6A9533: ExecutePlan (execMain.c:1679)
>> ==65065==    by 0x6A7422: standard_ExecutorRun (execMain.c:367)
>> ==65065==    by 0x6A7330: ExecutorRun (execMain.c:304)
>> ==65065==    by 0x934EF0: PortalRunSelect (pquery.c:921)
>> ==65065==    by 0x934BD8: PortalRun (pquery.c:765)
>> ==65065==    by 0x92E4CD: exec_simple_query (postgres.c:1273)
>> ==65065==    by 0x93301E: PostgresMain (postgres.c:4766)
>> ==65065==    by 0x92A88B: BackendMain (backend_startup.c:124)
>> ==65065==    by 0x85A7C7: postmaster_child_launch (launch_backend.c:290)
>> ==65065==    by 0x860111: BackendStartup (postmaster.c:3580)
>> ==65065==    by 0x85DE6F: ServerLoop (postmaster.c:1702)
>> ==65065==  Address 0x7b6c000 is in a rw- anonymous segment
>>
>>
>> This fails here (on the pg_numa_touch_mem_if_required call):
>>
>>     for (char *ptr = startptr; ptr < endptr; ptr += os_page_size)
>>     {
>>         os_page_ptrs[idx++] = ptr;
>>
>>         /* Only need to touch memory once per backend process */
>>         if (firstNumaTouch)
>>             pg_numa_touch_mem_if_required(touch, ptr);
>>     }
>
> That's because we mark unpinned pages as inaccessible / mark them as
> accessible when pinning. See logic related to that in PinBuffer():
>
>     /*
>      * Assume that we acquired a buffer pin for the purposes of
>      * Valgrind buffer client checks (even in !result case) to
>      * keep things simple.  Buffers that are unsafe to access are
>      * not generally guaranteed to be marked undefined or
>      * non-accessible in any case.
>      */
>
>
>> The 0x7b6c000 is the very first pointer, and it's the only pointer that
>> triggers this warning.
>
> I suspect that that's because valgrind combines different reports or such.
>

Thanks. It probably is something like that, although I made sure to not
use any such options when running valgrind (so --error-limit=no). But
maybe there's something else, hiding the reports.

I guess there are two ways to address this - make sure the buffers are
marked as accessible/defined, or add a valgrind suppression. I think the
suppression is the right approach here, otherwise we'd need to worry
about already pinned buffers etc. Which seems not great, the functions
don't even care about buffers right now, they mostly work with memory
pages (especially pg_shmem_allocations_numa).

Barring objections, I'll fix it this way.

regards

-- 
Tomas Vondra
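
For illustration only, a suppression matching the report quoted above might look roughly like the sketch below. This is not taken from the thread or from any committed fix. The entry name is made up, `Memcheck:Addr8` is chosen because the report is an invalid read of size 8, and the single `fun:` frame assumes the faulting access is attributed to pg_buffercache_numa_pages, as in the stack trace.

```
{
   hypothetical_pg_buffercache_numa_touch
   Memcheck:Addr8
   fun:pg_buffercache_numa_pages
}
```

An entry like this would go in a suppressions file passed to valgrind with `--suppressions=<file>`. The function behind pg_shmem_allocations_numa would presumably need a similar entry if it touches memory the same way.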