Re: Draft for basic NUMA observability - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: Draft for basic NUMA observability |
Date | |
Msg-id | d68f3d67-0d62-449c-8762-18be447629bb@vondra.me Whole thread Raw |
In response to | Re: Draft for basic NUMA observability (Bertrand Drouvot <bertranddrouvot.pg@gmail.com>) |
List | pgsql-hackers |
On 4/4/25 08:50, Bertrand Drouvot wrote: > Hi, > > On Thu, Apr 03, 2025 at 08:53:57PM +0200, Tomas Vondra wrote: >> On 4/3/25 15:12, Jakub Wartak wrote: >>> On Thu, Apr 3, 2025 at 1:52 PM Tomas Vondra <tomas@vondra.me> wrote: >>> >>>> ... >>>> >>>> So unless someone can demonstrate a use case where this would matter, >>>> I'd not worry about it too much. >>> >>> OK, fine for me - just 3 cols for pg_buffercache_numa is fine for me, >>> it's just that I don't have cycles left today and probably lack skills >>> (i've never dealt with arrays so far) thus it would be slow to get it >>> right... but I can pick up anything tomorrow morning. >>> >> >> OK, I took a stab at reworking/simplifying this the way I proposed. >> Here's v24 - needs more polishing, but hopefully enough to show what I >> had in mind. >> >> It does these changes: >> >> 1) Drops 0002 with the pg_buffercache refactoring, because the new view >> is not "extending" the existing one. > > I think that makes sense. One would just need to join on the pg_buffercache > view to get more information about the buffer if needed. > > The pg_buffercache_numa_pages() doc needs an update though as I don't think that > "+ The <function>pg_buffercache_numa_pages()</function> provides the same > information as <function>pg_buffercache_pages()</function>" is still true. > Right, thanks for checking the docs. >> 2) Reworks pg_buffercache_num to return just three columns, bufferid, >> page_num and node_id. page_num is a sequence starting from 0 for each >> buffer. > > +1 on the idea > >> 3) It now builds an array of records, with one record per buffer/page. >> >> 4) I realized we don't really need to worry about buffers_per_page very >> much, except for logging/debugging. There's always "at least one page" >> per buffer, even if an incomplete one, so we can do this: >> >> os_page_count = NBuffers * Max(1, pages_per_buffer); >> >> and then >> >> for (i = 0; i < NBuffers; i++) >> { >> for (j = 0; j < Max(1, pages_per_buffer); j++) > > That's a nice simplification as we always need to take care of at least one page > per buffer. > OK. I think I'll consider moving some of this code "building" the entries into a separate function, to keep the main function easier to understand. >> and everything just works fine, I think. > > I think the same. > >> Opinions? I personally find this much cleaner / easier to understand. > > I agree that's easier to understand and that that looks correct. > > A few random comments: > > === 1 > > It looks like that firstNumaTouch is not set to false anymore. > Damn, my mistake. > === 2 > > + pfree(os_page_status); > + pfree(os_page_ptrs); > > Not sure that's needed, we should be in a short-lived memory context here > (ExprContext or such). > Yeah, maybe. It's not allocated in the multi-call context, but I wasn't sure. Will check. > === 3 > > + ro_volatile_var = *(uint64 *)ptr > > space missing before "ptr"? > Interesting the pgindent didn't tweak this. regards -- Tomas Vondra
pgsql-hackers by date: