Re: Draft for basic NUMA observability - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: Draft for basic NUMA observability |
Date | |
Msg-id | c8d8c6d6-fb8a-4303-a471-f4a54a023328@vondra.me Whole thread Raw |
In response to | Re: Draft for basic NUMA observability (Bertrand Drouvot <bertranddrouvot.pg@gmail.com>) |
Responses |
Re: Draft for basic NUMA observability
|
List | pgsql-hackers |
On 4/3/25 10:23, Bertrand Drouvot wrote: > Hi, > > On Thu, Apr 03, 2025 at 09:01:43AM +0200, Jakub Wartak wrote: >> On Wed, Apr 2, 2025 at 6:40 PM Tomas Vondra <tomas@vondra.me> wrote: >> >> Hi Tomas, >> >>> OK, so you agree the commit messages are complete / correct? >> >> Yes. > > Not 100% sure on my side. > > === v21-0002 > > Says: > > " > This introduces three new functions: > > - pg_buffercache_init_entries > - pg_buffercache_build_tuple > - get_buffercache_tuple > " > > While pg_buffercache_build_tuple() is not added (pg_buffercache_save_tuple() > is). > Ah, OK. Jakub, can you correct (and double-check) this in the next version of the patch? > About v21-0002: > > === 1 > > I can see that the pg_buffercache_init_entries() helper comments are added in > v21-0003 but I think that it would be better to add them in v21-0002 > (where the helper is actually created). > +1 to that > About v21-0003: > > === 2 > >> I hear you, attached v21 / 0003 is free of float/double arithmetics >> and uses non-float point values. > > + if (buffers_per_page > 1) > + os_page_query_count = NBuffers; > + else > + os_page_query_count = NBuffers * pages_per_buffer; > > yeah, that's more elegant. I think that it properly handles the relationships > between buffer and page sizes without relying on float arithmetic. > In the review I just submitted, I changed this to os_page_query_count = Max(NBuffers, NBuffers * pages_per_buffer); but maybe it's less clear. Feel free to undo my change. > === 3 > > + if (buffers_per_page > 1) > + { > > As buffers_per_page does not change, I think I'd put this check outside of the > for (i = 0; i < NBuffers; i++) loop, something like: > > " > if (buffers_per_page > 1) { > /* BLCKSZ < PAGESIZE: one page hosts many Buffers */ > for (i = 0; i < NBuffers; i++) { > . > . > . > . > } else { > /* BLCKSZ >= PAGESIZE: Buffer occupies more than one OS page */ > for (i = 0; i < NBuffers; i++) { > . > . I don't know. It's a matter of opinion, but I find the current code fairly understandable. Maybe if it meaningfully reduced the code nesting, but even with the extra branch we'd still need the for loop. I'm not against doing this differently, but I'd have to see how that looks. Until then I think it's fine to have the code as is. > . > " > > === 4 > >> That _numa_prepare_ptrs() is unused and will need to be removed, >> but we can still move some code there if necessary. > > Yeah I think that it can be simply removed then. > I'm not particularly attached on having the _ptrs() function, but why couldn't it build the os_page_ptrs array as before? > === 5 > >> @Bertrand: do you have anything against pg_shm_allocations_numa >> instead of pg_shm_numa_allocations? I don't mind changing it... > > I think that pg_shm_allocations_numa() is better (given the examples you just > shared). > > === 6 > >> I find all of those non-user friendly and I'm afraid I won't be able >> to pull that alone in time... > > Maybe we could add a few words in the doc to mention the "multiple nodes per > buffer" case? And try to improve it for say 19? Also maybe we should just focus > till v21-0003 (and discard v21-0004 for 18). > IMHO it's not enough to paper this over by mentioning it in the docs. I'm a bit confused about which patch you suggest to leave out. I think 0004 is pg_shmem_allocations_numa, but I think that part is fine, no? We've been discussing how to represent nodes for buffers in 0003. I understand it'd require a bit of coding, but AFAICS adding an array would be fairly trivial amount of code. Something like values[i++] = PointerGetDatum(construct_array_builtin(nodes, nodecnt, INT4OID)); would do, I think. But if we decide to do the 3-column view, that'd be even simpler, I think. AFAICS it means we could mostly ditch the pg_buffercache refactoring in patch 0002, and 0003 would get much simpler too I think. If Jakub doesn't have time to work on this, I can take a stab at it later today. regards -- Tomas Vondra
pgsql-hackers by date: