Re: Add os_page_num to pg_buffercache - Mailing list pgsql-hackers
From | Bertrand Drouvot |
---|---|
Subject | Re: Add os_page_num to pg_buffercache |
Date | |
Msg-id | aGQOMPEENZc/2fJm@ip-10-97-1-34.eu-west-3.compute.internal Whole thread Raw |
In response to | Re: Add os_page_num to pg_buffercache (Tomas Vondra <tomas@vondra.me>) |
Responses |
Re: Add os_page_num to pg_buffercache
|
List | pgsql-hackers |
Hi, On Tue, Jul 01, 2025 at 04:31:01PM +0200, Tomas Vondra wrote: > On 7/1/25 15:45, Bertrand Drouvot wrote: > > I took a quick look on this, Thanks for looking at it! > and I doubt we want to change the schema of > pg_buffercache like this. Adding columns is fine, but it seems rather > wrong to change the cardinality. The view is meant to be 1:1 mapping for > buffers, but now suddenly it's 1:1 with memory pages. Or rather (buffer, > page), to be precise. > > I think this will break a lot of monitoring queries, and possibly in a > very subtle way - especially on systems with huge pages, where most > buffers will have one row, but then a buffer that happens to be split on > two pages will have two rows. That seems not great. > > IMHO it'd be better to have a new view for this info, something like > pg_buffercache_pages, or something like that. That's a good point, fully agree! > But I'm also starting to question if the patch really is that useful. > Sure, people may not have NUMA support enabled (e.g. on non-linux > platforms), and even if they do the _numa view is quite expensive. > Yeah, it's not for day to day activities, more for configuration testing and also for development activity/testing. For example, If I set BLCKSZ to 8KB and enable huge pages (2MB), then I may expect to see buffers not spread across pages. But what I can see is: SELECT pages_per_buffer, COUNT(*) as buffer_count FROM ( SELECT bufferid, COUNT(*) as pages_per_buffer FROM pg_buffercache GROUP BY bufferid ) subq GROUP BY pages_per_buffer ORDER BY pages_per_buffer; pages_per_buffer | buffer_count ------------------+-------------- 1 | 261120 2 | 1024 This is due to the shared buffers being aligned to PG_IO_ALIGN_SIZE. If I change it to: BufferManagerShmemInit(void) /* Align buffer pool on IO page size boundary. */ BufferBlocks = (char *) - TYPEALIGN(PG_IO_ALIGN_SIZE, + TYPEALIGN(2 * 1024 * 1024, ShmemInitStruct("Buffer Blocks", - NBuffers * (Size) BLCKSZ + PG_IO_ALIGN_SIZE, + NBuffers * (Size) BLCKSZ + (2 * 1024 * 1024), &foundBufs)); Then I get: pages_per_buffer | buffer_count ------------------+-------------- 1 | 262144 (1 row) So we've been able to see that some buffers were spread across pages due to shared buffer alignment on PG_IO_ALIGN_SIZE. And that if we change the alignment to be set to 2MB then I don't see any buffers spread across pages anymore. I think that it helps "visualize" some configuration or code changes. What are your thoughts? Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
pgsql-hackers by date: