Re: Add os_page_num to pg_buffercache - Mailing list pgsql-hackers
| From | Bertrand Drouvot |
|---|---|
| Subject | Re: Add os_page_num to pg_buffercache |
| Date | |
| Msg-id | aR2NqVj9ON8PbOVD@ip-10-97-1-34.eu-west-3.compute.internal Whole thread Raw |
| In response to | Re: Add os_page_num to pg_buffercache (Michael Paquier <michael@paquier.xyz>) |
| List | pgsql-hackers |
Hi, On Tue, Nov 18, 2025 at 02:39:36PM +0900, Michael Paquier wrote: > On Fri, Aug 22, 2025 at 08:48:57AM +0000, Bertrand Drouvot wrote: > > Yeah, I think so. Thanks for the ping, done in attached. > > The patch has been marked as ready for committer, and I see the value > in providing a view where it is possible to know to which OS page a > given shared buffer is linked to, based on the OS page size and our > shared buffer size. No problem with that. Thanks for looking at it! > Now, I am not really a fan of the duplication created here, where most > of the code pg_buffercache_os_pages() is a plain copy-paste of > pg_buffercache_numa_pages(), with the difference being that we want > only os_page_status via a call to pg_numa_query_pages(), Agree. 0001 helps to avoid code duplication but I agree that we could do more. > It seems to me that it would be simpler to make the allocations and > information of os_page_ptrs and os_page_status conditional depending > on the result of pg_numa_init(), and that we could just fill numa_node > with NULLs if numa is not available in the environment where the query > is run. This includes making pg_numa_touch_mem_if_required() > conditional, of course, not called when pg_numa_init() fails. That's a good idea and I think we have 2 options here. Option 1: We could simply create the pg_buffercache_os_pages view on top of the pg_buffercache_numa one. The cons I can think of is that, when numa is available, then pg_buffercache_os_pages would pay the extra cost that also make pg_buffercache_numa slow. Then there is no real benefits for adding a new view, we could just keep pg_buffercache_numa and fill numa_node with NULLs if numa is not available and document also the use case (with an example) when numa is not available. That would achieve the main goal. Option 2: Still make changes in pg_buffercache_numa_pages() and fill with NULL when numa is not available. Then create an helper to do the mapping buffers to OS pages without any NUMA specific operations. That way we could create a dedicated view pg_buffercache_os_pages on top of a new function. No code duplication and the new view would not get the extra cost if numa is available. > Or is > there a strong reason where it would be better to rely on an > elog(ERROR) if numa(3) fails, based on numa_available()? The purpose > of these views being monitoring, it is usually easier in my experience > to rely on NULLs rather than facing periodic errors when we don't know > something. That makes JOINs more predictible, for one. Yeah I don't think that's an issue to fill with NULL instead of generating errors. Specially as it will give added value. > Please note that I don't mind the extra view pg_buffercache_os_pages > that can provide some information that's transparent cross-platform, > but let's make it something that calls pg_buffercache_numa_pages() > instead. So, I think we have Option 1 and Option 2. Option 1 is very simple and Option 2 would allow to get the desired information with no performance penalties when numa is availble. I'm tempted to vote for 1 as I'm not sure the larger code refactoring of option 2 is worth the benefits. Thoughts? > Note 2: Nice catch about the description of os_page_num, applied this > one separately. Thanks! Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
pgsql-hackers by date: