Re: pg_buffercache: Add per-relation summary stats - Mailing list pgsql-hackers
| From | Ashutosh Bapat |
|---|---|
| Subject | Re: pg_buffercache: Add per-relation summary stats |
| Date | |
| Msg-id | CAExHW5to8LTCfo+zSqMV06JUnY8RfDSm9qjUP0w55voJE4tLiA@mail.gmail.com Whole thread |
| In response to | Re: pg_buffercache: Add per-relation summary stats (Heikki Linnakangas <hlinnaka@iki.fi>) |
| List | pgsql-hackers |
On Tue, Apr 7, 2026 at 6:37 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote: > > On 28/03/2026 06:18, Ashutosh Bapat wrote: > > Parallely myself and Palak Chaturvedi developed a quick patch to > > modernise pg_buffercache_pages() and use tuplestore so that it doesn't > > have to rely on NBuffers being the same between start of the scan, > > when memory allocated, when the scan ends - a condition possible with > > resizing buffer cache. It seems to improve the timings by about 10-30% > > on my laptop for 128MB buffercache size. Without this patch the time > > taken to execute Lukas's query varies between 10-15ms on my laptop. > > With this patch it varies between 8-9ms. So the timing is more stable > > as a side effect. It's not a 10x improvement that we are looking for > > but it looks like a step in the right direction. That improvement > > seems to come purely because we avoid creating a heap tuple. I wonder > > if there are some places up in the execution tree where full > > heaptuples get formed again instead of continuing to use minimal > > tuples or places where we perform some extra actions that are not > > required. > > > > I didn't dig into the history to find out why we didn't modernize > > pg_buffercache_pages(). I don't see any hazard though. > > Committed this modernization patch, thanks! > > It would be nice to have a proper row-at-a-time mode that would avoid > materializing the result, but collecting all the data in a temporary > array is clearly worse than just putting them to the tuplestore > directly. The only reason I can think of why we'd prefer to use a > temporary array like that is to get a more consistent snapshot of all > the buffers, by keeping the time spent scanning the buffers as short as > possible. But we're not getting a consistent view anyway, it's just a > matter of degree. > Thanks a lot. Makes code in buffer resizing a bit simpler esp. code changes in this module. Probably it won't need any code changes now in the buffer resizing patches. > I wondered about this in pg_buffercache_pages.c: > > > /* > > * To smoothly support upgrades from version 1.0 of this extension > > * transparently handle the (non-)existence of the pinning_backends > > * column. We unfortunately have to get the result type for that... - we > > * can't use the result type determined by the function definition without > > * potentially crashing when somebody uses the old (or even wrong) > > * function definition though. > > */ > > if (get_call_result_type(fcinfo, NULL, &expected_tupledesc) != TYPEFUNC_COMPOSITE) > > elog(ERROR, "return type must be a row type"); > > > > if (expected_tupledesc->natts < NUM_BUFFERCACHE_PAGES_MIN_ELEM || > > expected_tupledesc->natts > NUM_BUFFERCACHE_PAGES_ELEM) > > elog(ERROR, "incorrect number of output arguments"); > > I guess it's still needed, if you have pg_upgraded all the way from 1.0. > To test that, I created this view to match the old 1.0 definition: > > CREATE VIEW public.legacy_pg_buffercache AS > SELECT bufferid, > relfilenode, > reltablespace, > reldatabase, > relforknumber, > relblocknumber, > isdirty, > usagecount > FROM public.pg_buffercache_pages() p(bufferid integer, relfilenode > oid, reltablespace oid, reldatabase oid, relforknumber smallint, > relblocknumber bigint, isdirty boolean, usagecount smallint); > > "select * from public.legacy_pg_buffercache" still works, so all good. Yeah. At some point we should get rid of this code, but it would require some "upgrade" action from the customer as well. -- Best Wishes, Ashutosh Bapat
pgsql-hackers by date: