Re: Add pg_buffercache_evict_all() and pg_buffercache_mark_dirty[_all]() functions - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: Add pg_buffercache_evict_all() and pg_buffercache_mark_dirty[_all]() functions |
Date | |
Msg-id | vxdua2x6jns4cxvsmscjpi2oo5hzkzuo3w6j4mhvzcb5mhywp3@vwuum3qbtvw6 Whole thread Raw |
In response to | Re: Add pg_buffercache_evict_all() and pg_buffercache_mark_dirty[_all]() functions (Nazir Bilal Yavuz <byavuz81@gmail.com>) |
Responses |
Re: Add pg_buffercache_evict_all() and pg_buffercache_mark_dirty[_all]() functions
|
List | pgsql-hackers |
Hi, On 2025-03-31 19:49:25 +0300, Nazir Bilal Yavuz wrote: > > After this discussion, I think it would be helpful if one of the more > > experienced > > hackers could take a look at the overall picture (perhaps we should set > > the > > status "Ready for Committer"? To be honest, I'm not sure what step that > > status > > should be set at). > > I agree. I will mark this as 'ready for committer' after writing the tests. Are you still working on tests? > > Also, there's a little thing about declaring functions as PARALLEL SAFE. > > To be honest, > > I don't really have any solid arguments against it. I just have some > > doubts. For > > example, how will it work if the plan is split up and we try to work on > > an object in > > one part, while the other part of the plan evicts the pages of that > > object or marks > > them as dirty... I can't really say for sure about that. And in that > > context, I'd > > suggest referring to that awesome statement in the documentation: "If in > > doubt, > > functions should be labeled as UNSAFE, which is the default." > > You may be right. I thought they are parallel safe as we acquire the > buffer header lock for each buffer but now, I have the same doubts as > you. I want to hear other people's opinions on that before labeling > them as UNSAFE. I don't see a problem with them being parallel safe. > From a59be4b50d7691ba952d0abd32198e972d4a6ea0 Mon Sep 17 00:00:00 2001 > From: Nazir Bilal Yavuz <byavuz81@gmail.com> > Date: Mon, 24 Mar 2025 13:52:04 +0300 > Subject: [PATCH v5 1/3] Add pg_buffercache_evict_[relation | all]() functions > for testing > > pg_buffercache_evict_relation(): Evicts all shared buffers in a > relation at once. > pg_buffercache_evict_all(): Evicts all shared buffers at once. > > Both functions provide mechanism to evict multiple shared buffers at > once. They are designed to address the inefficiency of repeatedly calling > pg_buffercache_evict() for each individual buffer, which can be > time-consuming when dealing with large shared buffer pools. > (e.g., ~790ms vs. ~270ms for 16GB of shared buffers). > > These functions are intended for developer testing and debugging > purposes and are available to superusers only. > + if (!superuser()) > + ereport(ERROR, > + (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), > + errmsg("must be superuser to use pg_buffercache_evict_relation function"))); I think it'd be nicer if we didn't ask translators to translate 3 different versions of this message, for different functions. Why not pass in the functio name as a parameter? > + if (RelationUsesLocalBuffers(rel)) > + ereport(ERROR, > + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), > + errmsg("relation uses local buffers," > + "pg_buffercache_evict_relation function is intended to" > + "be used for shared buffers only"))); We try not to break messages across multiple lines, as that makes searching for them harder. > + EvictRelUnpinnedBuffers(rel, &buffers_evicted, &buffers_flushed); > + > + /* Close relation, release lock. */ This comment imo isn't useful, it just restates the code verbatim. > + relation_close(rel, AccessExclusiveLock); Hm. Why are we dropping the lock here early? It's probably ok, but it's not clear to me why we should do so. > + /* Build and return the tuple. */ Similar to above, the comment is just a restatement of a short piece of code. > + <para> > + The <function>pg_buffercache_evict_relation()</function> function allows all > + shared buffers in the relation to be evicted from the buffer pool given a > + relation identifier. Use of this function is restricted to superusers only. > + </para> I'd say "all unpinned shared buffers". I wonder if these functions ought to also return the number of buffers that could not be evicted? Should this, or the longer comment for the function, mention that buffers for all relation forks are evicted? > + <para> > + The <function>pg_buffercache_evict_all()</function> function allows all > + shared buffers to be evicted in the buffer pool. Use of this function is > + restricted to superusers only. > + </para> Basically the same comments as above, except for the fork portion. > diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c > index f9681d09e1e..5d82e3fa297 100644 > --- a/src/backend/storage/buffer/bufmgr.c > +++ b/src/backend/storage/buffer/bufmgr.c > @@ -6512,26 +6512,47 @@ ResOwnerPrintBufferPin(Datum res) > * even by the same block. This inherent raciness without other interlocking > * makes the function unsuitable for non-testing usage. > * > + * buf_state is used to check if the buffer header lock has been acquired > + * before calling this function. If buf_state has a non-zero value, it means > + * that the buffer header has already been locked. This information is useful > + * for evicting a specific relation's buffers, as the buffers' headers need to > + * be locked before this function can be called with such an intention. I don't like this aspect of the API changes one bit. IMO something callable from outside storage/buffer should have no business knowing about buf_state. And detecting whether already hold a lock by checking whether the buf_state is 0 feels really wrong to me. > +/* > + * Try to evict all the shared buffers containing provided relation's pages. > + * > + * This function is intended for testing/development use only! > + * > + * We need this helper function because of the following reason. > + * ReservePrivateRefCountEntry() needs to be called before acquiring the > + * buffer header lock but ReservePrivateRefCountEntry() is static and it would > + * be better to have it as static. Hence, it can't be called from outside of > + * this file. This helper function is created to bypass that problem. I think there are other reasons - we should minimize the amount of code outside of storage/buffer that needs to know about BuferDescs etc, it's a layering violation to access that outside. > + * Before calling this function, it is important to acquire > + * AccessExclusiveLock on the specified relation to avoid replacing the > + * current block of this relation with another one during execution. What do you mean with "current block of this relation"? > + * buffers_evicted and buffers_flushed are set the total number of buffers > + * evicted and flushed respectively. > + */ > +void > +EvictRelUnpinnedBuffers(Relation rel, int32 *buffers_evicted, int32 *buffers_flushed) > +{ > + bool flushed; > + > + Assert(buffers_evicted && buffers_flushed); > + *buffers_evicted = *buffers_flushed = 0; I'd personally not bother with the Assert, the next line would just crash anyway... > diff --git a/contrib/pg_buffercache/pg_buffercache--1.5--1.6.sql b/contrib/pg_buffercache/pg_buffercache--1.5--1.6.sql > index 2e255f3fc10..d54bb1fd6f8 100644 > --- a/contrib/pg_buffercache/pg_buffercache--1.5--1.6.sql > +++ b/contrib/pg_buffercache/pg_buffercache--1.5--1.6.sql > @@ -1,5 +1,13 @@ > \echo Use "ALTER EXTENSION pg_buffercache UPDATE TO '1.6'" to load this file. \quit > > +DROP FUNCTION pg_buffercache_evict(integer); > +CREATE FUNCTION pg_buffercache_evict( > + IN int, > + OUT evicted boolean, > + OUT flushed boolean) > +AS 'MODULE_PATHNAME', 'pg_buffercache_evict' > +LANGUAGE C PARALLEL SAFE VOLATILE STRICT; I assume the function has to be dropped because the return type changes? Greetings, Andres Freund
pgsql-hackers by date: