Re: Add pg_buffercache_evict_all() and pg_buffercache_mark_dirty[_all]() functions - Mailing list pgsql-hackers
From | Nazir Bilal Yavuz |
---|---|
Subject | Re: Add pg_buffercache_evict_all() and pg_buffercache_mark_dirty[_all]() functions |
Date | |
Msg-id | CAN55FZ1P1M1Q6vt11qBUW=CCFKP6p8yBnK9BoqJs2m9Tu2YLhw@mail.gmail.com Whole thread Raw |
In response to | Re: Add pg_buffercache_evict_all() and pg_buffercache_mark_dirty[_all]() functions (Andres Freund <andres@anarazel.de>) |
List | pgsql-hackers |
Hi, Thanks for the review! On Wed, 2 Apr 2025 at 20:06, Andres Freund <andres@anarazel.de> wrote: > > Hi, > > On 2025-03-31 19:49:25 +0300, Nazir Bilal Yavuz wrote: > > > After this discussion, I think it would be helpful if one of the more > > > experienced > > > hackers could take a look at the overall picture (perhaps we should set > > > the > > > status "Ready for Committer"? To be honest, I'm not sure what step that > > > status > > > should be set at). > > > > I agree. I will mark this as 'ready for committer' after writing the tests. > > Are you still working on tests? Sorry, I forgot. They are attached as 0004. > > > Also, there's a little thing about declaring functions as PARALLEL SAFE. > > > To be honest, > > > I don't really have any solid arguments against it. I just have some > > > doubts. For > > > example, how will it work if the plan is split up and we try to work on > > > an object in > > > one part, while the other part of the plan evicts the pages of that > > > object or marks > > > them as dirty... I can't really say for sure about that. And in that > > > context, I'd > > > suggest referring to that awesome statement in the documentation: "If in > > > doubt, > > > functions should be labeled as UNSAFE, which is the default." > > > > You may be right. I thought they are parallel safe as we acquire the > > buffer header lock for each buffer but now, I have the same doubts as > > you. I want to hear other people's opinions on that before labeling > > them as UNSAFE. > > I don't see a problem with them being parallel safe. > > > > From a59be4b50d7691ba952d0abd32198e972d4a6ea0 Mon Sep 17 00:00:00 2001 > > From: Nazir Bilal Yavuz <byavuz81@gmail.com> > > Date: Mon, 24 Mar 2025 13:52:04 +0300 > > Subject: [PATCH v5 1/3] Add pg_buffercache_evict_[relation | all]() functions > > for testing > > > > pg_buffercache_evict_relation(): Evicts all shared buffers in a > > relation at once. > > pg_buffercache_evict_all(): Evicts all shared buffers at once. > > > > Both functions provide mechanism to evict multiple shared buffers at > > once. They are designed to address the inefficiency of repeatedly calling > > pg_buffercache_evict() for each individual buffer, which can be > > time-consuming when dealing with large shared buffer pools. > > (e.g., ~790ms vs. ~270ms for 16GB of shared buffers). > > > > These functions are intended for developer testing and debugging > > purposes and are available to superusers only. > > > > + if (!superuser()) > > + ereport(ERROR, > > + (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), > > + errmsg("must be superuser to use pg_buffercache_evict_relation function"))); > > I think it'd be nicer if we didn't ask translators to translate 3 different > versions of this message, for different functions. Why not pass in the functio > name as a parameter? Ah, I didn't think of this aspect, done. > > + if (RelationUsesLocalBuffers(rel)) > > + ereport(ERROR, > > + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), > > + errmsg("relation uses local buffers," > > + "pg_buffercache_evict_relation function is intended to" > > + "be used for shared buffers only"))); > > We try not to break messages across multiple lines, as that makes searching > for them harder. Done. > > + EvictRelUnpinnedBuffers(rel, &buffers_evicted, &buffers_flushed); > > + > > + /* Close relation, release lock. */ > > This comment imo isn't useful, it just restates the code verbatim. Removed. > > > + relation_close(rel, AccessExclusiveLock); > > Hm. Why are we dropping the lock here early? It's probably ok, but it's not > clear to me why we should do so. We are dropping the lock after we processed the relation. I didn't understand what could be the problem here. Why do you think it is early? > > + /* Build and return the tuple. */ > > Similar to above, the comment is just a restatement of a short piece of code. Done. It exists in other places in the pg_buffercache_pages.c file, I only removed the ones that I added. > > + <para> > > + The <function>pg_buffercache_evict_relation()</function> function allows all > > + shared buffers in the relation to be evicted from the buffer pool given a > > + relation identifier. Use of this function is restricted to superusers only. > > + </para> > > I'd say "all unpinned shared buffers". Done. > I wonder if these functions ought to also return the number of buffers that > could not be evicted? I didn't add this as I thought this information could be gathered if wanted. > Should this, or the longer comment for the function, mention that buffers for > all relation forks are evicted? I added this to a longer comment for the function. > > + <para> > > + The <function>pg_buffercache_evict_all()</function> function allows all > > + shared buffers to be evicted in the buffer pool. Use of this function is > > + restricted to superusers only. > > + </para> > > Basically the same comments as above, except for the fork portion. Done. > > diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c > > index f9681d09e1e..5d82e3fa297 100644 > > --- a/src/backend/storage/buffer/bufmgr.c > > +++ b/src/backend/storage/buffer/bufmgr.c > > @@ -6512,26 +6512,47 @@ ResOwnerPrintBufferPin(Datum res) > > * even by the same block. This inherent raciness without other interlocking > > * makes the function unsuitable for non-testing usage. > > * > > + * buf_state is used to check if the buffer header lock has been acquired > > + * before calling this function. If buf_state has a non-zero value, it means > > + * that the buffer header has already been locked. This information is useful > > + * for evicting a specific relation's buffers, as the buffers' headers need to > > + * be locked before this function can be called with such an intention. > > I don't like this aspect of the API changes one bit. IMO something callable > from outside storage/buffer should have no business knowing about buf_state. > > And detecting whether already hold a lock by checking whether the buf_state is > 0 feels really wrong to me. I changed this. I basically copied EvictUnpinnedBuffer() to the inside of EvictRelUnpinnedBuffers(), we don't need any hacky methods now. > > +/* > > + * Try to evict all the shared buffers containing provided relation's pages. > > + * > > + * This function is intended for testing/development use only! > > + * > > + * We need this helper function because of the following reason. > > + * ReservePrivateRefCountEntry() needs to be called before acquiring the > > + * buffer header lock but ReservePrivateRefCountEntry() is static and it would > > + * be better to have it as static. Hence, it can't be called from outside of > > + * this file. This helper function is created to bypass that problem. > > I think there are other reasons - we should minimize the amount of code > outside of storage/buffer that needs to know about BuferDescs etc, it's a > layering violation to access that outside. I added that as a comment. > > + * Before calling this function, it is important to acquire > > + * AccessExclusiveLock on the specified relation to avoid replacing the > > + * current block of this relation with another one during execution. > > What do you mean with "current block of this relation"? It is used timewise, like a snapshot of the blocks when the function is called. I updated this with: 'The caller must hold at least AccessShareLock on the relation to prevent the relation from being dropped.' [1] > > + * buffers_evicted and buffers_flushed are set the total number of buffers > > + * evicted and flushed respectively. > > + */ > > +void > > +EvictRelUnpinnedBuffers(Relation rel, int32 *buffers_evicted, int32 *buffers_flushed) > > +{ > > + bool flushed; > > + > > + Assert(buffers_evicted && buffers_flushed); > > + *buffers_evicted = *buffers_flushed = 0; > > I'd personally not bother with the Assert, the next line would just crash > anyway... AIO tests already use EvictUnpinnedBuffer() without flushed information. I made '*buffers_evicted, *buffers_flushed in the EvictRelUnpinnedBuffers() and *flushed in the EvictUnpinnedBuffer()' optional as Aidar suggested at upthread. > > diff --git a/contrib/pg_buffercache/pg_buffercache--1.5--1.6.sql b/contrib/pg_buffercache/pg_buffercache--1.5--1.6.sql > > index 2e255f3fc10..d54bb1fd6f8 100644 > > --- a/contrib/pg_buffercache/pg_buffercache--1.5--1.6.sql > > +++ b/contrib/pg_buffercache/pg_buffercache--1.5--1.6.sql > > @@ -1,5 +1,13 @@ > > \echo Use "ALTER EXTENSION pg_buffercache UPDATE TO '1.6'" to load this file. \quit > > > > +DROP FUNCTION pg_buffercache_evict(integer); > > +CREATE FUNCTION pg_buffercache_evict( > > + IN int, > > + OUT evicted boolean, > > + OUT flushed boolean) > > +AS 'MODULE_PATHNAME', 'pg_buffercache_evict' > > +LANGUAGE C PARALLEL SAFE VOLATILE STRICT; > > I assume the function has to be dropped because the return type changes? Correct. v6 is attached. Additional changes prior to v5: * Tests are added. * Andres said off-list that AccessExclusiveLock is too much, all the lock needs to do is to prevent the relation from being dropped. So, pg_buffercache_evict_relation() opens a relation with AccessShareLock now. Function comment in EvictRelUnpinnedBuffers() is updated regarding that. [1] * EvictRelUnpinnedBuffers() does not call EvictUnpinnedBuffer() now. I basically copied EvictUnpinnedBuffer() to the inside of EvictRelUnpinnedBuffers() with the needed updates. -- Regards, Nazir Bilal Yavuz Microsoft
Attachment
pgsql-hackers by date: