Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?) - Mailing list pgsql-hackers

From Melanie Plageman
Subject Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)
Date
Msg-id CAAKRu_YLt1CtMPCzYaZX+Qni_ScGp58+o1+5DEwxRUWz8GD8iQ@mail.gmail.com
Whole thread Raw
In response to Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)  (Andres Freund <andres@anarazel.de>)
Responses Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)  (Melanie Plageman <melanieplageman@gmail.com>)
Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Thu, Oct 20, 2022 at 1:31 PM Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> - we shouldn't do pgstat_count_io_op() while the buffer header lock is held,
>   if possible.

I've changed this locally. It will be fixed in the next version I share.

>
>   I wonder if we should add a "source" output argument to
>   StrategyGetBuffer(). Then nearly all the counting can happen in
>   BufferAlloc().

I think we can just check for BM_VALID being set before invalidating it
in order to claim the buffer at the end of BufferAlloc(). Then we can
count it as an eviction or reuse.

>
> - "repossession" is a very unintuitive name for me. If we want something like
>   it, can't we just name it reuse_failed or such?

Repossession could be called eviction_failed or reuse_failed.
Do we think we will ever want to use it to count buffers we released
in other IOContexts (thus making the name eviction_failed better than
reuse_failed)?

> - Is it actually correct to count evictions in StrategyGetBuffer()? What if we
>   then decide to not use that buffer in BufferAlloc()? Yes, that'll be counted
>   via rejected, but that still leaves the eviction count to be "misleading"?

I agree that counting evictions in StrategyGetBuffer() is incorrect.
Checking BM_VALID at bottom of BufferAlloc() should be better.

> On 2022-10-19 15:26:51 -0400, Melanie Plageman wrote:
> > I have made some major changes in this area to make the columns more
> > useful. I have renamed and split "clocksweeps". It is now "evicted" and
> > "freelist acquired". This makes it clear when a block must be evicted
> > from a shared buffer must be and may help to identify misconfiguration
> > of shared buffers.
>
> I'm not sure freelist acquired is really that useful? If we don't add it, we
> should however definitely not count buffers from the freelist as evictions.
>
>
> > There is some nuance here that I tried to make clear in the docs.
> > "freelist acquired" in a shared context is straightforward.
> > "freelist acquired" in a strategy context is counted when a shared
> > buffer is added to the strategy ring (not when it is reused).
>
> Not sure what the second half here means - why would a buffer that's not from
> the freelist ever be counted as being from the freelist?
>
>
> > "freelist_acquired" is confusing for local buffers but I wanted to
> > distinguish between reuse/eviction of local buffers and initial
> > allocation. "freelist_acquired" seemed more fitting because there is a
> > clocksweep to find a local buffer and if it hasn't been allocated yet it
> > is allocated in a place similar to where shared buffers acquire a buffer
> > from the freelist. If I didn't count it here, I would need to make a new
> > column only for local buffers called "allocated" or something like that.
>
> I think you're making this too granular. We need to have more detail than
> today. But we don't necessarily need to catch every nuance.
>

I am fine with cutting freelist_acquired. The same actionable
information that it could provide could be provided by "read", right?
Also, removing it means I can remove the complicated explanation of how
freelist_acquired should be interpreted in IOCONTEXT_LOCAL.

Speaking of IOCONTEXT_LOCAL, I was wondering if it is confusing to call
it IOCONTEXT_LOCAL since it refers to IO done for temporary tables. What
if, in the future, we want to track other IO done using data in local
memory? Also, what if we want to track other IO done using data from
shared memory that is not in shared buffers? Would IOCONTEXT_SB and
IOCONTEXT_TEMP be better? Should IOContext literally describe the
context of the IO being done and there be a separate column which
indicates the source of the data for the IO?
Like wal_buffer, local_buffer, shared_buffer? Then if it is not
block-oriented, it could be shared_mem, local_mem, or bypass?

If we had another dimension to the matrix "data_src" which, with
block-oriented IO is equivalent to "buffer type", this could help with
some of the clarity problems.

We could remove the "reused" column and that becomes:

IOCONTEXT |     DATA_SRC    | IOOP
----------------------------------------
strategy  | strategy_buffer | EVICT

Having data_src and iocontext simplifies the meaning of all io
operations involving a strategy. Some operations are done on shared
buffers and some on existing strategy buffers and this would be more
clear without the addition of special columns for strategies.


> > I have also added the columns "repossessed" and "rejected". "rejected"
> > is when a bulkread rejects a strategy buffer because it is dirty and
> > requires flush. Seeing a lot of rejections could indicate you need to
> > vacuum. "repossessed" is the number of times a strategy buffer was
> > pinned or in use by another backend and had to be removed from the
> > strategy ring and replaced with a new shared buffer. This gives you some
> > indication that there is contention on blocks recently used by a
> > strategy.
>
> I don't immediately see a real use case for repossessed. Why isn't it
> sufficient to count it as part of rejected?

I'm still on the fence about combining rejection and reuse_failed. A
buffer rejected by a bulkread for being dirty may indicate the need to
vacuum but doesn't say anything about contention.
Whereas, failed reuses indicate contention for the blocks operated on by
the strategy. You would react to them differently. And you could have a
bulkread racking up both failed reuses and rejections.

If this seems like an unlikely or niche case, I would be okay with
combining rejections with reuse_failed. But it would be nice if we could
help with interpreting the column. I wonder if there is a rule of thumb
for determining which scenario you have. For example, how likely is it
that if you see a high number of reuse_rejected in a bulkread IOContext
that you would see any reused if the rejections are due to the bulkread
dirtying its own buffers? I suppose it would depend on your workload and
how random your updates/deletes were? If there is some way to use
reuse_rejected in combination with another column to determine the cause
of the rejections, it would be easier to combine them.

- Melanie



pgsql-hackers by date:

Previous
From: David Christensen
Date:
Subject: [PATCHES] Post-special page storage TDE support
Next
From: Nikita Malakhov
Date:
Subject: Re: Pluggable toaster