
Re: Summary function for pg_buffercache - Mailing list pgsql-hackers

From	Melih Mutlu
Subject	Re: Summary function for pg_buffercache
Date	September 20, 2022 08:47:40
Msg-id	CAGPVpCRtjkm9jVAq6ND3NPTBAvs6mLYUu9fTm6ZgNh0FXNBU=g@mail.gmail.com
In response to	Re: Summary function for pg_buffercache (Melih Mutlu <m.melihmutlu@gmail.com>)
Responses	Re: Summary function for pg_buffercache
List	pgsql-hackers


Hi,

> Also I suggest changing the names of the columns in order to make them consistent with the rest of the system. If you consider pg_stat_activity and family [1] you will notice that the columns are named (entity)_(property), e.g. backend_xid, backend_type, client_addr, etc. So instead of used_buffers and unused_buffers the naming should be buffers_used and buffers_unused.
>
> [1]: https://www.postgresql.org/docs/current/monitoring-stats.html

I changed these names and updated the patch.

> However I have somewhat mixed feelings about avg_usagecount. Generally AVG() is a relatively useless metric for monitoring. What if the user wants MIN(), MAX() or let's say a 99th percentile? I suggest splitting it into usagecount_min, usagecount_max and usagecount_sum. AVG() can be derived as usagecount_sum / used_buffers.

Won't usagecount_max almost always be 5, since BM_MAX_USAGE_COUNT is set to 5 in buf_internals.h? I'm not sure how much usagecount_min would add either.
A usagecount is always an integer between 0 and 5; it is not unbounded. I think the 99th percentile would be much better than the average if strong outliers could occur. But in this case, I feel an average would be sufficiently useful as well.
usagecount_sum would actually be useful, since the average can be derived from it. If you think the sum of usagecounts has a meaning by itself, it makes sense to include it. Otherwise, wouldn't showing the averaged value directly be more useful?
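
To make the trade-off between the proposed columns concrete, here is a minimal sketch in C of how the sum, min, max and derived average relate to each other. It is only an illustration, not code from the patch: DemoBuffer, DemoSummary, summarize() and usagecount_avg() are hypothetical names, and the only value taken from the thread is the cap of 5.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the cap mentioned in the thread (BM_MAX_USAGE_COUNT is 5). */
#define MAX_USAGE_COUNT 5

/* Hypothetical, simplified view of one buffer; not PostgreSQL's BufferDesc. */
typedef struct {
    int     in_use;      /* nonzero if the buffer currently holds a page */
    uint8_t usagecount;  /* expected range: 0..MAX_USAGE_COUNT */
} DemoBuffer;

/* Hypothetical result row using the column names discussed in the thread. */
typedef struct {
    int32_t buffers_used;
    int32_t buffers_unused;
    int64_t usagecount_sum;
    int32_t usagecount_min;  /* -1 when no buffer is in use */
    int32_t usagecount_max;  /* -1 when no buffer is in use */
} DemoSummary;

/* One pass over the buffers, accumulating counts, sum, min and max. */
DemoSummary summarize(const DemoBuffer *bufs, size_t n)
{
    DemoSummary s = {0, 0, 0, -1, -1};

    for (size_t i = 0; i < n; i++) {
        if (!bufs[i].in_use) {
            s.buffers_unused++;
            continue;
        }
        assert(bufs[i].usagecount <= MAX_USAGE_COUNT);

        int32_t uc = bufs[i].usagecount;
        s.buffers_used++;
        s.usagecount_sum += uc;
        if (s.usagecount_min < 0 || uc < s.usagecount_min)
            s.usagecount_min = uc;
        if (uc > s.usagecount_max)
            s.usagecount_max = uc;
    }
    return s;
}

/* The average is derived from the sum, so exposing the sum loses nothing. */
double usagecount_avg(const DemoSummary *s)
{
    if (s->buffers_used == 0)
        return 0.0;
    return (double) s->usagecount_sum / s->buffers_used;
}
```

With only six possible values, min and max carry little information as soon as a single buffer sits at either end of the range, which is the concern raised above about usagecount_max.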

Aleksander, do you still think the average usagecount is a bit useless? Or does it make sense to you to keep it like this?

> I suggest we focus on saving the memory first and then think about the
> performance, if necessary.

Personally I think the locks part is at least as important - it's what makes
the production impact higher.

I agree that it's important due to its high impact. I'm not sure how to avoid any undefined behaviour without locks though.

Even with locks, performance is much better. But is it good enough for production?
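
One general way to read per-buffer state without a lock and without a data race is to keep the flags and the usage count packed in a single word and read it with one atomic load. The sketch below shows that idea with C11 atomics. It rests on stated assumptions: the bit layout, DemoBufferHeader, demo_pack_state() and demo_read_state() are all hypothetical and are not taken from the patch or from buf_internals.h.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical bit layout for this sketch only; not PostgreSQL's layout. */
#define DEMO_VALID_FLAG      (1u << 31)
#define DEMO_USAGECOUNT_MASK 0x7u   /* low three bits hold 0..5 */

typedef struct {
    _Atomic uint32_t state;  /* flags and usage count packed in one word */
} DemoBufferHeader;

typedef struct {
    int      valid;
    uint32_t usagecount;
} DemoSnapshot;

/* Build a state word from its parts (used by writers in this sketch). */
uint32_t demo_pack_state(int valid, uint32_t usagecount)
{
    return (valid ? DEMO_VALID_FLAG : 0u) | (usagecount & DEMO_USAGECOUNT_MASK);
}

/*
 * Read both fields from a single atomic load. Because they are decoded
 * from the same loaded value, they are consistent with each other even
 * if another thread updates the word concurrently.
 */
DemoSnapshot demo_read_state(DemoBufferHeader *hdr)
{
    uint32_t st = atomic_load_explicit(&hdr->state, memory_order_relaxed);
    DemoSnapshot snap;

    snap.valid = (st & DEMO_VALID_FLAG) != 0;
    snap.usagecount = st & DEMO_USAGECOUNT_MASK;
    return snap;
}
```

The cost of this approach is that a summary built from many such reads is not a consistent snapshot across buffers, since each word may change between loads. Whether that is acceptable for a monitoring function is the open question here.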

Thanks,

Melih

Attachment

v6-0001-Added-pg_buffercache_summary-function.patch
