Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?) - Mailing list pgsql-hackers
From: Maciek Sakrejda
Subject: Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)
Msg-id: CAOtHd0Aj-F1ogXiEWE4wV5U8A8a-mxS=hYwx9B3fsg57hG2zWg@mail.gmail.com
In response to: Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?) (Melanie Plageman <melanieplageman@gmail.com>)
Responses: Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)
List: pgsql-hackers
On Thu, Oct 13, 2022 at 10:29 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:
> I think that it makes sense to count both the initial buffers added to
> the ring and subsequent shared buffers added to the ring (either when
> the current strategy buffer is pinned or in use or when a bulkread
> rejects dirty strategy buffers in favor of new shared buffers) as
> strategy clocksweeps because of how the statistic would be used.
>
> Clocksweeps give you an idea of how much of your working set is cached
> (setting aside initially reading data into shared buffers when you are
> warming up the db). You may use clocksweeps to determine if you need to
> make shared buffers larger.
>
> Distinguishing strategy buffer clocksweeps from shared buffer
> clocksweeps allows us to avoid enlarging shared buffers if most of the
> clocksweeps are to bring in blocks for the strategy operation.
>
> However, I could see an argument that discounting strategy clocksweeps
> done because the current strategy buffer is pinned makes the number of
> shared buffer clocksweeps artificially low since those other queries
> using the buffer would have suffered a cache miss were it not for the
> strategy. And, in this case, you would take strategy clocksweeps
> together with shared clocksweeps to make your decision. And if we
> include buffers initially added to the strategy ring in the strategy
> clocksweep statistic, this number may be off because those blocks may
> not be needed in the main shared working set. But you won't know that
> until you try to reuse the buffer and it is pinned. So, I think we don't
> have a better option than counting initial buffers added to the ring as
> strategy clocksweeps (as opposed to as reuses).
>
> So, in answer to your question, no, I cannot think of a scenario like
> that.

That analysis makes sense to me; thanks.

> It also made me remember that I am incorrectly counting rejected buffers
> as reused.
> I'm not sure if it is a good idea to subtract from reuses
> when a buffer is rejected. Waiting until after it is rejected to count
> the reuse will take some other code changes. Perhaps we could also count
> rejections in the stats? I'm not sure what makes sense here.
>
> > Not critical, but is there a list of backend types we could
> > cross-reference elsewhere in the docs?
>
> The most I could find was this longer explanation (with exhaustive list
> of types) in pg_stat_activity docs [1]. I could duplicate what it says
> or I could link to the view and say "see pg_stat_activity for a
> description of backend_type" or something like that (to keep them from
> getting out of sync as new backend_types are added). I suppose I could
> also add docs on backend_types, but I'm not sure where something like
> that would go.

I think linking pg_stat_activity is reasonable for now. A separate
section for this might be nice at some point, but that seems out of
scope.

> > From the io_context column description:
> >
> > +       The autovacuum daemon, explicit <command>VACUUM</command>, explicit
> > +       <command>ANALYZE</command>, many bulk reads, and many bulk writes use a
> > +       fixed amount of memory, acquiring the equivalent number of shared
> > +       buffers and reusing them circularly to avoid occupying an undue portion
> > +       of the main shared buffer pool.
> > +      </para></entry>
> >
> > I don't understand how this is relevant to the io_context column.
> > Could you expand on that, or am I just missing something obvious?
>
> I'm trying to explain why those other IO Contexts exist (bulkread,
> bulkwrite, vacuum) and why they are separate from shared buffers.
> Should I cut it altogether or preface it with something like: these are
> counted separate from shared buffers because...?

Oh I see. That makes sense; it just wasn't obvious to me this was
talking about the last three values of io_context.
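As an aside, the circular-reuse behavior described in that doc paragraph, and the clocksweep-vs-reuse accounting discussed earlier in the thread, can be sketched roughly like this (a toy Python illustration of the idea only, not PostgreSQL's actual buffer-access-strategy code; all names here are invented for the example):

```python
# Toy model of a bulk-access strategy ring: a fixed number of slots are
# reused circularly, and a slot's buffer is replaced with a freshly
# clocksweep-acquired shared buffer only when it can't be reused (e.g.
# it is pinned by another backend, or a bulkread rejects it as dirty).

class StrategyRing:
    def __init__(self, nslots):
        self.slots = [None] * nslots   # buffer id held by each ring slot
        self.current = 0               # next slot to hand out
        self.clocksweeps = 0           # buffers acquired from the shared pool
        self.reuses = 0                # existing ring buffers reused

    def get_buffer(self, acquire, reusable):
        """acquire() yields a new shared-buffer id (one clocksweep);
        reusable(buf) says whether the slot's current buffer can be reused."""
        slot = self.current
        self.current = (self.current + 1) % len(self.slots)
        buf = self.slots[slot]
        if buf is not None and reusable(buf):
            self.reuses += 1
            return buf
        # Both the initial fill and pinned/rejected buffers count as
        # clocksweeps, matching the accounting argued for above.
        self.clocksweeps += 1
        self.slots[slot] = acquire()
        return self.slots[slot]


# Usage: a 4-slot ring touched 8 times with every buffer reusable: the
# first 4 calls fill the ring (clocksweeps), the next 4 reuse it.
counter = iter(range(1000))
ring = StrategyRing(4)
for _ in range(8):
    ring.get_buffer(acquire=lambda: next(counter), reusable=lambda b: True)
print(ring.clocksweeps, ring.reuses)  # -> 4 4
```

The point of the separate counters is exactly the interpretation question above: a high strategy clocksweep count need not mean shared_buffers is too small.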
I think a brief preface like that would be helpful (maybe explicitly
with "these last three values", and I think "counted separately").

> > +     <row>
> > +      <entry role="catalog_table_entry"><para role="column_definition">
> > +       <structfield>extended</structfield> <type>bigint</type>
> > +      </para>
> > +      <para>
> > +       Extends of relations done by this <varname>backend_type</varname> in
> > +       order to write data in this <varname>io_context</varname>.
> > +      </para></entry>
> > +     </row>
> >
> > I understand what this is, but not why this is something I might want
> > to know about.
>
> Unlike writes, backends largely have to do their own extends, so
> separating this from writes lets us determine whether or not we need to
> change checkpointer/bgwriter to be more aggressive using the writes
> without the distraction of the extends. Should I mention this in the
> docs? The other stats views don't seem to editorialize at all, and I
> wasn't sure if this was an objective enough point to include in docs.

Thanks for the clarification. Just to make sure I understand, you mean
that if I see a high extended count, that may be interesting in terms of
write activity, but I can't fix that by tuning--it's just the nature of
my workload?

I think you're right that this is not objective enough. It's unfortunate
that there's not a good place in the docs for info like that, since
stats like this are hard to interpret without that context, but I admit
that it's not really this patch's job to solve that larger issue.

> > That seems broadly reasonable, but pg_settings also has a 'unit'
> > field, and in that view, unit is '8kB' on my system--i.e., it
> > (presumably) reflects the block size. Is that something we should try
> > to be consistent with (not sure if that's a good idea, but thought it
> > was worth asking)?
>
> I think this idea is a good option. I am wondering if it would be clear
> when mixed with non-block-oriented IO.
> Block-oriented IO would say 8kB
> (or whatever the build-time value of a block was) and non-block-oriented
> IO would say B or kB. The math would work out.

Right, yeah. Although maybe that's a little confusing? When you
originally added "unit", you had said:

> The most correct thing to do to accommodate block-oriented and
> non-block-oriented IO would be to specify all the values in bytes.
> However, I would like this view to be usable visually (as opposed to
> just in scripts and by tools). The only current value of unit is
> "block_size" which could potentially be combined with the value of the
> GUC to get bytes.

Is this still usable visually if you have to compare values across
units? I don't really have any great ideas here (and maybe this is
still the best option), just pointing it out.

> Looking at pg_settings now though, I am confused about
> how the units for wal_buffers is 8kB but then the value of wal_buffers
> when I show it in psql is "16MB"...

You mean the difference between

maciek=# select setting, unit from pg_settings where name = 'wal_buffers';
 setting | unit
---------+------
 512     | 8kB
(1 row)

and

maciek=# show wal_buffers;
 wal_buffers
-------------
 4MB
(1 row)

? Poking around, I think it looks like that's due to
convert_int_from_base_unit (indirectly called from SHOW /
current_setting):

/*
 * Convert an integer value in some base unit to a human-friendly unit.
 *
 * The output unit is chosen so that it's the greatest unit that can represent
 * the value without loss. For example, if the base unit is GUC_UNIT_KB, 1024
 * is converted to 1 MB, but 1025 is represented as 1025 kB.
 */

> Though the units for the pg_stat_io view for block-oriented IO would be
> the build-time values for block size, so it wouldn't line up exactly
> with pg_settings.

I don't follow--what would be the discrepancy?
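To spell out the rule in that comment (and the wal_buffers arithmetic above: 512 blocks * 8kB = 4MB), here's a quick Python sketch of the documented "greatest unit without loss" behavior. This just illustrates the rule from the comment; it is not PostgreSQL's convert_int_from_base_unit, and the helper name is invented:

```python
# Pick the largest unit that represents the value exactly, starting
# from a value expressed in some base unit (e.g. 8kB blocks for
# wal_buffers). Falls through to "B", which always divides evenly.

def humanize(value, base_unit_bytes):
    total = value * base_unit_bytes
    for unit, size in [("GB", 1024**3), ("MB", 1024**2), ("kB", 1024), ("B", 1)]:
        if total % size == 0:
            return f"{total // size}{unit}"

# 512 blocks of 8kB is exactly 4MB, matching SHOW wal_buffers above.
print(humanize(512, 8192))   # -> 4MB
# 1025 base-unit kB can't be expressed as whole MB, so it stays in kB,
# as in the convert_int_from_base_unit comment.
print(humanize(1025, 1024))  # -> 1025kB
```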