Re: SLRU statistics - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: SLRU statistics
Date
Msg-id 20200229205518.xrxo535zgd7cl6no@development
Whole thread Raw
In response to Re: SLRU statistics  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Responses Re: SLRU statistics
List pgsql-hackers
On Sat, Feb 29, 2020 at 11:44:26AM -0300, Alvaro Herrera wrote:
>On 2020-Feb-29, Tomas Vondra wrote:
>
>> Did we actually remove track-enabling GUCs? I think we still have
>>
>>  - track_activities
>>  - track_counts
>>  - track_io_timing
>>  - track_functions
>>
>> But maybe I'm missing something?
>
>Hm I remembered we removed the one for row-level stats
>(track_row_stats), but what we really did is merge it with block-level
>stats (track_block_stats) into track_counts -- commit 48f7e6439568.
>Funnily enough, if you disable that autovacuum won't work, so I'm not
>sure it's a very useful tunable.  And it definitely has more overhead
>than what this new GUC would have.
>

OK

>> > I find SlruType pretty odd, and the accompanying "if" list in
>> > pg_stat_get_slru() correspondingly so.  Would it be possible to have
>> > each SLRU enumerate itself somehow?  Maybe add the name in SlruCtlData
>> > and query that, somehow.  (I don't think we have an array of SlruCtlData
>> > anywhere though, so this might be a useless idea.)
>>
>> Well, maybe. We could have a system to register SLRUs dynamically, but
>> the trick here is that by having a fixed predefined number of SLRUs
>> simplifies serialization in pgstat.c and so on. I don't think the "if"
>> branches in pg_stat_get_slru() are particularly ugly, but maybe we could
>> replace the enum with a registry of structs, something like rmgrlist.h.
>> It seems like an overkill to me, though.
>
>Yeah, maybe we don't have to fix that now.
>

IMO the current solution is sufficient for the purpose. I guess we could
just stick a name into the SlruCtlData (and remove SlruType entirely),
and use that to identify the stats entries. That might be enough, and in
fact we already have that - SimpleLruInit gets a name parameter and
copies that to the lwlock_tranche_name.

One of the main reasons why I opted to use the enum is that it makes
tracking, lookup and serialization pretty trivial - it's just an index
lookup, etc. But maybe it wouldn't be much more complex with the name, 
considering the name length is limited by SLRU_MAX_NAME_LENGTH. And we
probably don't expect many entries, so we could keep them in a simple
list, or maybe a simplehash.

I'm not sure what to do with data for SLRUs that might have disappeared
after a restart (e.g. because someone removed an extension). Until now
those would be in the all in the "other" entry.


The attached v2 fixes the issues in your first message:

- I moved the page_miss() call after SlruRecentlyUsed(), but then I
   realized it's entirely duplicate with the page_read() update done in
   SlruPhysicalReadPage(). I removed the call from SlruPhysicalReadPage()
   and renamed page_miss to page_read - that's more consistent with
   shared buffers stats, which also have buffers_hit and buffer_read.

- I've also implemented the reset. I ended up adding a new option to
   pg_stat_reset_shared, which always resets all SLRU entries. We track
   the reset timestamp for each SLRU entry, but the value is always the
   same. I admit this is a bit weird - I did it like this because (a) I'm
   not sure how to identify the individual entries and (b) the SLRU is
   shared, so pg_stat_reset_shared seems kinda natural.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: Catalog invalidations vs catalog scans vs ScanPgRelation()
Next
From: Tom Lane
Date:
Subject: Re: Allow to_date() and to_timestamp() to accept localized names