Thread: Use pgBufferUsage for block reporting in analyze

Use pgBufferUsage for block reporting in analyze

From
Anthonin Bonnefoy
Date:
Hi,

Analyze logging within autovacuum uses the dedicated variables VacuumPage{Hit,Miss,Dirty} to track buffer usage counts. However, pgBufferUsage already provides block usage tracking and handles more cases (temporary tables, parallel workers...).

Those variables were only used in two places: block usage reporting in verbose vacuum and in analyze. Commit 5cd72cc0c5017a9d4de8b5d465a75946da5abd1d removed their usage in the vacuum command as part of a bugfix.

This patch replaces those vacuum-specific variables with pgBufferUsage in analyze. This makes VacuumPage{Hit,Miss,Dirty} unused and removable, so the patch also removes both their uses in bufmgr and their declarations.
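
As a rough illustration of the idea (not the patch itself): usage counters that only ever grow can be snapshotted before the operation and subtracted afterwards, and the shared and local (temporary table) counters can then be summed for reporting. The sketch below is standalone; DemoBufferUsage and the demo_* helpers are hypothetical stand-ins, with field names borrowed from the diff in this thread, and do not reproduce the real PostgreSQL definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a buffer usage counter struct. */
typedef struct DemoBufferUsage
{
    int64_t shared_blks_hit;
    int64_t shared_blks_read;
    int64_t shared_blks_dirtied;
    int64_t local_blks_hit;
    int64_t local_blks_read;
    int64_t local_blks_dirtied;
} DemoBufferUsage;

/* Per-field difference "end - start": the usage attributable to one operation. */
static DemoBufferUsage
demo_usage_diff(const DemoBufferUsage *end, const DemoBufferUsage *start)
{
    DemoBufferUsage d;

    assert(end != NULL && start != NULL);
    d.shared_blks_hit = end->shared_blks_hit - start->shared_blks_hit;
    d.shared_blks_read = end->shared_blks_read - start->shared_blks_read;
    d.shared_blks_dirtied = end->shared_blks_dirtied - start->shared_blks_dirtied;
    d.local_blks_hit = end->local_blks_hit - start->local_blks_hit;
    d.local_blks_read = end->local_blks_read - start->local_blks_read;
    d.local_blks_dirtied = end->local_blks_dirtied - start->local_blks_dirtied;
    return d;
}

/* Totals as reported in the log line: shared plus local blocks. */
static int64_t
demo_total_hits(const DemoBufferUsage *u)
{
    return u->shared_blks_hit + u->local_blks_hit;
}

static int64_t
demo_total_reads(const DemoBufferUsage *u)
{
    return u->shared_blks_read + u->local_blks_read;
}

static int64_t
demo_total_dirtied(const DemoBufferUsage *u)
{
    return u->shared_blks_dirtied + u->local_blks_dirtied;
}
```

A caller would copy the counters into a "start" value before analyzing, take the diff once done, and print the three totals.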

Regards,
Anthonin

Re: Use pgBufferUsage for block reporting in analyze

From
Michael Paquier
Date:
On Fri, May 10, 2024 at 10:54:07AM +0200, Anthonin Bonnefoy wrote:
> This patch replaces those Vacuum specific variables by pgBufferUsage
> in analyze. This makes VacuumPage{Hit,Miss,Dirty} unused and removable.
> This commit removes both their calls in bufmgr and their declarations.

Hmm, yeah, it looks like you're right.  I can track all the blocks
read, hit and dirtied for VACUUM and ANALYZE in all the code paths
where these removed variables were incremented.  This needs some
runtime check to make sure that the calculations are consistent before
and after the fact (cannot do that now).

             appendStringInfo(&buf, _("buffer usage: %lld hits, %lld misses, %lld dirtied\n"),
-                             (long long) AnalyzePageHit,
-                             (long long) AnalyzePageMiss,
-                             (long long) AnalyzePageDirty);
+                             (long long) (bufferusage.shared_blks_hit + bufferusage.local_blks_hit),
+                             (long long) (bufferusage.shared_blks_read + bufferusage.local_blks_read),
+                             (long long) (bufferusage.shared_blks_dirtied + bufferusage.local_blks_dirtied));

Perhaps this should say "read" rather than "miss" in the logs as the
two read variables for the shared and local blocks are used?  For
consistency, at least.

That's not material for v17, only for v18.
--
Michael


Re: Use pgBufferUsage for block reporting in analyze

From
Anthonin Bonnefoy
Date:
Thanks for having a look.

On Fri, May 10, 2024 at 12:40 PM Michael Paquier <michael@paquier.xyz> wrote:
> This needs some runtime check to make sure that the calculations
> are consistent before and after the fact (cannot do that now).

Yeah, testing this is also a bit painful as buffer usage of analyze is only displayed in the logs during autoanalyze. While looking at this, I've thought of additional changes that could make testing easier and improve consistency with VACUUM VERBOSE:
- Have ANALYZE VERBOSE output the buffer usage stats
- Add WAL usage to ANALYZE VERBOSE

The ANALYZE VERBOSE output would look like:
postgres=# analyze (verbose) pgbench_accounts ;
INFO:  analyzing "public.pgbench_accounts"
INFO:  "pgbench_accounts": scanned 1640 of 1640 pages, containing 100000 live rows and 0 dead rows; 30000 rows in sample, 100000 estimated total rows
INFO:  analyze of table "postgres.public.pgbench_accounts"
avg read rate: 124.120 MB/s, avg write rate: 0.110 MB/s
buffer usage: 533 hits, 1128 reads, 1 dirtied
WAL usage: 12 records, 1 full page images, 5729 bytes
system usage: CPU: user: 0.06 s, system: 0.00 s, elapsed: 0.07 s 
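
For anyone checking such output by hand, the read and write rates appear consistent with "blocks times block size, divided by elapsed time". The sketch below is only an assumption-laden illustration: it assumes the default 8 kB block size, that the read rate is derived from the "reads" count and the write rate from the "dirtied" count, and the names DEMO_BLCKSZ and demo_rate_mb_per_sec are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed block size in bytes (the default 8 kB); hypothetical constant name. */
#define DEMO_BLCKSZ 8192

/*
 * Average rate in MB/s for a number of blocks over an elapsed time in seconds.
 * Returns 0 when no time has elapsed, to avoid dividing by zero.
 */
static double
demo_rate_mb_per_sec(int64_t blocks, double elapsed_secs)
{
    assert(blocks >= 0);
    if (elapsed_secs <= 0.0)
        return 0.0;
    return (double) DEMO_BLCKSZ * (double) blocks / (1024.0 * 1024.0) / elapsed_secs;
}
```

With 1128 reads and 1 dirtied block, an elapsed time of about 0.071 s (an assumed value, compatible with the rounded 0.07 s shown) gives roughly 124.1 MB/s and 0.11 MB/s, in line with the sample output.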

> Perhaps this should say "read" rather than "miss" in the logs as the
> two read variables for the shared and local blocks are used?  For
> consistency, at least.

Sounds good.

> That's not material for v17, only for v18.

Definitely.

I've split the patch into two parts:
1: Removal of the vacuum-specific variables; this is the same as the initial patch.
2: Add buffer and WAL usage to the ANALYZE VERBOSE output, and rename "miss" to "reads".

Regards,
Anthonin