Re: Use pgBufferUsage for block reporting in analyze - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: Use pgBufferUsage for block reporting in analyze
Date
Msg-id CAD21AoDcT-EHeXk=QOJOmUGxzDcKCQnujTETV+WskKaFetyedw@mail.gmail.com
Whole thread Raw
In response to Re: Use pgBufferUsage for block reporting in analyze  (Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>)
Responses Re: Use pgBufferUsage for block reporting in analyze
List pgsql-hackers
On Wed, Jul 31, 2024 at 11:27 PM Anthonin Bonnefoy
<anthonin.bonnefoy@datadoghq.com> wrote:
>
> On Wed, Jul 31, 2024 at 9:36 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > Meanwhile, I think we can push 0001 and 0002 patches since they are in
> > good shape. I've updated commit messages to them and slightly changed
> > 0002 patch to write "finished analyzing of table \"%s.%s.%s\" instead
> > of  "analyze of table \"%s.%s.%s\".
>
> Wouldn't it make sense to do the same for autoanalyze and write
> "finished automatic analyze of table \"%s.%s.%s\"\n" instead of
> "automatic analyze of table \"%s.%s.%s\"\n"?

I think that the current style is consistent with autovacuum logs:

2024-08-01 16:04:48.088 PDT [12302] LOG:  automatic vacuum of table
"postgres.public.test": index scans: 0
        pages: 0 removed, 443 remain, 443 scanned (100.00% of total)
        tuples: 0 removed, 100000 remain, 0 are dead but not yet removable
        removable cutoff: 751, which was 0 XIDs old when operation ended
        new relfrozenxid: 739, which is 1 XIDs ahead of previous value
        frozen: 0 pages from table (0.00% of total) had 0 tuples frozen
        index scan not needed: 0 pages from table (0.00% of total) had
0 dead item identifiers removed
        avg read rate: 0.000 MB/s, avg write rate: 1.466 MB/s
        buffer usage: 905 hits, 0 reads, 4 dirtied
        system usage: CPU: user: 0.01 s, system: 0.00 s, elapsed: 0.02 s
2024-08-01 16:04:48.125 PDT [12302] LOG:  automatic analyze of table
"postgres.public.test"
        avg read rate: 5.551 MB/s, avg write rate: 0.617 MB/s
        buffer usage: 512 hits, 27 reads, 3 dirtied
        system usage: CPU: user: 0.02 s, system: 0.00 s, elapsed: 0.03 s

Since ANALYZE command writes the start log, I think it makes sense to
write "finished" at the end of the operation:

=# analyze verbose test;
INFO:  analyzing "public.test"
INFO:  "test": scanned 443 of 443 pages, containing 100000 live rows
and 0 dead rows; 30000 rows in sample, 100000 estimated total rows
INFO:  finished analyzing table "postgres.public.test"
avg read rate: 0.000 MB/s, avg write rate: 0.000 MB/s
buffer usage: 549 hits, 0 reads, 0 dirtied
system usage: CPU: user: 0.02 s, system: 0.00 s, elapsed: 0.03 s
ANALYZE

>
> > Also, regarding 0003 patch, what is the main reason why we want to add
> > WAL usage to analyze reports? I think that analyze normally does not
> > write WAL records much so I'm not sure it's going to provide a good
> > insight for users.
>
> There was no strong reason except for consistency with VACUUM VERBOSE
> output. But as you said, it's not really providing valuable
> information so it's probably better to keep the noise down and drop
> it.

Okay. I think writing WAL usage would not be very noisy and probably
could help some cases where (auto)analyze unexpectedly writes many WAL
records (e.g., writing full page images much), and is consistent with
(auto)vacuum logs as you mentioned. So let's go with this direction
unless others think differently.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



pgsql-hackers by date:

Previous
From: Thomas Munro
Date:
Subject: Re: Why is citext/regress failing on hamerkop?
Next
From: Maciek Sakrejda
Date:
Subject: Re: Casts from jsonb to other types should cope with json null