Re: pg_stat_io for the startup process - Mailing list pgsql-hackers

From Melih Mutlu
Subject Re: pg_stat_io for the startup process
Date
Msg-id CAGPVpCQHbiREfTrR=c4jH-5nVev8D613wv5fp-J+aOJmhZ0-Pg@mail.gmail.com
Whole thread Raw
In response to Re: pg_stat_io for the startup process  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: pg_stat_io for the startup process
Re: pg_stat_io for the startup process
List pgsql-hackers
Hi all,

Robert Haas <robertmhaas@gmail.com>, 26 Nis 2023 Çar, 15:34 tarihinde şunu yazdı:
On Wed, Apr 26, 2023 at 5:47 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
> 3. When should we call pgstat_report_stats on the startup process?
>
> During recovery, I think we can call pgstat_report_stats() (or a
> subset of it) right before invoking WaitLatch and at segment
> boundaries.

I think this kind of idea is worth exploring. Andres mentioned timers,
but it seems to me that something where we just do it at certain
"convenient" points might be good enough and simpler. I'd much rather
have statistics that were up to date as of the last time we finished a
segment than have nothing at all.

I created a rough prototype of a timer-based approach for comparison. Please see attached.

Registered a new timeout named "STARTUP_STAT_FLUSH_TIMEOUT", The timeout's handler sets a flag indicating that io stats need to be flushed. HandleStartupProcInterrupts checks the flag to flush stats.
It's enabled if any WAL record is read (in the main loop of PerformWalRecovery). And it's disabled before WaitLatch to avoid unnecessary wakeups if the startup process decides to sleep.

With those changes, I see that startup related rows in pg_stat_io are updated without waiting for the startup process to exit.

postgres=# select context, reads, extends, hits from pg_stat_io where backend_type = 'startup';
  context  | reads  | extends |   hits
-----------+--------+---------+----------
 bulkread  |      0 |         |        0
 bulkwrite |      0 |       0 |        0
 normal    |      6 |       1 |       41
 vacuum    |      0 |       0 |        0
(4 rows)

I'm not sure this is the correct way to implement this approach though. I appreciate any comment.

Also; some questions about this implementation if you think it's worth discussing:
1- I set an arbitrary timeout (1 sec) for testing. I don't know what the correct value should be. Does it make sense to use PGSTAT_MIN/MAX/IDLE_INTERVAL instead?
2- I'm also not sure if this timeout should be registered at the beginning of StartupProcessMain, or does it even make any difference. I tried to do this just before the main redo loop in PerformWalRecovery, but that made the CI red.

Best,
--
Melih Mutlu
Microsoft
Attachment

pgsql-hackers by date:

Previous
From: "Drouvot, Bertrand"
Date:
Subject: Re: Autogenerate some wait events code and documentation
Next
From: Andrew Dunstan
Date:
Subject: Re: issue with meson builds on msys2