Re: pg_stat_io for the startup process - Mailing list pgsql-hackers

From Melih Mutlu
Subject Re: pg_stat_io for the startup process
Date
Msg-id CAGPVpCRUWfq4Wb+a1+uBmqQCGt-HtqgxtB+MfaiC48Cpo-AmcA@mail.gmail.com
Whole thread Raw
In response to Re: pg_stat_io for the startup process  (Andres Freund <andres@anarazel.de>)
Responses Re: pg_stat_io for the startup process
Re: pg_stat_io for the startup process
List pgsql-hackers
Hi,

Andres Freund <andres@anarazel.de>, 27 Nis 2023 Per, 19:27 tarihinde şunu yazdı:
Huh - do you have a link to the failure? That's how I would expect it to be
done.

I think I should have registered it in the beginning of PerformWalRecovery() and not just before the main redo loop since HandleStartupProcInterrupts is called before the loop too. 
Here's the error log otherise [1] 

>  #ifdef WAL_DEBUG
>                       if (XLOG_DEBUG ||
>                               (record->xl_rmid == RM_XACT_ID && trace_recovery_messages <= DEBUG2) ||
> @@ -3617,6 +3621,13 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
>                                               /* Do background tasks that might benefit us later. */
>                                               KnownAssignedTransactionIdsIdleMaintenance();

> +                                             /*
> +                                              * Need to disable flush timeout to avoid unnecessary
> +                                              * wakeups. Enable it again after a WAL record is read
> +                                              * in PerformWalRecovery.
> +                                              */
> +                                             disable_startup_stat_flush_timeout();
> +
>                                               (void) WaitLatch(&XLogRecoveryCtl->recoveryWakeupLatch,
>                                                                                WL_LATCH_SET | WL_TIMEOUT |
>                                                                                WL_EXIT_ON_PM_DEATH,

I think always disabling the timer here isn't quite right - we want to wake up
*once* in WaitForWALToBecomeAvailable(), otherwise we'll not submit pending
stats before waiting - potentially for a long time - for WAL. One way would be
just explicitly report before the WaitLatch().

Another approach, I think better, would be to not use enable_timeout_every(),
and to explicitly rearm the timer in HandleStartupProcInterrupts(). When
called from WaitForWALToBecomeAvailable(), we'd not rearm, and instead do so
at the end of WaitForWALToBecomeAvailable().  That way we also wouldn't
repeatedly fire between calls to HandleStartupProcInterrupts().

Attached patch is probably not doing what you asked. IIUC HandleStartupProcInterrupts should arm the timer but also shouldn't arm it if it's called from  WaitForWALToBecomeAvailable. But the timer should be armed again at the very end of WaitForWALToBecomeAvailable. I've been thinking about how to do this properly. Should HandleStartupProcInterrupts take a parameter to decide whether the timer needs to be armed? Or need to add an additional global flag to rearm the timer? Any thoughts?


Best,
--
Melih Mutlu
Microsoft
Attachment

pgsql-hackers by date:

Previous
From: "Drouvot, Bertrand"
Date:
Subject: Re: Add two missing tests in 035_standby_logical_decoding.pl
Next
From: Robert Haas
Date:
Subject: Re: Logging parallel worker draught