Home > mailing lists

Re: Syncrep and improving latency due to WAL throttling - Mailing list pgsql-hackers

From	Andres Freund
Subject	Re: Syncrep and improving latency due to WAL throttling
Date	January 26, 2023 18:49:12
Msg-id	20230126154912.okzh7psxox6aphdm@awork3.anarazel.de Whole thread Raw
In response to	Re: Syncrep and improving latency due to WAL throttling (Jakub Wartak <jakub.wartak@enterprisedb.com>)
Responses	Re: Syncrep and improving latency due to WAL throttling
List	pgsql-hackers

Tree view

Hi,

On 2023-01-26 14:40:56 +0100, Jakub Wartak wrote:
> In summary:  Attached is a slightly reworked version of this patch.
> 1. Moved logic outside XLogInsertRecord() under ProcessInterrupts()
> 2. Flushes up to the last page boundary, still uses SyncRepWaitForLSN()
> 3. Removed GUC for now (always on->256kB); potentially to be restored

Huh? Why did you remove the GUC? Clearly this isn't something we can default
to on.


> diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
> index d85e313908..05d56d65f9 100644
> --- a/src/backend/access/transam/xact.c
> +++ b/src/backend/access/transam/xact.c
> @@ -2395,6 +2395,7 @@ CommitTransaction(void)
>  
>      XactTopFullTransactionId = InvalidFullTransactionId;
>      nParallelCurrentXids = 0;
> +    backendWalInserted = 0;
>  
>      /*
>       * done with commit processing, set current transaction state back to

I don't like the resets around like this. Why not just move it into
XLogFlush()?



> +/*
> + * Called from ProcessMessageInterrupts() to avoid waiting while being in critical section
> + */ 
> +void HandleXLogDelayPending()
> +{
> +    /* flush only up to the last fully filled page */
> +    XLogRecPtr     LastFullyWrittenXLogPage = XactLastRecEnd - (XactLastRecEnd % XLOG_BLCKSZ);
> +    XLogDelayPending = false;

Hm - wonder if it'd be better to determine the LSN to wait for in
XLogInsertRecord(). There'll be plenty cases where we'll insert multiple (but
bounded number of) WAL records before processing interrupts. No need to flush
more aggressively than necessary...


> +    //HOLD_INTERRUPTS();
> +
> +    /* XXX Debug for now */
> +    elog(WARNING, "throttling WAL down on this session (backendWalInserted=%d, LSN=%X/%X flushingTo=%X/%X)", 
> +        backendWalInserted, 
> +        LSN_FORMAT_ARGS(XactLastRecEnd),
> +        LSN_FORMAT_ARGS(LastFullyWrittenXLogPage));
> +
> +    /* XXX: refactor SyncRepWaitForLSN() to have different waitevent than default WAIT_EVENT_SYNC_REP  */
> +    /* maybe new WAIT_EVENT_SYNC_REP_BIG or something like that */
> +    XLogFlush(LastFullyWrittenXLogPage);
> +    SyncRepWaitForLSN(LastFullyWrittenXLogPage, false);

SyncRepWaitForLSN() has this comment:
 * 'lsn' represents the LSN to wait for.  'commit' indicates whether this LSN
 * represents a commit record.  If it doesn't, then we wait only for the WAL
 * to be flushed if synchronous_commit is set to the higher level of
 * remote_apply, because only commit records provide apply feedback.


> +    elog(WARNING, "throttling WAL down on this session - end");
> +    backendWalInserted = 0;
> +
> +    //RESUME_INTERRUPTS();
> +}

I think we'd want a distinct wait event for this.



> diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
> index 1b1d814254..8ed66b2eae 100644
> --- a/src/backend/utils/init/globals.c
> +++ b/src/backend/utils/init/globals.c
> @@ -37,6 +37,7 @@ volatile sig_atomic_t IdleSessionTimeoutPending = false;
>  volatile sig_atomic_t ProcSignalBarrierPending = false;
>  volatile sig_atomic_t LogMemoryContextPending = false;
>  volatile sig_atomic_t IdleStatsUpdateTimeoutPending = false;
> +volatile sig_atomic_t XLogDelayPending = false;
>  volatile uint32 InterruptHoldoffCount = 0;
>  volatile uint32 QueryCancelHoldoffCount = 0;
>  volatile uint32 CritSectionCount = 0;

I don't think the new variable needs to be volatile, or even
sig_atomic_t. We'll just manipulate it from outside signal handlers.


Greetings,

Andres Freund

pgsql-hackers by date:

From: Andres Freund
Date: 26 January 2023, 18:40:29
Subject: Re: Syncrep and improving latency due to WAL throttling

From: Andres Freund
Date: 26 January 2023, 18:51:24
Subject: Re: Syncrep and improving latency due to WAL throttling

Re: Syncrep and improving latency due to WAL throttling - Mailing list pgsql-hackers

Previous

Next