Re: Improve read_local_xlog_page_guts by replacing polling with latch-based waiting - Mailing list pgsql-hackers

From Xuneng Zhou
Subject Re: Improve read_local_xlog_page_guts by replacing polling with latch-based waiting
Msg-id CABPTF7X7XmnkMBPD5EHXLy7kCB7pNq92wfciuXumG5DqjQnb-g@mail.gmail.com
List pgsql-hackers
Hi,

On Thu, Aug 28, 2025 at 4:22 PM Xuneng Zhou <xunengzhou@gmail.com> wrote:
>
> Hi,
>
> Some changes in v3:
> 1) Update the header comment of xlogwait.c to reflect its extended use
> for flush waiting and its internal use for both flush and replay waiting.
> 2) Update the comment above logical_read_xlog_page, which describes the
> pre-change behavior of read_local_xlog_page.

In an off-list discussion, Alexander pointed out potential issues with
the current single-heap design for replay and flush when promotion
occurs concurrently with WAIT FOR. The following is a simple example
illustrating the problem:

During promotion, there's a window where we can have mixed waiter
types in the same heap:

  T1: Process A calls read_local_xlog_page_guts on standby
  T2: RecoveryInProgress() = TRUE, adds to heap as replay waiter
  T3: Promotion begins
  T4: EndRecovery() calls WaitLSNWakeup(InvalidXLogRecPtr)
  T5: SharedRecoveryState = RECOVERY_STATE_DONE
  T6: Process B calls read_local_xlog_page_guts
  T7: RecoveryInProgress() = FALSE, adds to SAME heap as flush waiter

The problem is that replay LSNs and flush LSNs represent different
positions in the WAL stream. Having both types in the same heap can
lead to:
  - Incorrect wakeup logic (comparing incomparable LSNs)
  - Processes waiting forever
  - Wrong waiters being woken up

To avoid this problem, patch v4 now uses two separate heaps, one for
flush waiters and one for replay waiters, as Alexander suggested
earlier. It also introduces a separate minimum-LSN tracking field for
flush waiters.
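
To make the idea concrete, here is a minimal standalone sketch of
per-type waiter heaps. It is not the code from the patch: the names
WaitKind, WaiterHeap, add_waiter and wake_waiters are hypothetical,
XLogRecPtr is a plain uint64_t stand-in, and locking, latches and
shared memory are left out entirely. It only shows that a wakeup for
one LSN type never examines waiters of the other type.

```c
/* Hypothetical sketch: one min-heap of target LSNs per waiter type. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t XLogRecPtr;        /* stand-in for PostgreSQL's typedef */

typedef enum { WAIT_REPLAY = 0, WAIT_FLUSH = 1, WAIT_NTYPES } WaitKind;

#define MAX_WAITERS 16

typedef struct
{
    XLogRecPtr  lsn[MAX_WAITERS];   /* binary min-heap, smallest at [0] */
    size_t      n;                  /* number of waiters in this heap */
} WaiterHeap;

/* Replay and flush waiters never share a heap. */
static WaiterHeap heaps[WAIT_NTYPES];

/* Register a waiter for the given LSN type (sift-up insertion). */
static void
add_waiter(WaitKind kind, XLogRecPtr target)
{
    WaiterHeap *h = &heaps[kind];
    size_t      i;

    assert(h->n < MAX_WAITERS);
    i = h->n++;
    while (i > 0 && h->lsn[(i - 1) / 2] > target)
    {
        h->lsn[i] = h->lsn[(i - 1) / 2];
        i = (i - 1) / 2;
    }
    h->lsn[i] = target;
}

/* Remove the smallest target LSN from a non-empty heap (sift-down). */
static void
pop_min(WaiterHeap *h)
{
    XLogRecPtr  last;
    size_t      i = 0;

    assert(h->n > 0);
    last = h->lsn[--h->n];
    for (;;)
    {
        size_t      c = 2 * i + 1;

        if (c >= h->n)
            break;
        if (c + 1 < h->n && h->lsn[c + 1] < h->lsn[c])
            c++;
        if (h->lsn[c] >= last)
            break;
        h->lsn[i] = h->lsn[c];
        i = c;
    }
    h->lsn[i] = last;
}

/*
 * Wake every waiter of one type whose target is <= the LSN reached.
 * Only the heap for that type is inspected.  Returns the number woken.
 */
static int
wake_waiters(WaitKind kind, XLogRecPtr reached)
{
    WaiterHeap *h = &heaps[kind];
    int         woken = 0;

    while (h->n > 0 && h->lsn[0] <= reached)
    {
        pop_min(h);
        woken++;
    }
    return woken;
}
```

With this layout, a flush wakeup at some LSN leaves replay waiters
untouched even if their target LSNs are numerically smaller, which is
the separation the mixed-heap scenario above lacks.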

Best,
Xuneng
