Re: Requested WAL segment xxx has already been removed - Mailing list pgsql-hackers

From Alexander Kukushkin
Subject Re: Requested WAL segment xxx has already been removed
Date
Msg-id CAFh8B=nH=41scx3F_EAh2H5O1w-nj8b7uCMp_0z4p4wsv2tFDA@mail.gmail.com
In response to Re: Requested WAL segment xxx has already been removed  (Japin Li <japinli@hotmail.com>)
Responses Re: Requested WAL segment xxx has already been removed
List pgsql-hackers
Hi,

On Mon, 14 Jul 2025 at 11:24, Japin Li <japinli@hotmail.com> wrote:
> The configuration is as expected. My test script simulates two distinct hosts
> by utilizing local archive storage.
>
> For physical replication across distinct hosts without shared WAL archive
> storage, WALs are archived locally (in my test).
>
> When the primary's walsender needs a WAL file from the archive that's not in
> its pg_wal directory, manual copying is required to the primary's pg_wal or the
> standby's pg_wal (or its archive directory, and use restore_command to fetch it).
>
> What prevents us from using the primary's restore_command to retrieve the
> necessary WALs?

I am just talking about the practical side of local archive storage.
Such archives are lost along with the server in a disaster, so they provide little value.
A physical standby can just as well use restore_command to copy files from the archive on the primary via ssh/rsync or similar. This approach has been used for ages and works just fine.
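
To make the ssh/rsync approach concrete, here is a minimal sketch of a helper that a standby's restore_command could call. The host name, archive path, and script name are hypothetical placeholders, and it assumes rsync over ssh with key-based access to the primary; it is one way to do it, not a recommended setup.

```python
#!/usr/bin/env python3
"""Sketch: fetch one archived WAL file from the primary's local archive."""
import subprocess
import sys

# Hypothetical values: replace with the real primary host and archive path.
PRIMARY_HOST = "postgres@primary.example.com"
ARCHIVE_DIR = "/var/lib/postgresql/wal_archive"


def build_rsync_argv(wal_file: str, dest_path: str) -> list[str]:
    """Return the rsync command that copies one archived file to dest_path."""
    # restore_command passes a bare file name as %f; reject anything path-like.
    if "/" in wal_file:
        raise ValueError(f"unexpected file name: {wal_file!r}")
    return ["rsync", "-a", f"{PRIMARY_HOST}:{ARCHIVE_DIR}/{wal_file}", dest_path]


def main(argv: list[str]) -> int:
    if len(argv) != 3:
        print("usage: fetch_wal.py <file name> <destination path>", file=sys.stderr)
        return 2
    # A non-zero exit status tells PostgreSQL the file could not be restored.
    return subprocess.run(build_rsync_argv(argv[1], argv[2])).returncode


# Only act when invoked with arguments, e.g. from restore_command.
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv))
```

On the standby this would be wired up with something like restore_command = 'python3 /path/to/fetch_wal.py %f %p', where %f and %p are the usual file name and destination path placeholders.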

What is really painful right now is that logical walsenders can only look in pg_wal, and unfortunately replication slots don't give a 100% guarantee of WAL retention because of max_slot_wal_keep_size.
So letting logical walsenders use restore_command would be really helpful and would solve some of the problems and pain points with logical replication.
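
As a rough illustration of why a slot is not a full guarantee, the sketch below estimates how much WAL a slot is holding back and compares it with max_slot_wal_keep_size. The function names are mine, and the comparison is a simplification: the server enforces the limit at checkpoint time and in whole segments, so the wal_status and safe_wal_size columns of pg_replication_slots remain the authoritative source.

```python
def lsn_to_int(lsn: str) -> int:
    """Convert a textual LSN such as '16/B374D848' to a 64-bit byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def retained_bytes(current_lsn: str, restart_lsn: str) -> int:
    """Bytes of WAL between the slot's restart_lsn and the current write position."""
    return lsn_to_int(current_lsn) - lsn_to_int(restart_lsn)


def slot_at_risk(current_lsn: str, restart_lsn: str, max_keep_bytes: int) -> bool:
    """Approximate check: does the slot hold back more WAL than the limit allows?

    max_keep_bytes is max_slot_wal_keep_size in bytes; -1 means no limit.
    """
    if max_keep_bytes < 0:
        return False
    return retained_bytes(current_lsn, restart_lsn) > max_keep_bytes


# A slot lagging 4 GiB behind with a 1 GiB limit is a candidate for invalidation.
print(slot_at_risk("1/0", "0/0", 1024 ** 3))
```

Once a slot has been invalidated this way, the segments it needed may exist only in the archive, which is exactly the case where a restore_command lookup would help.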

However, if we also start calling restore_command for physical walsenders, it might increase resource usage on the primary without providing much additional value. For example, if restore_command keeps failing while the standby retries the replication connection indefinitely, each attempt could trigger another restore_command call on the primary.

I don't mind if it also works for physical replication, but IMO there should be a way to opt out of it.
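
To show what I mean by opting out, here is a toy model of the lookup order. It is purely illustrative and not PostgreSQL code: the function, the restore callback, and the restore_for_physical flag are all hypothetical names, and no such setting exists today.

```python
from typing import Callable, Optional, Set


def locate_segment(
    segment: str,
    pg_wal: Set[str],
    restore: Callable[[str], bool],
    is_logical: bool,
    restore_for_physical: bool = False,  # hypothetical opt-in for physical walsenders
) -> Optional[str]:
    """Return 'pg_wal' or 'archive' depending on where the segment was found, else None."""
    if segment in pg_wal:
        return "pg_wal"
    if not is_logical and not restore_for_physical:
        # Opted out: behave as walsenders do today and report the segment as missing.
        return None
    # Fall back to the archive; restore() stands in for running restore_command.
    return "archive" if restore(segment) else None


calls = []


def fake_restore(segment: str) -> bool:
    """Stand-in for restore_command that records each invocation."""
    calls.append(segment)
    return segment == "000000010000000000000002"


pg_wal = {"000000010000000000000003"}
print(locate_segment("000000010000000000000002", pg_wal, fake_restore, is_logical=True))
print(locate_segment("000000010000000000000002", pg_wal, fake_restore, is_logical=False))
```

With the flag left off, a physical walsender never invokes the restore callback, so a standby that keeps reconnecting costs the primary nothing extra.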

Regards,
--
Alexander Kukushkin
