Re: Streaming replication and WAL archive interactions - Mailing list pgsql-hackers

From: Jaroslav Novikov
Subject: Re: Streaming replication and WAL archive interactions
Date:
Msg-id: 9A271666-C8DA-455E-B5C7-48FF01CC72AB@yandex-team.ru
In response to: Re: Streaming replication and WAL archive interactions (Andrey Borodin <x4mmm@yandex-team.ru>)
List: pgsql-hackers

> On 12 Feb 2026, at 09:56, Andrey Borodin <x4mmm@yandex-team.ru> wrote:
>
> Hi Heikki,
>
> There’s a nearby thread [0] (about 10 years later) where I’m working on a problem your patch from this thread helps solve.
>
> In large datacenter outages, 1–2% of clusters end up with gaps in their PITR timeline.
> In HA setups, when the primary is lost, some WAL can be missing from the archive even though it was streamed to the standby. Many HA tools (PGConsul, Patroni, etc.) try to re-archive from the standby, but those WAL files may already have been removed.
>
> Your “shared” archive mode addresses this: the standby keeps WAL until it’s archived. archive_mode=always plus an archive tool can work, but it’s expensive. In WAL-G, for example, the archive command does a GET on the standby’s WAL, then decrypts and compares. Switching to HEAD would reduce cost in some clouds but still adds cost.
>
> Another option is coordinating archiving outside Postgres, but that would mean building distributed coordination into the archive tool.
>
> Shared archive mode tackles this in Postgres itself.
>
> I’ve retrofitted your patch, incorporated ideas from the Greenplum work [1], and made some improvements.
>
> The patchset has three parts:
> * Rebase + tests – Your original patch, rebased, with tests added.
> * Timeline switching – Correct handling of timeline switches in archive status updates.
> * Avoid directory scans – Skip scanning archive_status when possible, which was costly in WAL-G setups.
>
> What do you think?
>
> Best regards, Andrey Borodin.
>
>
<v4-0001-Add-archive_mode-shared-for-coordinated-WAL-archi.patch>
<v4-0002-Mark-ancestor-timeline-WAL-segments-as-archived.patch>
<v4-0003-Optimize-ProcessArchivalReport-to-avoid-directory.patch>

Hi Andrey,

Adding the missing references [0] and [1].

[0] https://www.postgresql.org/message-id/5550D20D.6090703%40iki.fi
[1] https://github.com/open-gpdb/gpdb/commit/4f2db1929df1b5eed28f33505955636096bb4e8b

Best, Jaroslav Novikov.



