Re: Postgresql 9.5: Streaming Replication: Secondaries Fail To Start Post WAL Error - Mailing list pgsql-admin

From Mohan NBSPS
Subject Re: Postgresql 9.5: Streaming Replication: Secondaries Fail To Start Post WAL Error
Date
Msg-id CAPCvfWevVWABj+Vroo8EFZYm4=JDvMzb3AAh9G0qdSrq-5O3Aw@mail.gmail.com
In response to Re: Postgresql 9.5: Streaming Replication: Secondaries Fail To Start Post WAL Error  (Johannes Truschnigg <johannes@truschnigg.info>)
List pgsql-admin
Update

Regarding the secondaries that failed to start with the error

```
LOG:  invalid resource manager ID
```

It turned out that the server hosting the secondaries had been down for a
considerable amount of time (2-3 hours).
The errors started around that time.

My guess is that the primaries (there are several of them) kept writing WAL,
each at a pace that depends on how busy it is, and that the secondaries fell
out of sync with the WAL while they were down.
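
To sanity-check that guess, one could compare how much WAL a primary produces
during an outage with how much it keeps around. The sketch below is only
back-of-the-envelope arithmetic: the WAL rate and the wal_keep_segments value
are made-up numbers, not taken from these servers, and it assumes the default
16 MB segment size with no replication slot or WAL archive in play.

```python
import math

# Default WAL segment size (16 MB). Assumed, not checked on these servers.
WAL_SEGMENT_BYTES = 16 * 1024 * 1024


def segments_written(wal_bytes_per_sec: float, outage_seconds: float) -> int:
    """WAL segments a primary fills while the standby is unreachable."""
    return math.ceil(wal_bytes_per_sec * outage_seconds / WAL_SEGMENT_BYTES)


def standby_can_catch_up(wal_bytes_per_sec: float,
                         outage_seconds: float,
                         wal_keep_segments: int) -> bool:
    """True if the segments written still fit within wal_keep_segments."""
    written = segments_written(wal_bytes_per_sec, outage_seconds)
    return written <= wal_keep_segments


# Hypothetical numbers: 1 MiB/s of WAL, a 3-hour outage, wal_keep_segments = 256.
needed = segments_written(1024 * 1024, 3 * 3600)
print(needed)                                                # 675
print(standby_can_catch_up(1024 * 1024, 3 * 3600, 256))      # False
```

With numbers like these, a busy primary would have moved well past what it
retains, while a quiet one might not, which would fit only some of the
secondaries being affected.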

Thank you for the advice.


On Wed, May 29, 2024 at 1:19 AM Johannes Truschnigg <johannes@truschnigg.info> wrote:
On Tue, May 28, 2024 at 05:24:56PM -0400, Ron Johnson wrote:
> On Tue, May 28, 2024 at 3:11 PM Johannes Truschnigg <
> >[...]
> > Yes, replication slots can interrupt your primary.
> >
>
> Please define "interrupt".  Using a replication slot, I thought files would
> just accumulate in pg_wal while the replica is down (or the network is
> slow, or the replica can't keep up with the primary).
>
> Disaster, of course, when that disk fills up, but that's always been the
> case.

And that is exactly the scenario I meant when I said "interrupt". If you use
replication slots, your monitoring/alerting isn't set up correctly, and you're
accumulating a lot of WAL, chances are ENOSPC on the primary is around the
corner for you.
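
A minimal sketch of the kind of check such monitoring could do: convert two
WAL positions to byte offsets and alert when a slot holds back more WAL than
some limit. On 9.5 the inputs would presumably come from
pg_current_xlog_location() and the restart_lsn column of pg_replication_slots;
the function names and the limit below are made up for illustration.

```python
def lsn_to_int(lsn: str) -> int:
    """Convert an LSN such as '1/F0000000' to an absolute byte position.

    An LSN is printed as two hexadecimal numbers: the high 32 bits and
    the low 32 bits of a 64-bit position in the WAL stream.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def retained_wal_bytes(current_lsn: str, restart_lsn: str) -> int:
    """Bytes of WAL the primary must keep for a slot at restart_lsn."""
    return lsn_to_int(current_lsn) - lsn_to_int(restart_lsn)


def slot_needs_alert(current_lsn: str, restart_lsn: str, limit_bytes: int) -> bool:
    """True if the slot retains more WAL than the (hypothetical) limit."""
    return retained_wal_bytes(current_lsn, restart_lsn) > limit_bytes


# Example values, not from a real server.
print(retained_wal_bytes("2/10000000", "1/F0000000"))               # 536870912
print(slot_needs_alert("2/10000000", "1/F0000000", 256 * 1024**2))  # True
```

The same difference can be computed server-side; doing it in the monitoring
script just keeps the threshold logic in one place.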

That's why I generally prefer a WAL archive on a separate file system for
replicas to source segments from, because filling that up won't break the
primary (unless the archive_command misbehaves). That also needs proper
monitoring/alerting, of course (and a contingency plan for what to do when/if
the archive runs over) - but everyone whose workload is important enough for a
replication setup to make sense is required to have that in my book.
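
As a rough illustration of the archive-side monitoring, the sketch below
checks how full the file system holding the WAL archive is and maps that to an
alert level. The path and the thresholds are placeholders, not
recommendations, and a real setup would feed the result into whatever alerting
is already in place.

```python
import shutil


def usage_fraction(path: str) -> float:
    """Fraction (0.0 to 1.0) of the file system containing path that is used."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return usage.used / usage.total


def archive_alert_level(fraction: float,
                        warn: float = 0.70,
                        crit: float = 0.90) -> str:
    """Map a usage fraction to 'ok', 'warning' or 'critical'."""
    if fraction >= crit:
        return "critical"
    if fraction >= warn:
        return "warning"
    return "ok"


# "." stands in for the archive directory, e.g. a mount point of its own.
print(archive_alert_level(usage_fraction(".")))
```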

--
with best regards:
- Johannes Truschnigg ( johannes@truschnigg.info )

www:   https://johannes.truschnigg.info/
