Thread: Re: Postgresql replication failed in Patroni

Re: Postgresql replication failed in Patroni

From
Raphael Salguero Aragón
Date:
Hi Mendbayar,

On Fri, 7 Feb 2025 at 07:04, Mendbayar Alzakhgui <mendbayar.alz@unitel.mn> wrote:

Hello everybody,
I need urgent help with my Patroni-managed Postgres cluster.

The main Patroni-managed leader Postgres crashed and is down. When we try to start PostgreSQL, it shows this error log:

2025-02-07 12:31:18 +08 [2354332]: [4-1] user=,db=,app=,client=LOG:  listening on IPv4 address "ip_address", port 5432

2025-02-07 12:31:18 +08 [2354332]: [5-1] user=,db=,app=,client=LOG:  listening on Unix socket "./.s.PGSQL.5432"

2025-02-07 12:31:18 +08 [2354337]: [1-1] user=,db=,app=,client=LOG:  database system was shut down in recovery at 2025-02-07 11:56:50 +08

2025-02-07 12:31:18 +08 [2354337]: [2-1] user=,db=,app=,client=LOG:  entering standby mode

2025-02-07 12:31:18 +08 [2354337]: [3-1] user=,db=,app=,client=FATAL:  requested timeline 20 is not a child of this server's history

2025-02-07 12:31:18 +08 [2354337]: [4-1] user=,db=,app=,client=DETAIL:  Latest checkpoint is at 71/4D8BB8C0 on timeline 19, but in the history of the requested timeline, the server forked off from that timeline at 71/4D793220.

2025-02-07 12:31:18 +08 [2354332]: [6-1] user=,db=,app=,client=LOG:  startup process (PID 2354337) exited with exit code 1

2025-02-07 12:31:18 +08 [2354332]: [7-1] user=,db=,app=,client=LOG:  aborting startup due to startup process failure

2025-02-07 12:31:18 +08 [2354332]: [8-1] user=,db=,app=,client=LOG:  database system is shut down


What should we check? Is this because the leader node already deleted the WAL it needs to start? Also, Debezium was connected to this node. When we recover it, will Debezium resume automatically from where its sessions were disconnected? Please help me.

You're right, the crashed DB cannot recover because it lacks the transactional information it needs.
What is your DB size?

The easiest way is to stop Patroni on the crashed instance (systemctl stop patroni), then remove and recreate the data directory (also take care of tablespaces if they're in use).
Afterwards, you can restart the Patroni service on the crashed instance and run a reinit from the current leader:

patronictl -c /etc/patroni.yml reinit your_cluster_name replica_node

That should do the trick :)
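
The steps above can be sketched as a small shell script. This is only a sketch under stated assumptions: the data directory path and the OS user (PGDATA, PGUSER) are hypothetical and not given in the thread, while the config path, cluster name, and member name are the placeholders from the command above. It moves the old data directory aside instead of deleting it, which is a more cautious variant of "remove", and it does not handle tablespaces. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch of the rebuild procedure; prints commands unless DRY_RUN=0 is set.
# PGDATA and PGUSER are hypothetical defaults; adjust them to the real system.
PGDATA="${PGDATA:-/var/lib/postgresql/data}"
PGUSER="${PGUSER:-postgres}"
PATRONI_CFG="${PATRONI_CFG:-/etc/patroni.yml}"
CLUSTER="${CLUSTER:-your_cluster_name}"
MEMBER="${MEMBER:-replica_node}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

rebuild_replica() {
  # 1. Stop Patroni on the crashed instance.
  run systemctl stop patroni
  # 2. Move the old data directory aside (kept until the new replica is healthy).
  run mv "$PGDATA" "${PGDATA}.old"
  # 3. Recreate an empty data directory with restrictive permissions.
  run install -d -m 700 -o "$PGUSER" -g "$PGUSER" "$PGDATA"
  # 4. Start Patroni again and request a reinit of this member.
  run systemctl start patroni
  run patronictl -c "$PATRONI_CFG" reinit "$CLUSTER" "$MEMBER"
}
```

Calling `rebuild_replica` prints the five commands for review; `DRY_RUN=0 rebuild_replica` would execute them and needs sufficient privileges on the crashed node.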
 

Sincerely,


Mendbayar A.
| Database Administrator

Information technology department

 

+976 8611-2165

mendbayar.alz@unitel.mn

Central Tower, 11th floor

www.unitel.mn

 

Best regards
 Raphael 

Re: Postgresql replication failed in Patroni

From
Laurenz Albe
Date:
Mendbayar Alzakhgui wrote:
> FATAL:  requested timeline 20 is not a child of this server's history
> DETAIL:  Latest checkpoint is at 71/4D8BB8C0 on timeline 19, but in the history of the requested timeline, the server forked off from that timeline at 71/4D793220.

The solution for that is usually to remove (keep a copy somewhere) the
file 00000014.history from the WAL archive.  That file probably got
archived by a promoting server.
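
For readers wondering where the name 00000014.history comes from: timeline history files are named after the timeline ID written as eight hexadecimal digits, and 20 is 0x14. The sketch below derives that name and also compares the two LSNs from the DETAIL line, to show that the latest checkpoint lies beyond the point where timeline 20 forked off. The helper names are made up for illustration.

```shell
#!/bin/sh
# Timeline history files are named <timeline ID as 8 hex digits>.history.
history_file_for_timeline() {
  printf '%08X.history\n' "$1"
}

# Convert an LSN such as 71/4D8BB8C0 to an integer.
# The part before the slash is the high 32 bits, the part after is the low 32 bits.
lsn_to_int() {
  hi=${1%%/*}
  lo=${1##*/}
  echo $(( (0x$hi << 32) + 0x$lo ))
}

history_file_for_timeline 20   # prints 00000014.history

checkpoint=$(lsn_to_int 71/4D8BB8C0)   # latest checkpoint, on timeline 19
fork=$(lsn_to_int 71/4D793220)         # where timeline 20 forked off
if [ "$checkpoint" -gt "$fork" ]; then
  echo "checkpoint is $((checkpoint - fork)) bytes past the fork point"
fi
```

Because the checkpoint is past the fork point, this server has already written changes on timeline 19 that timeline 20 does not contain, which is why it refuses to follow timeline 20.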

Yours,
Laurenz Albe