Prevent pg_rewind destroying the data - Mailing list pgsql-general

From	Christopher Pereira
Subject	Prevent pg_rewind destroying the data
Date	December 20, 2020 19:11:25
Msg-id	f4fb86c6-b219-5cad-5dc3-c9c1a720fdb2@imatronix.cl
List	pgsql-general

Hi,

When pg_rewind is interrupted by a network error, the target cluster is left corrupted.

Running pg_rewind a second time fails with "pg_rewind: fatal: target server must be shut down cleanly".

Trying to repair the cluster by starting it in single-user mode with "'/usr/pgsql-12/bin/postmaster' --single -F -D '/var/lib/pgsql/12/mydb' -c archive_mode=on -c archive_command=false" fails with:

LOG: could not read from log segment 0000003B000000000000003E, offset 0: read 0 of 8192
LOG: invalid primary checkpoint record
PANIC: could not locate a valid checkpoint record

When a cluster fails over because of a network problem, chances are high that another network problem will occur while we run pg_rewind.
It would be nice if pg_rewind did not destroy the data and instead left the cluster in a state where retrying pg_rewind can succeed.

As a workaround, we are thinking of taking an LVM snapshot or making a "cp --reflink" copy before running pg_rewind and restoring it if pg_rewind fails, but it would be better if pg_rewind itself were non-destructive.

Is this possible?
Am I missing something?

We are using PG 12.
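
To make the copy-and-restore workaround above concrete, here is a rough sketch in Python. It is only an illustration of the idea, not a tested procedure: the function name rewind_with_backup, its parameters, and the backup path are made up for this example, and it assumes the target server is already stopped, GNU cp is available, and the backup path is on the same filesystem as the data directory.

```python
import shutil
import subprocess
from pathlib import Path


def rewind_with_backup(pgdata, backup, rewind_cmd, run=subprocess.run):
    """Copy pgdata to backup, run rewind_cmd, and restore the copy if it fails.

    Hypothetical helper for illustration. Returns True if rewind_cmd exited
    with status 0, False if the data directory was restored from the copy.
    """
    pgdata, backup = Path(pgdata), Path(backup)
    if backup.exists():
        raise FileExistsError(f"backup path already exists: {backup}")

    # cp -a preserves ownership, permissions and timestamps.
    # --reflink=auto (GNU coreutils) shares blocks on filesystems that
    # support it and falls back to a regular copy otherwise.
    run(["cp", "-a", "--reflink=auto", str(pgdata), str(backup)], check=True)

    result = run(rewind_cmd, check=False)
    if result.returncode == 0:
        # The rewind finished, so the safety copy is no longer needed.
        shutil.rmtree(backup)
        return True

    # The rewind failed or was interrupted: throw away the partially
    # rewritten directory and move the untouched copy back into place.
    # rename() requires both paths to be on the same filesystem.
    shutil.rmtree(pgdata)
    backup.rename(pgdata)
    return False
```

It would be called with something like rewind_cmd = ["pg_rewind", "--target-pgdata=/var/lib/pgsql/12/mydb", "--source-server=<connection string>"], where the connection string is a placeholder. After a False result the data directory is back in its pre-rewind state, so pg_rewind can be retried once the network is stable again.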
