Home > mailing lists

Race condition with restore_command on streaming replica - Mailing list pgsql-general

From	Brad Nicholson
Subject	Race condition with restore_command on streaming replica
Date	November 4, 2020 20:09:40
Msg-id	OFD21F6F73.D0A0B069-ON85258616.006BFBC1-85258616.006EBFE1@notes.na.collabserv.com Whole thread Raw
Responses	Re: Race condition with restore_command on streaming replica
List	pgsql-general

Tree view

Hi,

I've recently been seeing a race condition with the restore_command on replicas using streaming replication.

On the primary, we are archiving wal files to s3 compatible storage via pgBackRest. In the recovery.conf section of the postgresql.conf file on the replicas, we define the restore command as follows:

restore_command = '/usr/bin/pgbackrest --config /conf/postgresql/pgbackrest_restore.conf --stanza=formation archive-get %f "%p"'

We have a three member setup - m-0, m-1, m-2. Consider the case where m-0 is the Primary and m-1 and m-2 are replicas connected to the m-0.

When issuing a switchover (via Patroni) from m-0 to m-1, the connection from m-2 to m-0 is terminated. The restore_command on m-2 is run, and it looks for the .history file for the new timeline. If this happens before the history file is created and pushed to the archive, m-2 will look for the next wal file on the existing timeline in the archive. It will never be created as the source has moved on, so this m-2 hangs waiting on that file. The restore_command on the replica looking for this non-existent file is only run once. This seems like an odd state to be in. The replica is waiting on a new file, but it's not actually looking for it. Is this expected, or should the restore_command be polling for that file?

Restarting m-2 causes it to reconnect, find the .history file in the archive and replication picks up. But that is less than ideal.

This is Postgres 12.4, managed by Patroni. I reported this to Patroni first for feedback, and they pointed to it be being a Postgres issue - not a Patroni one.

Here is a summarized timeline of events from one instance, cluster is on Time Line 9

2020-11-04 14:51:11 m-0 is shutdown
2020-11-04 14:51:13.015 m-2 replication connection is terminated
2020-11-04 14:51:13.015 m-2 looks for 0000000A.history in archive, does not find it
2020-11-04 14:51:17,137 m-2 looks for 0000000900000000000000B6 in archive (this file won't ever exist)
2020-11-04 14:51:20,177 m-1 pushes 0000000A.history to archive

At this point, m-2 is stuck until reboot

There is a detailed bug report on the Patroni list with all the logs if relevant:
https://github.com/zalando/patroni/issues/1747#issue-736244675

pgsql-general by date:

From: Adrian Klaver
Date: 04 November 2020, 16:42:54
Subject: Re: PANIC: could not write to log file {} at offset {}, length {}: Invalid argument

From: Steve Singer
Date: 04 November 2020, 23:29:18
Subject: Christopher Browne

Race condition with restore_command on streaming replica - Mailing list pgsql-general

Previous

Next