Thread: Cascading Replication - Standby Recovering Faster from Archive than from Upstream Node

Hello All, 

We have a cascading replication setup with multiple nodes, two of which are in a different cloud region for DR purposes. 

DRnode01 > DRnode02 

They are both standby nodes. DRnode01 only recovers from the archive and does not connect to any upstream node for streaming. DRnode02 has the same restore_command, which points to that WAL archive, but it is also configured with primary_conninfo to stream from DRnode01. I'd like it to use WAL streaming instead of the archive, but it ends up recovering from the archive slightly faster than DRnode01 can send WAL, so the logs end up being spammed with 

2023-02-09 13:32:00 EST [1277714]: [1-1] user=,db=,app=,client= LOG:  started streaming WAL from primary at 74CC/B5000000 on timeline 33
2023-02-09 13:32:00 EST [1277714]: [2-1] user=,db=,app=,client= FATAL:  could not receive data from WAL stream: ERROR:  requested starting point 74CC/B5000000 is ahead of the WAL flush position of this server 74CC/B4FFFA08. 
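
For reference, the relevant recovery settings on DRnode02 look roughly like this (hostname, user, and archive path are simplified):

    # DRnode02 postgresql.conf (simplified)
    primary_conninfo = 'host=drnode01 user=replicator application_name=drnode02'
    restore_command  = 'cp /wal_archive/%f %p'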

pg_wal_replay_pause() does not work; it ends up in the same situation after resuming. Changing restore_command requires a restart, and turning it off altogether is not good for DR.

I cannot get it out of this loop and this has been a recurring issue for a while. 

Is there anything I can do to force it to stream WAL instead of recovering from the archive, without removing the restore_command setting?


On Feb 9, 2023, at 1:41 PM, Tim <timfosho@gmail.com> wrote:

> Is there anything I can do to force it to stream WAL instead of recovering from the archive, without removing the restore_command setting?

Yes, you can use a replication slot to ensure that WAL stays available on the upstream node until the DR node has consumed it.  You have to make sure you have enough free space for WAL to accommodate any lag or disruption.
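
For example, something along these lines (the slot name is just an example):

    -- on DRnode01, create a physical replication slot for DRnode02:
    SELECT pg_create_physical_replication_slot('drnode02_slot');

    # on DRnode02, point primary_slot_name at it in postgresql.conf:
    primary_slot_name = 'drnode02_slot'

If you are on v13 or later, you can also cap how much WAL the slot may retain with max_slot_wal_keep_size, so a stalled DRnode02 cannot fill up DRnode01's disk.
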
On Thu, 2023-02-09 at 13:41 -0500, Tim wrote:
> We have a cascading replication setup with multiple nodes, two of which are in a different
> cloud region for DR purposes. 
>
> DRnode01 > DRnode02 
>
> They are both standby nodes. DRnode01 only recovers from the archive and does not connect
> to any upstream node for streaming. DRnode02 has the same restore_command, which points to
> that WAL archive, but it is also configured with primary_conninfo to stream from DRnode01.
> I'd like it to use WAL streaming instead of the archive, but it ends up recovering from the
> archive slightly faster than DRnode01 can send WAL, so the logs end up being spammed with 
>
> > 2023-02-09 13:32:00 EST [1277714]: [1-1] user=,db=,app=,client= LOG:  started streaming WAL from primary at 74CC/B5000000 on timeline 33
> > 2023-02-09 13:32:00 EST [1277714]: [2-1] user=,db=,app=,client= FATAL:  could not receive data from WAL stream: ERROR:  requested starting point 74CC/B5000000 is ahead of the WAL flush position of this server 74CC/B4FFFA08. 
>
> pg_wal_replay_pause() does not work; it ends up in the same situation after resuming.
> Changing restore_command requires a restart, and turning it off altogether is not good
> for DR.
>
> I cannot get it out of this loop and this has been a recurring issue for a while. 
>
> Is there anything I can do to force it to stream WAL instead of recovering from the archive,
> without removing the restore_command setting?

My idea:

- On DRnode01, set "archive_mode = always" and configure an "archive_command" that copies
  WAL segments to a directory shared with DRnode02 (e.g. via NFS)
- DRnode02 has "primary_conninfo" that connects to DRnode01 and a "restore_command" that
  copies WAL segments from the shared directory.
- DRnode02 has an "archive_cleanup_command" that removes WAL segments that it no longer
  needs from the shared directory.
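
Something like this, where /shared/wal_archive and the connection string are just placeholders:

    # DRnode01 postgresql.conf
    archive_mode    = always
    archive_command = 'test ! -f /shared/wal_archive/%f && cp %p /shared/wal_archive/%f'

    # DRnode02 postgresql.conf
    primary_conninfo        = 'host=drnode01 user=replicator'
    restore_command         = 'cp /shared/wal_archive/%f %p'
    archive_cleanup_command = 'pg_archivecleanup /shared/wal_archive %r'

That way DRnode02 should never get ahead of DRnode01, since it only restores WAL segments that DRnode01 has already received and archived.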

Yours,
Laurenz Albe