Re: Race condition in recovery? - Mailing list pgsql-hackers
From | Kyotaro Horiguchi |
---|---|
Subject | Re: Race condition in recovery? |
Date | |
Msg-id | 20210524.134709.805985657416573716.horikyota.ntt@gmail.com |
In response to | Re: Race condition in recovery? (Dilip Kumar <dilipbalaut@gmail.com>) |
Responses |
Re: Race condition in recovery?
|
List | pgsql-hackers |
At Sun, 23 May 2021 21:37:58 +0530, Dilip Kumar <dilipbalaut@gmail.com> wrote in
> On Sun, May 23, 2021 at 2:19 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Sat, May 22, 2021 at 8:33 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> I have created a tap test based on Robert's test.sh script. It
> reproduces the issue. I am new with perl so this still needs some
> cleanup/improvement, but at least it shows the idea.

I'm not sure I'm following the discussion here. However, if we were
trying to reproduce Dilip's case using a base backup, we would need
such a broken archive command when using pg_basebackup with -Xnone,
because the current version of pg_basebackup waits for all required
WAL segments to be archived when connecting to a standby with -Xnone.
I haven't bothered to reconfirm the version in which that fix took
place, but just by using -X stream instead of "none" we successfully
miss the first segment of the new timeline in the upstream archive,
though we need to erase pg_wal in the backup. Either the broken
archive command or erasing pg_wal of the cascade is required for the
behavior to occur.

The attached is how it looks.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

# Copyright (c) 2021, PostgreSQL Global Development Group

# Minimal test testing streaming replication
use Cwd;
use strict;
use warnings;
use PostgresNode;
use TestLib;
use Test::More tests => 1;

# Initialize primary node
my $node_primary = get_new_node('primary');
# A specific role is created to perform some tests related to replication,
# and it needs proper authentication configuration.
$node_primary->init(allows_streaming => 1);
$node_primary->append_conf(
	'postgresql.conf', qq(
wal_keep_size=128MB
));
$node_primary->start;

my $backup_name = 'my_backup';

# Take backup
$node_primary->backup($backup_name);

my $node_standby_1 = get_new_node('standby_1');
$node_standby_1->init_from_backup($node_primary, $backup_name,
	allows_streaming => 1, has_streaming => 1);
my $archivedir_standby_1 = $node_standby_1->archive_dir;
$node_standby_1->append_conf(
	'postgresql.conf', qq(
archive_mode=always
archive_command='cp "%p" "$archivedir_standby_1/%f"'
));
$node_standby_1->start;

# Take backup of standby 1
# NB: Use -Xnone so that pg_wal is empty.
#$node_standby_1->backup($backup_name, backup_options => ['-Xnone']);
$node_standby_1->backup($backup_name);

# Promote the standby.
$node_standby_1->psql('postgres', 'SELECT pg_promote()');

# clean up pg_wal from the backup
my $pgwaldir = $node_standby_1->backup_dir. "/" . $backup_name . "/pg_wal";
opendir my $dh, $pgwaldir or
   die "failed to open $pgwaldir";
while (my $f = readdir($dh))
{
	unlink("$pgwaldir/$f") if (-f "$pgwaldir/$f");
}
closedir($dh);

# Create cascading standby but don't start it yet.
# NB: Must set up both streaming and archiving.
my $node_cascade = get_new_node('cascade');
$node_cascade->init_from_backup($node_standby_1, $backup_name,
	has_streaming => 1);
$node_cascade->append_conf(
	'postgresql.conf', qq(
restore_command = 'cp "$archivedir_standby_1/%f" "%p"'
log_line_prefix = '%m [%p:%b] %q%a '
archive_mode=off
));

# Start cascade node
$node_cascade->start;

# Create some content on standby 1 (now promoted) and check its presence
# in the cascade
$node_standby_1->safe_psql('postgres',
	"CREATE TABLE tab_int AS SELECT 1 AS a");

# Wait for standbys to catch up
$node_standby_1->wait_for_catchup($node_cascade, 'replay',
	$node_standby_1->lsn('replay'));

ok(1, 'test'); # it's success if we come here.
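
When checking by hand whether the first segment of the new timeline really is absent from the upstream archive, it may help to summarize what the archive holds per timeline. The following is only a rough sketch and not part of the attached test: first_segment_per_timeline and parse_wal_name are hypothetical helper names, and the file names in the usage lines are made-up sample data. It relies only on the standard 24-hex-digit WAL segment naming, in which the first 8 digits are the timeline ID.

```perl
use strict;
use warnings;

# Split a 24-hex-digit WAL segment file name into its timeline ID and
# the remaining 16-digit segment position. Returns an empty list for
# anything else (for example timeline history files such as
# 00000002.history, or .partial files).
sub parse_wal_name
{
	my ($name) = @_;
	return unless $name =~ /^([0-9A-F]{8})([0-9A-F]{8})([0-9A-F]{8})$/;
	return (hex($1), "$2$3");
}

# Hypothetical helper: map each timeline ID to the lowest segment
# position found among the given file names.
sub first_segment_per_timeline
{
	my (@names) = @_;
	my %first;
	foreach my $name (@names)
	{
		my ($tli, $pos) = parse_wal_name($name);
		next unless defined $tli;

		# Fixed-width uppercase hex strings order correctly under
		# plain string comparison.
		$first{$tli} = $pos
		  if !exists $first{$tli} || $pos lt $first{$tli};
	}
	return %first;
}

# Usage with made-up sample names; in practice the list would come from
# reading the archive directory (e.g. with opendir/readdir).
my %first = first_segment_per_timeline(
	'000000010000000000000003',
	'000000010000000000000002',
	'00000002.history',
	'000000020000000000000004');
foreach my $tli (sort { $a <=> $b } keys %first)
{
	printf "timeline %d: first archived segment position %s\n",
	  $tli, $first{$tli};
}
```

Comparing the reported position for the new timeline with the segment where the timeline switch happened would show whether that segment made it into the archive. The sketch only inspects file names and does not look at WAL contents.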