Re: Race conditions in 019_replslot_limit.pl - Mailing list pgsql-hackers
From: Kyotaro Horiguchi
Subject: Re: Race conditions in 019_replslot_limit.pl
Date:
Msg-id: 20220216.150119.226485024172638507.horikyota.ntt@gmail.com
In response to: Re: Race conditions in 019_replslot_limit.pl (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
List: pgsql-hackers
At Wed, 16 Feb 2022 14:26:37 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in
> Agreed. Doing this at all slot creations seems fine. Done in the
> attached. The first slot is deliberately created unreserved, so I
> changed the code to re-create the slot as "reserved" before taking
> the backup.
>
> > Even though the node has log_disconnect = true, and other processes indeed log
> > their disconnection, there's no disconnect for the above session until the
> > server is shut down. Even though pg_basebackup clearly finished? Uh, huh?
>
> It seems to me so, too.
>
> > I guess it's conceivable that the backend was still working through process
> > shutdown? But it doesn't seem too likely, given that several other connections
> > manage to get through entire connect / disconnect cycles?
>
> Yes, but since postmaster seems thinking that process is gone.

s/ since//;

Whatever is happening at that time, I can make sure that the walsender
is gone before making a new replication connection, even though that
doesn't "fix" any of the observed issues.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl
index 4257bd4d35..059003d63a 100644
--- a/src/test/recovery/t/019_replslot_limit.pl
+++ b/src/test/recovery/t/019_replslot_limit.pl
@@ -35,6 +35,11 @@ my $result = $node_primary->safe_psql('postgres',
 );
 is($result, "t|t|t", 'check the state of non-reserved slot is "unknown"');
 
+# Re-create the replication slot as "reserved" before taking the backup.
+$node_primary->safe_psql('postgres', q[
+	SELECT pg_drop_replication_slot('rep1');
+	SELECT pg_create_physical_replication_slot('rep1', true);
+]);
 
 # Take backup
 my $backup_name = 'my_backup';
@@ -265,7 +270,7 @@ log_checkpoints = yes
 ));
 $node_primary2->start;
 $node_primary2->safe_psql('postgres',
-	"SELECT pg_create_physical_replication_slot('rep1')");
+	"SELECT pg_create_physical_replication_slot('rep1', true)");
 $backup_name = 'my_backup2';
 $node_primary2->backup($backup_name);
 
@@ -319,7 +324,7 @@ $node_primary3->append_conf(
 ));
 $node_primary3->start;
 $node_primary3->safe_psql('postgres',
-	"SELECT pg_create_physical_replication_slot('rep3')");
+	"SELECT pg_create_physical_replication_slot('rep3', true)");
 # Take backup
 $backup_name = 'my_backup';
 $node_primary3->backup($backup_name);
@@ -327,6 +332,14 @@ $node_primary3->backup($backup_name);
 my $node_standby3 = PostgreSQL::Test::Cluster->new('standby_3');
 $node_standby3->init_from_backup($node_primary3, $backup_name,
 	has_streaming => 1);
+
+# We will check for the walsender process just after this.  Make sure
+# no walsender is still hanging around.
+$node_primary3->poll_query_until('postgres',
+	"SELECT count(*) = 0 FROM pg_stat_activity WHERE backend_type = 'walsender'",
+	"t")
+  or die "timed out waiting for walsender to exit";
+
 $node_standby3->append_conf('postgresql.conf', "primary_slot_name = 'rep3'");
 $node_standby3->start;
 $node_primary3->wait_for_catchup($node_standby3);
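
For reference, the behavior the patch relies on is that passing
immediately_reserve = true to pg_create_physical_replication_slot()
makes the slot pin a restart_lsn (and hence retain WAL) from the moment
of creation, instead of waiting for the first walsender to attach.
Below is a minimal standalone sketch of that distinction; it is not
part of the attached patch, and the node name 'demo' and slot name
'demo_slot' are made up for illustration only.

use strict;
use warnings;
use PostgreSQL::Test::Cluster;
use PostgreSQL::Test::Utils;
use Test::More;

# Throwaway node, purely for illustration.
my $node = PostgreSQL::Test::Cluster->new('demo');
$node->init(allows_streaming => 1);
$node->start;

# Without immediately_reserve, the slot has no restart_lsn until a
# walsender attaches, so it does not retain any WAL yet.
$node->safe_psql('postgres',
	"SELECT pg_create_physical_replication_slot('demo_slot')");
my $unreserved = $node->safe_psql('postgres',
	"SELECT restart_lsn IS NULL FROM pg_replication_slots WHERE slot_name = 'demo_slot'");
is($unreserved, 't', 'unreserved slot has no restart_lsn yet');

# With immediately_reserve = true the slot pins a restart_lsn (and the
# corresponding WAL) from the moment it is created.
$node->safe_psql('postgres', q[
	SELECT pg_drop_replication_slot('demo_slot');
	SELECT pg_create_physical_replication_slot('demo_slot', true);
]);
my $reserved = $node->safe_psql('postgres',
	"SELECT restart_lsn IS NOT NULL FROM pg_replication_slots WHERE slot_name = 'demo_slot'");
is($reserved, 't', 'reserved slot has a restart_lsn immediately');

$node->stop;
done_testing();

Reserving at creation means the required WAL is retained before the
base backup is taken, rather than only once the standby's walsender
first connects, which appears to be the window the patch is trying to
close.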