Re: Replication slot drop message is sent after pgstats shutdown. - Mailing list pgsql-hackers

From Kyotaro Horiguchi
Subject Re: Replication slot drop message is sent after pgstats shutdown.
Date
Msg-id 20211213.121138.689699294111260502.horikyota.ntt@gmail.com
Whole thread Raw
In response to Re: Replication slot drop message is sent after pgstats shutdown.  (Masahiko Sawada <sawada.mshk@gmail.com>)
List pgsql-hackers
At Fri, 10 Dec 2021 18:13:31 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in 
> I agreed with Andres and Horiguchi-san and attached an updated patch.

Thanks for the new version.

It seems fine, but I have some comments from a cosmetic viewpoint.

+    /*
+     * Register callback to make sure cleanup and releasing the replication
+     * slot on exit.
+     */
+    ReplicationSlotInitialize();

Actually all the function does is that but it looks slightly
inconsistent with the function name. I think we can call it just
"initialization" here. I think we don't need to change the function
comment the same way but either will do for me.

+ReplicationSlotBeforeShmemExit(int code, Datum arg)

The name looks a bit too verbose. Doesn't just "ReplicationSlotShmemExit" work?

-         * so releasing here is fine. There's another cleanup in ProcKill()
-         * ensuring we'll correctly cleanup on FATAL errors as well.
+         * so releasing here is fine. There's another cleanup in
+         * ReplicationSlotBeforeShmemExit() callback ensuring we'll correctly
+         * cleanup on FATAL errors as well.

I'd prefer "during before_shmem_exit()" than "in
ReplicationSlotBeforeShmemExit() callback" here. (But the current
wording is also fine by me.)


The attached detects that bug, but I'm not sure it's worth expending
test time, or this might be in the server test suit.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/bin/pg_basebackup/t/030_pg_recvlogical.pl b/src/bin/pg_basebackup/t/030_pg_recvlogical.pl
index 90da1662e3..0fb4e67697 100644
--- a/src/bin/pg_basebackup/t/030_pg_recvlogical.pl
+++ b/src/bin/pg_basebackup/t/030_pg_recvlogical.pl
@@ -5,7 +5,8 @@ use strict;
 use warnings;
 use PostgreSQL::Test::Utils;
 use PostgreSQL::Test::Cluster;
-use Test::More tests => 20;
+use Test::More tests => 25;
+use IPC::Run qw(pump finish timer);
 
 program_help_ok('pg_recvlogical');
 program_version_ok('pg_recvlogical');
@@ -106,3 +107,44 @@ $node->command_ok(
         '--start', '--endpos', "$nextlsn", '--no-loop', '-f', '-'
     ],
     'replayed a two-phase transaction');
+
+## Check for a crash bug caused by replication-slot cleanup after
+## pgstat shutdown.
+#fire up an interactive psql session
+my $in  = '';
+my $out = '';
+my $timer = timer(5);
+my $h = $node->interactive_psql('postgres', \$in, \$out, $timer);
+like($out, qr/psql/, "print startup banner");
+
+# open a transaction
+$out = "";
+$in .= "BEGIN;\nCREATE TABLE a (a int);\n";
+pump $h until ($out =~ /CREATE TABLE/ || timer->is_expired);
+ok(!timer->is_expired, 'background CREATE TABLE passed');
+
+# this recvlogical waits for the transaction ends
+ok(open(my $recvlogical, '-|',
+        'pg_recvlogical', '--create-slot', '-S', 'test2',
+        '-d', $node->connstr('postgres')),
+   'launch background pg_recvlogical');
+
+$node->poll_query_until('postgres',
+            qq{SELECT count(*) > 0 FROM pg_stat_activity 
+                        WHERE backend_type='walsender'
+                        AND query like 'CREATE_REPLICATION_SLOT %';});
+# stop server while it hangs. This shouldn't crash server.
+$node->stop;
+ok(open(my $cont, '-|', 'pg_controldata', $node->data_dir),
+   'run pg_controldata');
+my $stop_result = '';
+while (<$cont>)
+{
+    if (/^Database cluster state: *([^ ].*)$/)
+    {
+        $stop_result = $1;
+        last;
+    }
+}
+
+is($stop_result, 'shut down', 'server is properly shut down');

pgsql-hackers by date:

Previous
From: Ashutosh Sharma
Date:
Subject: Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints
Next
From: Greg Nancarrow
Date:
Subject: Re: Skipping logical replication transactions on subscriber side