Re: Replication slot drop message is sent after pgstats shutdown. - Mailing list pgsql-hackers
From | Kyotaro Horiguchi |
---|---|
Subject | Re: Replication slot drop message is sent after pgstats shutdown. |
Date | |
Msg-id | 20211213.121138.689699294111260502.horikyota.ntt@gmail.com Whole thread Raw |
In response to | Re: Replication slot drop message is sent after pgstats shutdown. (Masahiko Sawada <sawada.mshk@gmail.com>) |
List | pgsql-hackers |
At Fri, 10 Dec 2021 18:13:31 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in > I agreed with Andres and Horiguchi-san and attached an updated patch. Thanks for the new version. It seems fine, but I have some comments from a cosmetic viewpoint. + /* + * Register callback to make sure cleanup and releasing the replication + * slot on exit. + */ + ReplicationSlotInitialize(); Actually all the function does is that but it looks slightly inconsistent with the function name. I think we can call it just "initialization" here. I think we don't need to change the function comment the same way but either will do for me. +ReplicationSlotBeforeShmemExit(int code, Datum arg) The name looks a bit too verbose. Doesn't just "ReplicationSlotShmemExit" work? - * so releasing here is fine. There's another cleanup in ProcKill() - * ensuring we'll correctly cleanup on FATAL errors as well. + * so releasing here is fine. There's another cleanup in + * ReplicationSlotBeforeShmemExit() callback ensuring we'll correctly + * cleanup on FATAL errors as well. I'd prefer "during before_shmem_exit()" than "in ReplicationSlotBeforeShmemExit() callback" here. (But the current wording is also fine by me.) The attached detects that bug, but I'm not sure it's worth expending test time, or this might be in the server test suit. regards. -- Kyotaro Horiguchi NTT Open Source Software Center diff --git a/src/bin/pg_basebackup/t/030_pg_recvlogical.pl b/src/bin/pg_basebackup/t/030_pg_recvlogical.pl index 90da1662e3..0fb4e67697 100644 --- a/src/bin/pg_basebackup/t/030_pg_recvlogical.pl +++ b/src/bin/pg_basebackup/t/030_pg_recvlogical.pl @@ -5,7 +5,8 @@ use strict; use warnings; use PostgreSQL::Test::Utils; use PostgreSQL::Test::Cluster; -use Test::More tests => 20; +use Test::More tests => 25; +use IPC::Run qw(pump finish timer); program_help_ok('pg_recvlogical'); program_version_ok('pg_recvlogical'); @@ -106,3 +107,44 @@ $node->command_ok( '--start', '--endpos', "$nextlsn", '--no-loop', '-f', '-' ], 'replayed a two-phase transaction'); + +## Check for a crash bug caused by replication-slot cleanup after +## pgstat shutdown. +#fire up an interactive psql session +my $in = ''; +my $out = ''; +my $timer = timer(5); +my $h = $node->interactive_psql('postgres', \$in, \$out, $timer); +like($out, qr/psql/, "print startup banner"); + +# open a transaction +$out = ""; +$in .= "BEGIN;\nCREATE TABLE a (a int);\n"; +pump $h until ($out =~ /CREATE TABLE/ || timer->is_expired); +ok(!timer->is_expired, 'background CREATE TABLE passed'); + +# this recvlogical waits for the transaction ends +ok(open(my $recvlogical, '-|', + 'pg_recvlogical', '--create-slot', '-S', 'test2', + '-d', $node->connstr('postgres')), + 'launch background pg_recvlogical'); + +$node->poll_query_until('postgres', + qq{SELECT count(*) > 0 FROM pg_stat_activity + WHERE backend_type='walsender' + AND query like 'CREATE_REPLICATION_SLOT %';}); +# stop server while it hangs. This shouldn't crash server. +$node->stop; +ok(open(my $cont, '-|', 'pg_controldata', $node->data_dir), + 'run pg_controldata'); +my $stop_result = ''; +while (<$cont>) +{ + if (/^Database cluster state: *([^ ].*)$/) + { + $stop_result = $1; + last; + } +} + +is($stop_result, 'shut down', 'server is properly shut down');
pgsql-hackers by date: