Re: Replication slot drop message is sent after pgstats shutdown. - Mailing list pgsql-hackers

From Noah Misch
Subject Re: Replication slot drop message is sent after pgstats shutdown.
Msg-id 20220318072837.GC2739027@rfd.leadboat.com
In response to Re: Replication slot drop message is sent after pgstats shutdown.  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Responses Re: Replication slot drop message is sent after pgstats shutdown.  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Tue, Feb 15, 2022 at 08:58:56AM -0800, Andres Freund wrote:
> Pushed the test yesterday evening, after Tom checked if it is likely to be
> problematic. Seems to worked without problems so far.

 wrasse        │ 2022-02-15 09:29:06 │ HEAD   │
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=wrasse&dt=2022-02-15%2009%3A29%3A06
 flaviventris  │ 2022-02-24 15:17:30 │ HEAD   │
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=flaviventris&dt=2022-02-24%2015%3A17%3A30
 calliphoridae │ 2022-03-08 01:14:51 │ HEAD   │
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=calliphoridae&dt=2022-03-08%2001%3A14%3A51

The buildfarm failed to convey adequate logs for this particular test suite.
Here's regression.diffs from the wrasse case (saved via keep_error_builds):

===
diff -U3 /export/home/nm/farm/studio64v12_6/HEAD/pgsql/contrib/test_decoding/expected/slot_creation_error.out /export/home/nm/farm/studio64v12_6/HEAD/pgsql.build/contrib/test_decoding/output_iso/results/slot_creation_error.out
--- /export/home/nm/farm/studio64v12_6/HEAD/pgsql/contrib/test_decoding/expected/slot_creation_error.out    Tue Feb 15 06:58:14 2022
+++ /export/home/nm/farm/studio64v12_6/HEAD/pgsql.build/contrib/test_decoding/output_iso/results/slot_creation_error.out    Tue Feb 15 11:38:14 2022
@@ -29,16 +29,17 @@
 t                
 (1 row)
 
-step s2_init: <... completed>
-ERROR:  canceling statement due to user request
 step s1_view_slot: 
     SELECT slot_name, slot_type, active FROM pg_replication_slots WHERE slot_name = 'slot_creation_error'
 
-slot_name|slot_type|active
----------+---------+------
-(0 rows)
+slot_name          |slot_type|active
+-------------------+---------+------
+slot_creation_error|logical  |t     
+(1 row)
 
 step s1_c: COMMIT;
+step s2_init: <... completed>
+ERROR:  canceling statement due to user request
 
 starting permutation: s1_b s1_xid s2_init s1_c s1_view_slot s1_drop_slot
 step s1_b: BEGIN;
===

I can make it fail that way by injecting a 1s delay here:

--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -3339,6 +3339,7 @@ ProcessInterrupts(void)
          */
         if (!DoingCommandRead)
         {
+            pg_usleep(1 * 1000 * 1000);
             LockErrorCleanup();
             ereport(ERROR,
                     (errcode(ERRCODE_QUERY_CANCELED),

I plan to fix this as attached, similar to how commit c04c767 fixed the same
challenge in detach-partition-concurrently-[34].
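
For readers without the attachment, here is a rough model of the race, not PostgreSQL code. It treats time as abstract ticks and assumes the fix amounts to making the test wait for s2_init to finish before s1_view_slot runs; that is my reading of the diff above, not a description of the attached patch. The function name, parameters, and tick values are all hypothetical.

```c
#include <stdbool.h>

/*
 * Hypothetical model of the test timing (abstract ticks, not real time).
 *
 * s1 sends the cancel request at tick 0.  s2 reports the error, and the
 * slot goes away, cancel_latency ticks later.  Without waiting, s1 runs
 * s1_view_slot at tick 1.  With wait_for_cancel set, s1_view_slot is
 * pushed back until just after s2_init has completed.
 *
 * Returns true if s1_view_slot still sees the slot, i.e. the "(1 row)"
 * output from the failing run.
 */
static bool
view_sees_slot(int cancel_latency, bool wait_for_cancel)
{
    int s2_done = cancel_latency;   /* tick at which s2_init completes */
    int view_at = 1;                /* tick at which s1_view_slot runs */

    if (wait_for_cancel && view_at <= s2_done)
        view_at = s2_done + 1;

    /* The slot is still visible if the view ran no later than s2_done. */
    return view_at <= s2_done;
}
```

With a latency of 0 the view sees no slot, which matches the expected output. A large latency, like the injected pg_usleep(), makes the slot visible unless the view waits for s2_init.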

Attachment
