TEST SCENARIO Purpose: To observe that the patch can process cleanups caused by the tablesync kills during DropSubscription Note: The "!!>>" is extra logging added for testing, not a normal part of PG. Steps: 1. CREATE PUBLICATION for some table T1 2. CREATE SUBSCRIPTION for that publication 3. Pause the tablesync worker in CATCHUP state (so there would be a tablesync slot in need of cleanup) 4. Show the slots 5. DROP SUBSCRIPTION // this will signal SIGTERM the tablesync workers 6. Allow the tablesync work to proceed. // See if it gets into the interrupt cleanup function and drops the slots/origin as expected ================== ## ## No slots to start with ## [postgres@CentOS7-x64 ~]$ psql -d test_pub -c "select * from pg_replication_slots;" slot_name | plugin | slot_type | datoid | database | temporary | active | active_pid | xmin | catalog_xmin | restart_lsn | conf irmed_flush_lsn | wal_status | safe_wal_size -----------+--------+-----------+--------+----------+-----------+--------+------------+------+--------------+-------------+----- ----------------+------------+--------------- (0 rows) ## ## Normal PUBLICATION of a table ## [postgres@CentOS7-x64 ~]$ psql -d test_pub -c "CREATE PUBLICATION tap_pub FOR TABLE test_tab;" CREATE PUBLICATION ## ## Create subscription, and pause the tablesync process when gets to CATCHUP state ## [postgres@CentOS7-x64 ~]$ psql -d test_sub -p 54321 -c "CREATE SUBSCRIPTION tap_sub CONNECTION 'host=localhost dbname=test_pub application_name=tap_sub' PUBLICATION tap_pub;" 2021-01-18 18:15:55.951 AEDT [2068] LOG: logical decoding found consistent point at 0/1614640 2021-01-18 18:15:55.951 AEDT [2068] DETAIL: There are no running transactions. 2021-01-18 18:15:55.951 AEDT [2068] STATEMENT: CREATE_REPLICATION_SLOT "tap_sub" LOGICAL pgoutput NOEXPORT_SNAPSHOT NOTICE: created replication slot "tap_sub" on publisher CREATE SUBSCRIPTION 2021-01-18 18:15:55.975 AEDT [2069] LOG: logical replication apply worker for subscription "tap_sub" has started 2021-01-18 18:15:55.976 AEDT [2069] LOG: !!>> The apply worker process has PID = 2069 [postgres@CentOS7-x64 ~]$ 2021-01-18 18:15:55.984 AEDT [2074] LOG: starting logical decoding for slot "tap_sub" 2021-01-18 18:15:55.984 AEDT [2074] DETAIL: Streaming transactions committing after 0/1614678, reading WAL from 0/1614640. 2021-01-18 18:15:55.984 AEDT [2074] STATEMENT: START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '2', publication_names '"tap_pub"') 2021-01-18 18:15:55.984 AEDT [2069] LOG: !!>> apply worker: LogicalRepApplyLoop 2021-01-18 18:15:55.984 AEDT [2074] LOG: logical decoding found consistent point at 0/1614640 2021-01-18 18:15:55.984 AEDT [2074] DETAIL: There are no running transactions. 2021-01-18 18:15:55.984 AEDT [2074] STATEMENT: START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '2', publication_names '"tap_pub"') 2021-01-18 18:15:55.984 AEDT [2069] LOG: !!>> apply worker: called process_syncing_tables 2021-01-18 18:15:55.988 AEDT [2077] LOG: logical replication table synchronization worker for subscription "tap_sub", table "test_tab" has started 2021-01-18 18:15:55.988 AEDT [2077] LOG: !!>> The tablesync worker process has PID = 2077 2021-01-18 18:15:55.989 AEDT [2077] LOG: !!>> Sleeping 30 secs. For debugging, attach to process 2077 now! 2021-01-18 18:15:55.996 AEDT [2069] LOG: !!>> apply worker: LogicalRepApplyLoop 2021-01-18 18:15:55.996 AEDT [2069] LOG: !!>> apply worker: called process_syncing_tables 2021-01-18 18:15:57.006 AEDT [2069] LOG: !!>> apply worker: LogicalRepApplyLoop 2021-01-18 18:15:57.006 AEDT [2069] LOG: !!>> apply worker: called process_syncing_tables 2021-01-18 18:15:58.010 AEDT [2069] LOG: !!>> apply worker: LogicalRepApplyLoop 2021-01-18 18:15:58.010 AEDT [2069] LOG: !!>> apply worker: called process_syncing_tables 2021-01-18 18:15:59.013 AEDT [2069] LOG: !!>> apply worker: LogicalRepApplyLoop 2021-01-18 18:15:59.013 AEDT [2069] LOG: !!>> apply worker: called process_syncing_tables 2021-01-18 18:16:00.015 AEDT [2069] LOG: !!>> apply worker: LogicalRepApplyLoop 2021-01-18 18:16:00.015 AEDT [2069] LOG: !!>> apply worker: called process_syncing_tables 2021-01-18 18:16:01.016 AEDT [2069] LOG: !!>> apply worker: LogicalRepApplyLoop 2021-01-18 18:16:01.016 AEDT [2069] LOG: !!>> apply worker: called process_syncing_tables 2021-01-18 18:16:02.016 AEDT [2069] LOG: !!>> apply worker: LogicalRepApplyLoop 2021-01-18 18:16:02.017 AEDT [2069] LOG: !!>> apply worker: called process_syncing_tables 2021-01-18 18:16:03.019 AEDT [2069] LOG: !!>> apply worker: LogicalRepApplyLoop 2021-01-18 18:16:03.019 AEDT [2069] LOG: !!>> apply worker: called process_syncing_tables 2021-01-18 18:16:04.021 AEDT [2069] LOG: !!>> apply worker: LogicalRepApplyLoop 2021-01-18 18:16:04.021 AEDT [2069] LOG: !!>> apply worker: called process_syncing_tables 2021-01-18 18:16:05.022 AEDT [2069] LOG: !!>> apply worker: LogicalRepApplyLoop 2021-01-18 18:16:05.022 AEDT [2069] LOG: !!>> apply worker: called process_syncing_tables 2021-01-18 18:16:06.027 AEDT [2069] LOG: !!>> apply worker: LogicalRepApplyLoop 2021-01-18 18:16:06.028 AEDT [2069] LOG: !!>> apply worker: called process_syncing_tables 2021-01-18 18:16:07.029 AEDT [2069] LOG: !!>> apply worker: LogicalRepApplyLoop 2021-01-18 18:16:07.030 AEDT [2069] LOG: !!>> apply worker: called process_syncing_tables 2021-01-18 18:16:08.034 AEDT [2069] LOG: !!>> apply worker: LogicalRepApplyLoop 2021-01-18 18:16:08.034 AEDT [2069] LOG: !!>> apply worker: called process_syncing_tables 2021-01-18 18:16:09.035 AEDT [2069] LOG: !!>> apply worker: LogicalRepApplyLoop 2021-01-18 18:16:09.035 AEDT [2069] LOG: !!>> apply worker: called process_syncing_tables 2021-01-18 18:16:10.040 AEDT [2069] LOG: !!>> apply worker: LogicalRepApplyLoop 2021-01-18 18:16:10.040 AEDT [2069] LOG: !!>> apply worker: called process_syncing_tables 2021-01-18 18:16:11.041 AEDT [2069] LOG: !!>> apply worker: LogicalRepApplyLoop 2021-01-18 18:16:11.041 AEDT [2069] LOG: !!>> apply worker: called process_syncing_tables 2021-01-18 18:16:12.041 AEDT [2069] LOG: !!>> apply worker: LogicalRepApplyLoop 2021-01-18 18:16:12.042 AEDT [2069] LOG: !!>> apply worker: called process_syncing_tables 2021-01-18 18:16:13.042 AEDT [2069] LOG: !!>> apply worker: LogicalRepApplyLoop 2021-01-18 18:16:13.043 AEDT [2069] LOG: !!>> apply worker: called process_syncing_tables 2021-01-18 18:16:14.048 AEDT [2069] LOG: !!>> apply worker: LogicalRepApplyLoop 2021-01-18 18:16:14.048 AEDT [2069] LOG: !!>> apply worker: called process_syncing_tables 2021-01-18 18:16:15.050 AEDT [2069] LOG: !!>> apply worker: LogicalRepApplyLoop 2021-01-18 18:16:15.050 AEDT [2069] LOG: !!>> apply worker: called process_syncing_tables 2021-01-18 18:16:16.051 AEDT [2069] LOG: !!>> apply worker: LogicalRepApplyLoop 2021-01-18 18:16:16.051 AEDT [2069] LOG: !!>> apply worker: called process_syncing_tables 2021-01-18 18:16:17.054 AEDT [2069] LOG: !!>> apply worker: LogicalRepApplyLoop 2021-01-18 18:16:17.054 AEDT [2069] LOG: !!>> apply worker: called process_syncing_tables 2021-01-18 18:16:18.059 AEDT [2069] LOG: !!>> apply worker: LogicalRepApplyLoop 2021-01-18 18:16:18.059 AEDT [2069] LOG: !!>> apply worker: called process_syncing_tables 2021-01-18 18:16:19.065 AEDT [2069] LOG: !!>> apply worker: LogicalRepApplyLoop 2021-01-18 18:16:19.065 AEDT [2069] LOG: !!>> apply worker: called process_syncing_tables 2021-01-18 18:16:20.068 AEDT [2069] LOG: !!>> apply worker: LogicalRepApplyLoop 2021-01-18 18:16:20.068 AEDT [2069] LOG: !!>> apply worker: called process_syncing_tables 2021-01-18 18:16:21.069 AEDT [2069] LOG: !!>> apply worker: LogicalRepApplyLoop 2021-01-18 18:16:21.069 AEDT [2069] LOG: !!>> apply worker: called process_syncing_tables 2021-01-18 18:16:22.070 AEDT [2069] LOG: !!>> apply worker: LogicalRepApplyLoop 2021-01-18 18:16:22.070 AEDT [2069] LOG: !!>> apply worker: called process_syncing_tables 2021-01-18 18:16:23.071 AEDT [2069] LOG: !!>> apply worker: LogicalRepApplyLoop 2021-01-18 18:16:23.071 AEDT [2069] LOG: !!>> apply worker: called process_syncing_tables 2021-01-18 18:16:24.080 AEDT [2069] LOG: !!>> apply worker: LogicalRepApplyLoop 2021-01-18 18:16:24.080 AEDT [2069] LOG: !!>> apply worker: called process_syncing_tables 2021-01-18 18:16:25.081 AEDT [2069] LOG: !!>> apply worker: LogicalRepApplyLoop 2021-01-18 18:16:25.082 AEDT [2069] LOG: !!>> apply worker: called process_syncing_tables 2021-01-18 18:16:25.989 AEDT [2077] LOG: !!>> tablesync worker: About to call LogicalRepSyncTableStart to do initial syncing 2021-01-18 18:16:26.085 AEDT [2069] LOG: !!>> apply worker: LogicalRepApplyLoop 2021-01-18 18:16:26.085 AEDT [2069] LOG: !!>> apply worker: called process_syncing_tables 2021-01-18 18:16:26.085 AEDT [2069] LOG: !!>> apply worker: LogicalRepApplyLoop 2021-01-18 18:16:26.085 AEDT [2069] LOG: !!>> apply worker: called process_syncing_tables 2021-01-18 18:16:27.086 AEDT [2069] LOG: !!>> apply worker: LogicalRepApplyLoop 2021-01-18 18:16:27.087 AEDT [2069] LOG: !!>> apply worker: called process_syncing_tables 2021-01-18 18:16:28.089 AEDT [2069] LOG: !!>> apply worker: LogicalRepApplyLoop 2021-01-18 18:16:28.089 AEDT [2069] LOG: !!>> apply worker: called process_syncing_tables 2021-01-18 18:16:29.094 AEDT [2069] LOG: !!>> apply worker: LogicalRepApplyLoop 2021-01-18 18:16:29.094 AEDT [2069] LOG: !!>> apply worker: called process_syncing_tables 2021-01-18 18:16:30.101 AEDT [2069] LOG: !!>> apply worker: LogicalRepApplyLoop 2021-01-18 18:16:30.102 AEDT [2069] LOG: !!>> apply worker: called process_syncing_tables 2021-01-18 18:16:31.103 AEDT [2069] LOG: !!>> apply worker: LogicalRepApplyLoop 2021-01-18 18:16:31.103 AEDT [2069] LOG: !!>> apply worker: called process_syncing_tables 2021-01-18 18:16:32.104 AEDT [2069] LOG: !!>> apply worker: LogicalRepApplyLoop 2021-01-18 18:16:32.104 AEDT [2069] LOG: !!>> apply worker: called process_syncing_tables 2021-01-18 18:16:33.105 AEDT [2069] LOG: !!>> apply worker: LogicalRepApplyLoop 2021-01-18 18:16:33.105 AEDT [2069] LOG: !!>> apply worker: called process_syncing_tables 2021-01-18 18:16:34.113 AEDT [2069] LOG: !!>> apply worker: LogicalRepApplyLoop 2021-01-18 18:16:34.113 AEDT [2069] LOG: !!>> apply worker: called process_syncing_tables 2021-01-18 18:16:35.115 AEDT [2069] LOG: !!>> apply worker: LogicalRepApplyLoop 2021-01-18 18:16:35.115 AEDT [2069] LOG: !!>> apply worker: called process_syncing_tables 2021-01-18 18:16:36.130 AEDT [2069] LOG: !!>> apply worker: LogicalRepApplyLoop 2021-01-18 18:16:36.130 AEDT [2069] LOG: !!>> apply worker: called process_syncing_tables 2021-01-18 18:16:37.136 AEDT [2069] LOG: !!>> apply worker: LogicalRepApplyLoop 2021-01-18 18:16:37.136 AEDT [2069] LOG: !!>> apply worker: called process_syncing_tables 2021-01-18 18:16:38.138 AEDT [2069] LOG: !!>> apply worker: LogicalRepApplyLoop 2021-01-18 18:16:38.138 AEDT [2069] LOG: !!>> apply worker: called process_syncing_tables 2021-01-18 18:16:39.142 AEDT [2069] LOG: !!>> apply worker: LogicalRepApplyLoop 2021-01-18 18:16:39.142 AEDT [2069] LOG: !!>> apply worker: called process_syncing_tables 2021-01-18 18:16:40.144 AEDT [2069] LOG: !!>> apply worker: LogicalRepApplyLoop 2021-01-18 18:16:40.144 AEDT [2069] LOG: !!>> apply worker: called process_syncing_tables 2021-01-18 18:16:41.146 AEDT [2069] LOG: !!>> apply worker: LogicalRepApplyLoop 2021-01-18 18:16:41.146 AEDT [2069] LOG: !!>> apply worker: called process_syncing_tables 2021-01-18 18:16:42.147 AEDT [2069] LOG: !!>> apply worker: LogicalRepApplyLoop 2021-01-18 18:16:42.147 AEDT [2069] LOG: !!>> apply worker: called process_syncing_tables 2021-01-18 18:16:43.149 AEDT [2069] LOG: !!>> apply worker: LogicalRepApplyLoop 2021-01-18 18:16:43.149 AEDT [2069] LOG: !!>> apply worker: called process_syncing_tables 2021-01-18 18:16:44.154 AEDT [2069] LOG: !!>> apply worker: LogicalRepApplyLoop 2021-01-18 18:16:44.154 AEDT [2069] LOG: !!>> apply worker: called process_syncing_tables 2021-01-18 18:16:45.158 AEDT [2069] LOG: !!>> apply worker: LogicalRepApplyLoop 2021-01-18 18:16:45.158 AEDT [2069] LOG: !!>> apply worker: called process_syncing_tables 2021-01-18 18:16:46.160 AEDT [2069] LOG: !!>> apply worker: LogicalRepApplyLoop 2021-01-18 18:16:46.160 AEDT [2069] LOG: !!>> apply worker: called process_syncing_tables 2021-01-18 18:16:46.714 AEDT [2077] LOG: !!>> LogicalRepSyncTableStart: walrcv_create_slot for "pg_16394_sync_16385". 2021-01-18 18:16:46.760 AEDT [3556] LOG: logical decoding found consistent point at 0/1614678 2021-01-18 18:16:46.760 AEDT [3556] DETAIL: There are no running transactions. 2021-01-18 18:16:46.760 AEDT [3556] STATEMENT: CREATE_REPLICATION_SLOT "pg_16394_sync_16385" LOGICAL pgoutput USE_SNAPSHOT 2021-01-18 18:16:46.781 AEDT [2069] LOG: !!>> apply worker: LogicalRepApplyLoop 2021-01-18 18:16:46.781 AEDT [2069] LOG: !!>> apply worker: called process_syncing_tables 2021-01-18 18:16:46.789 AEDT [2077] LOG: !!>> LogicalRepSyncTableStart: 1 replorigin_create "pg_16394_16385". 2021-01-18 18:16:46.789 AEDT [2077] LOG: !!>> LogicalRepSyncTableStart: 1 replorigin_advance "pg_16394_16385". 2021-01-18 18:16:46.789 AEDT [2077] LOG: !!>> LogicalRepSyncTableStart: 1 replorigin_session_setup "pg_16394_16385". 2021-01-18 18:16:46.789 AEDT [2077] LOG: LogicalRepSyncTableStart: 'pg_16394_16385' origin_startpos lsn 0/16146B0 2021-01-18 18:16:46.789 AEDT [2077] LOG: !!>> tablesync worker: wait for CATCHUP state notification 2021-01-18 18:16:46.789 AEDT [2069] LOG: !!>> apply worker: LogicalRepApplyLoop 2021-01-18 18:16:46.789 AEDT [2069] LOG: !!>> apply worker: called process_syncing_tables ## ## Show the slots ## psql -d test_pub -c "select * from pg_replication_slots;" slot_name | plugin | slot_type | datoid | database | temporary | active | active_pid | xmin | catalog_xmin | restar t_lsn | confirmed_flush_lsn | wal_status | safe_wal_size ---------------------+----------+-----------+--------+----------+-----------+--------+------------+------+--------------+------- ------+---------------------+------------+--------------- tap_sub | pgoutput | logical | 16384 | test_pub | f | t | 2074 | | 512 | 0/1614 6B0 | 0/16146B0 | reserved | pg_16394_sync_16385 | pgoutput | logical | 16384 | test_pub | f | f | | | 512 | 0/1614 678 | 0/16146B0 | reserved | (2 rows) ## ## DROP the subscription (while tablesync still paused) ## [postgres@CentOS7-x64 ~]$ psql -d test_sub -p 54321 -c "DROP SUBSCRIPTION tap_sub;" 2021-01-18 18:38:05.069 AEDT [2069] LOG: !!>> ProcessInterrupts: Hello, I am a LogicalWorker 2021-01-18 18:38:05.069 AEDT [2069] FATAL: terminating logical replication worker due to administrator command 2021-01-18 18:38:05.070 AEDT [30060] LOG: background worker "logical replication worker" (PID 2069) exited with exit code 1 ## ## Continue... from the paused tablesync worker ## ## PID 2077 was the tablesync worker ## PID 2069 was the Apply worker ## 2021-01-18 18:38:40.549 AEDT [2077] LOG: !!>> tablesync worker: received CATCHUP state notification 2021-01-18 18:38:40.549 AEDT [2077] LOG: !!>> ProcessInterrupts: Hello, I am a LogicalWorker 2021-01-18 18:38:56.096 AEDT [2077] LOG: !!>> tablesync_cleanup_at_interrupt for relid = 16385 (gdb) p *MyLogicalRepWorker $3 = {launch_time = 664269355984880, in_use = true, generation = 1, proc = 0x7ff208071180, dbid = 16384, userid = 10, subid = 16394, relid = 16385, relstate = 99 'c', relstate_lsn = 23152304, relmutex = 0 '\000', last_lsn = 0, last_send_time = 664269355985870, last_recv_time = 664269355985870, reply_lsn = 0, reply_time = 664269355985870} (gdb) n 298 if (drop_slot_needed) (gdb) p drop_slot_needed $4 = true 2021-01-18 18:41:09.274 AEDT [2077] LOG: !!>> tablesync_cleanup_at_interrupt: dropping the tablesync slot "pg_16394_sync_16385". 2021-01-18 18:41:11.549 AEDT [2077] LOG: !!>> tablesync_cleanup_at_interrupt: dropped the tablesync slot "pg_16394_sync_16385". 2021-01-18 18:41:17.290 AEDT [2077] LOG: !!>> tablesync_cleanup_at_interrupt: dropping origin tracking for "pg_16394_16385" 2021-01-18 18:41:19.146 AEDT [2077] LOG: !!>> tablesync_cleanup_at_interrupt: dropped origin tracking for "pg_16394_16385" 2021-01-18 18:41:23.862 AEDT [2077] LOG: logical replication table synchronization worker for subscription "tap_sub", table "test_tab" has finished NOTICE: dropped replication slot "tap_sub" on publisher DROP SUBSCRIPTION [postgres@CentOS7-x64 ~]$ ## ## Show the slots & origin ## [postgres@CentOS7-x64 ~]$ psql -d test_pub -c "select * from pg_replication_slots;" slot_name | plugin | slot_type | datoid | database | temporary | active | active_pid | xmin | catalog_xmin | restart_lsn | conf irmed_flush_lsn | wal_status | safe_wal_size -----------+--------+-----------+--------+----------+-----------+--------+------------+------+--------------+-------------+----- ----------------+------------+--------------- (0 rows) [postgres@CentOS7-x64 ~]$ psql -d test_sub -p 54321 -c "select * from pg_replication_origin;" roident | roname ---------+-------- (0 rows) [END]