Replication slot drop message is sent after pgstats shutdown. - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Replication slot drop message is sent after pgstats shutdown. |
Date | |
Msg-id | CAD21AoBgSTF8gp1SKojKRu9dqzN4p1Ob6Mh=QgVhGfLO1NtUYA@mail.gmail.com |
Responses |
Re: Replication slot drop message is sent after pgstats shutdown.
|
List | pgsql-hackers |
Hi all,

I found another path where we report stats after the stats collector has shut down. The reproducer and the backtrace I got are here:

1. psql -c "begin; create table a (a int); select pg_sleep(30); commit;" &
2. pg_recvlogical --create-slot -S slot -d postgres &
3. stop the server

TRAP: FailedAssertion("pgstat_is_initialized && !pgstat_is_shutdown", File: "pgstat.c", Line: 4752, PID: 62789)
0 postgres 0x000000010a8ed79a ExceptionalCondition + 234
1 postgres 0x000000010a5e03d2 pgstat_assert_is_up + 66
2 postgres 0x000000010a5e1dc4 pgstat_send + 20
3 postgres 0x000000010a5e1d5c pgstat_report_replslot_drop + 108
4 postgres 0x000000010a64c796 ReplicationSlotDropPtr + 838
5 postgres 0x000000010a64c0e9 ReplicationSlotDropAcquired + 89
6 postgres 0x000000010a64bf23 ReplicationSlotRelease + 99
7 postgres 0x000000010a6d60ab ProcKill + 219
8 postgres 0x000000010a6a350c shmem_exit + 444
9 postgres 0x000000010a6a326a proc_exit_prepare + 122
10 postgres 0x000000010a6a3163 proc_exit + 19
11 postgres 0x000000010a8ee665 errfinish + 1109
12 postgres 0x000000010a6e3535 ProcessInterrupts + 1445
13 postgres 0x000000010a65f654 WalSndWaitForWal + 164
14 postgres 0x000000010a65edb2 logical_read_xlog_page + 146
15 postgres 0x000000010a22c336 ReadPageInternal + 518
16 postgres 0x000000010a22b860 XLogReadRecord + 320
17 postgres 0x000000010a619c67 DecodingContextFindStartpoint + 231
18 postgres 0x000000010a65c105 CreateReplicationSlot + 1237
19 postgres 0x000000010a65b64c exec_replication_command + 1180
20 postgres 0x000000010a6e6d2b PostgresMain + 2459
21 postgres 0x000000010a5ef1a9 BackendRun + 89
22 postgres 0x000000010a5ee6fd BackendStartup + 557
23 postgres 0x000000010a5ed487 ServerLoop + 759
24 postgres 0x000000010a5eac22 PostmasterMain + 6610
25 postgres 0x000000010a4c32d3 main + 819
26 libdyld.dylib 0x00007fff73477cc9 start + 1

At step #2, after creating the replication slot, the wal sender waits for the transaction started at step #1 to complete.
When the server is stopping, the wal sender process drops the slot on releasing it, since the slot is still RS_EPHEMERAL. Then, after dropping the slot, we report the slot-drop message (see ReplicationSlotDropPtr()). These steps are executed in ReplicationSlotRelease(), called by ProcKill(), which runs as an on_shmem_exit callback, that is, after pgstats has already been shut down in a before_shmem_exit callback.

I’ve not tested it yet, but I think this can potentially also happen when dropping a temporary slot, since ProcKill() also calls ReplicationSlotCleanup() to clean up temporary slots.

There are some ideas to fix this issue, but I don’t think it’s a good idea to move either ProcKill() or the slot releasing code to before_shmem_exit in this case, like we did for other similar issues[1][2]. Reporting the slot-drop message at the time the slot is dropped isn’t strictly essential, since autovacuums periodically check for already-dropped slots and report to drop their stats. So another idea would be to move pgstat_report_replslot_drop() to a higher layer such as ReplicationSlotDrop() and ReplicationSlotsDropDBSlots(), which are not called during callbacks. The replication slot stats would then be dropped when the slot is dropped via commands such as pg_drop_replication_slot() and DROP_REPLICATION_SLOT. On the other hand, for temporary slots and ephemeral slots, we would rely on autovacuums to drop their stats. Even if we delay dropping the stats for those slots, pg_stat_replication_slots doesn’t show the stats for already-dropped slots.

Any other ideas?

Regards,

[1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=675c945394b36c2db0e8c8c9f6209c131ce3f0a8
[2] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=dcac5e7ac157964f71f15d81c7429130c69c3f9b

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
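As a rough illustration of the ordering problem described in the message, the exit sequence can be modeled with a small C sketch. This is a toy model, not PostgreSQL source: every identifier prefixed with toy_ is hypothetical, the two exit phases are reduced to two direct function calls, and the assertion failure is replaced by a counter so that the outcome can be inspected. It also models the proposed alternative of reporting the drop only from a higher-level command path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Toy model of the exit sequence discussed above. All names are
 * hypothetical; nothing here is taken from the PostgreSQL sources.
 */

static bool stats_up;          /* models "pgstats initialized and not shut down" */
static bool slot_is_ephemeral; /* models a slot still in the RS_EPHEMERAL state */
static int  late_reports;      /* stats messages attempted after shutdown */

/* Models sending a stats message; counts a violation instead of asserting. */
static void
toy_stats_send(const char *msg)
{
    if (!stats_up)
    {
        late_reports++;
        printf("late report after stats shutdown: %s\n", msg);
        return;
    }
    printf("stats report: %s\n", msg);
}

/* First exit phase: models the pgstats shutdown (before_shmem_exit). */
static void
toy_stats_shutdown(void)
{
    stats_up = false;
}

/* Low-level slot drop; reports the drop only if report_here is true. */
static void
toy_slot_drop_low_level(bool report_here)
{
    slot_is_ephemeral = false;
    if (report_here)
        toy_stats_send("slot dropped");
}

/* Second exit phase: models ProcKill() releasing a still-ephemeral slot. */
static void
toy_proc_kill(bool report_in_low_level)
{
    if (slot_is_ephemeral)
        toy_slot_drop_low_level(report_in_low_level);
}

/*
 * Runs both exit phases in order and returns the number of reports
 * attempted after the stats shutdown.
 */
static int
toy_proc_exit(bool report_in_low_level)
{
    stats_up = true;
    slot_is_ephemeral = true;
    late_reports = 0;

    toy_stats_shutdown();               /* before_shmem_exit phase */
    toy_proc_kill(report_in_low_level); /* on_shmem_exit phase */

    return late_reports;
}

/*
 * Models an explicit drop command that reports from the higher layer
 * while stats are still up, leaving the low-level drop silent.
 */
static int
toy_drop_command(void)
{
    stats_up = true;
    slot_is_ephemeral = true;
    late_reports = 0;

    toy_slot_drop_low_level(false);
    toy_stats_send("slot dropped");

    return late_reports;
}
```

Called from a main() function, toy_proc_exit(true) models the current behavior and counts one late report, while toy_proc_exit(false) and toy_drop_command() count none. In the real system the stats of slots dropped on the exit path would still need to be cleaned up later, which the message suggests leaving to autovacuum.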