[HACKERS] Get stuck when dropping a subscription during synchronizing table - Mailing list pgsql-hackers

From Masahiko Sawada
Subject [HACKERS] Get stuck when dropping a subscription during synchronizing table
Date
Msg-id CAD21AoBYpyqTSw+=ES+xXtRGMPKh=pKiqjNxZKnNUae0pSt9bg@mail.gmail.com
Responses Re: [HACKERS] Get stuck when dropping a subscription during synchronizing table  (Erik Rijkers <er@xs4all.nl>)
Re: [HACKERS] Get stuck when dropping a subscription during synchronizing table  (Petr Jelinek <petr.jelinek@2ndquadrant.com>)
Re: [HACKERS] Get stuck when dropping a subscription during synchronizing table  (Noah Misch <noah@leadboat.com>)
List pgsql-hackers
Hi,

I encountered a situation where DROP SUBSCRIPTION got stuck while the
initial table sync was in progress. In my environment, I created
several tables with some data on the publisher, then created a
subscription on the subscriber and dropped it immediately afterwards.
It doesn't always happen, but I hit it often in my environment.

ps -x command shows the following.
96796 ?        Ss     0:00 postgres: masahiko postgres [local] DROP SUBSCRIPTION
96801 ?        Ts     0:00 postgres: bgworker: logical replication worker for subscription 40993    waiting
96805 ?        Ss     0:07 postgres: bgworker: logical replication worker for subscription 40993 sync 16418
96806 ?        Ss     0:01 postgres: wal sender process masahiko [local] idle
96807 ?        Ss     0:00 postgres: bgworker: logical replication worker for subscription 40993 sync 16421
96808 ?        Ss     0:00 postgres: wal sender process masahiko [local] idle

The DROP SUBSCRIPTION process (pid 96796) is waiting for the apply
worker process (pid 96801) to stop while holding a lock on
pg_subscription_rel. The apply worker, in turn, is waiting to acquire
the tuple lock on pg_subscription_rel that it needs for heap_update.
The table sync workers (pids 96805 and 96807) are also waiting for the
apply worker process to change their status.
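
As a rough illustration only (this is not PostgreSQL code), the waits
described above can be written down as a small wait-for graph and
checked for a cycle. The edge from the apply worker to DROP
SUBSCRIPTION assumes that the tuple lock it wants is blocked by the
lock DROP SUBSCRIPTION holds; WAITS_FOR and find_cycle are
hypothetical names made up for this sketch.

```python
# Wait-for edges taken from the description above: waiter -> process waited on.
WAITS_FOR = {
    96796: [96801],  # DROP SUBSCRIPTION waits for the apply worker to stop
    96801: [96796],  # apply worker waits for a lock on pg_subscription_rel
                     # (assumed to be blocked by DROP SUBSCRIPTION's lock)
    96805: [96801],  # table sync worker waits for the apply worker
    96807: [96801],  # table sync worker waits for the apply worker
}


def find_cycle(graph):
    """Return one cycle as a list of nodes, or None if there is none."""
    visiting, done = set(), set()

    def dfs(node, path):
        visiting.add(node)
        path.append(node)
        for nxt in graph.get(node, []):
            if nxt in visiting:
                # Back edge found: cut the path down to the cycle itself.
                return path[path.index(nxt):] + [nxt]
            if nxt not in done:
                found = dfs(nxt, path)
                if found:
                    return found
        visiting.discard(node)
        done.add(node)
        path.pop()
        return None

    for start in graph:
        if start not in done:
            found = dfs(start, [])
            if found:
                return found
    return None


print(find_cycle(WAITS_FOR))
```

Under that assumption the cycle is between pids 96796 and 96801, and
the two table sync workers hang off it without being part of it.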

Also, even when DROP SUBSCRIPTION completes successfully, a table sync
worker can be orphaned, because I guess the apply worker can exit
before changing the status of the table sync worker.

I'm using commit 1f30295.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


