Re: [HACKERS] Get stuck when dropping a subscription duringsynchronizing table - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: [HACKERS] Get stuck when dropping a subscription duringsynchronizing table |
Date | |
Msg-id | CAD21AoB6MJfRxMJpLSEvqMyigU9BSP5aMBYG28QpbcW2C1X8FA@mail.gmail.com |
In response to | Re: [HACKERS] Get stuck when dropping a subscription duringsynchronizing table (Masahiko Sawada <sawada.mshk@gmail.com>) |
Responses | Re: [HACKERS] Get stuck when dropping a subscription duringsynchronizing table |
List | pgsql-hackers |
On Wed, May 10, 2017 at 2:46 AM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> On Mon, May 8, 2017 at 8:42 PM, Petr Jelinek
> <petr.jelinek@2ndquadrant.com> wrote:
>> On 08/05/17 11:27, Masahiko Sawada wrote:
>>> Hi,
>>>
>>> I encountered a situation where DROP SUBSCRIPTION got stuck when
>>> initial table sync is in progress. In my environment, I created
>>> several tables with some data on publisher. I created subscription on
>>> subscriber and drop subscription immediately after that. It doesn't
>>> always happen but I often encountered it on my environment.
>>>
>>> ps -x command shows the following.
>>>
>>> 96796 ? Ss 0:00 postgres: masahiko postgres [local] DROP
>>> SUBSCRIPTION
>>> 96801 ? Ts 0:00 postgres: bgworker: logical replication
>>> worker for subscription 40993 waiting
>>> 96805 ? Ss 0:07 postgres: bgworker: logical replication
>>> worker for subscription 40993 sync 16418
>>> 96806 ? Ss 0:01 postgres: wal sender process masahiko [local] idle
>>> 96807 ? Ss 0:00 postgres: bgworker: logical replication
>>> worker for subscription 40993 sync 16421
>>> 96808 ? Ss 0:00 postgres: wal sender process masahiko [local] idle
>>>
>>> The DROP SUBSCRIPTION process (pid 96796) is waiting for the apply
>>> worker process (pid 96801) to stop while holding a lock on
>>> pg_subscription_rel. On the other hand the apply worker is waiting for
>>> acquiring a tuple lock on pg_subscription_rel needed for heap_update.
>>> Also table sync workers (pid 96805 and 96807) are waiting for the
>>> apply worker process to change their status.
>>>
>>
>> Looks like we should kill apply before dropping dependencies.
>
> Sorry, after investigated I found out that DROP SUBSCRIPTION process
> is holding AccessExclusiveLock on pg_subscription (, not
> pg_subscription_rel) and apply worker is waiting for acquiring a lock
> on it.

Hmm, it seems there are two cases. In the first, the apply worker waits
to acquire AccessShareLock on pg_subscription, but DropSubscription has
already acquired AccessExclusiveLock on it and is waiting for the apply
worker to finish. In the second, the apply worker waits to acquire a
tuple lock on pg_subscription_rel, but DropSubscription (maybe while
dropping dependencies) has already acquired it.

> So I guess that the dropping dependencies are not relevant with
> this. It seems to me that the main cause is that DROP SUBSCRIPTION
> waits for apply worker to finish while keeping to hold
> AccessExclusiveLock on pg_subscription. Perhaps we need to contrive
> ways to reduce lock level somehow.
>
>>
>>> Also, even when DROP SUBSCRIPTION is done successfully, the table sync
>>> worker can be orphaned because I guess that the apply worker can exit
>>> before change status of table sync worker.
>>
>> Well the tablesync worker should stop itself if the subscription got
>> removed, but of course again the dependencies are an issue, so we should
>> probably kill those explicitly as well.
>
> Yeah, I think that we should ensure that the apply worker exits after
> killed all involved table sync workers.
>

Barring any objections, I'll add these two issues to the open items
list.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
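
To make the first case above easier to follow, here is a minimal sketch of the wait-for relationships as described in this message. It is an illustration only, not PostgreSQL code: the function find_cycle and the string labels are hypothetical names chosen for this example, and the model assumes each process waits for exactly one other process.

```python
# Hypothetical model of the wait-for relationships described in this thread.
# Not PostgreSQL code; process names are plain labels.

def find_cycle(waits_for):
    """Return the processes forming a wait-for cycle, or None if there is none.

    waits_for maps each waiting process to the single process it waits on.
    """
    for start in waits_for:
        path = [start]
        seen = {start}
        node = waits_for.get(start)
        while node is not None:
            if node == start:
                return path          # walked back to where we began
            if node in seen:
                break                # cycle exists but does not include start
            seen.add(node)
            path.append(node)
            node = waits_for.get(node)
    return None


# Case 1: DROP SUBSCRIPTION holds AccessExclusiveLock on pg_subscription and
# waits for the apply worker to finish; the apply worker waits to acquire
# AccessShareLock on pg_subscription, which conflicts with that lock.
# The table sync workers wait for the apply worker to change their status.
case1 = {
    "DROP SUBSCRIPTION": "apply worker",
    "apply worker": "DROP SUBSCRIPTION",
    "tablesync worker": "apply worker",
}

cycle = find_cycle(case1)
print(cycle)
```

The two processes returned form the cycle; the table sync worker is not part of it but is blocked behind it, which matches the stuck processes in the ps output quoted above.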