Re: [HACKERS] Get stuck when dropping a subscription during synchronizing table - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: [HACKERS] Get stuck when dropping a subscription during synchronizing table
Date
Msg-id CAD21AoChkuWzVbry35zn6vMyLo0ff6kTzEkPGOrrSH1qpr9QkQ@mail.gmail.com
In response to Re: [HACKERS] Get stuck when dropping a subscription during synchronizing table  (Petr Jelinek <petr.jelinek@2ndquadrant.com>)
Responses Re: [HACKERS] Get stuck when dropping a subscription during synchronizing table  (Masahiko Sawada <sawada.mshk@gmail.com>)
List pgsql-hackers
On Mon, May 8, 2017 at 8:42 PM, Petr Jelinek
<petr.jelinek@2ndquadrant.com> wrote:
> On 08/05/17 11:27, Masahiko Sawada wrote:
>> Hi,
>>
>> I encountered a situation where DROP SUBSCRIPTION got stuck while
>> the initial table sync was in progress. In my environment, I created
>> several tables with some data on the publisher. I then created a
>> subscription on the subscriber and dropped it immediately afterwards.
>> It doesn't always happen, but I often hit it in my environment.
>>
>> The ps -x command shows the following.
>>
>>  96796 ?        Ss     0:00 postgres: masahiko postgres [local] DROP SUBSCRIPTION
>>  96801 ?        Ts     0:00 postgres: bgworker: logical replication worker for subscription 40993    waiting
>>  96805 ?        Ss     0:07 postgres: bgworker: logical replication worker for subscription 40993 sync 16418
>>  96806 ?        Ss     0:01 postgres: wal sender process masahiko [local] idle
>>  96807 ?        Ss     0:00 postgres: bgworker: logical replication worker for subscription 40993 sync 16421
>>  96808 ?        Ss     0:00 postgres: wal sender process masahiko [local] idle
>>
>> The DROP SUBSCRIPTION process (pid 96796) is waiting for the apply
>> worker process (pid 96801) to stop while holding a lock on
>> pg_subscription_rel. On the other hand, the apply worker is waiting to
>> acquire a tuple lock on pg_subscription_rel, needed for heap_update.
>> Also, the table sync workers (pids 96805 and 96807) are waiting for the
>> apply worker process to change their status.
>>
>
> Looks like we should kill apply before dropping dependencies.

Sorry, after investigating I found that the DROP SUBSCRIPTION process
is holding an AccessExclusiveLock on pg_subscription (not
pg_subscription_rel), and the apply worker is waiting to acquire a lock
on it. So I guess that dropping the dependencies is not relevant to
this. It seems to me that the main cause is that DROP SUBSCRIPTION
waits for the apply worker to finish while still holding the
AccessExclusiveLock on pg_subscription. Perhaps we need to find a
way to reduce the lock level somehow.
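
To make the wait cycle above concrete, here is a toy model in Python.
It is not PostgreSQL code: the names (drop_subscription, catalog_lock,
apply_worker) are hypothetical, a plain mutex stands in for the lock on
pg_subscription, and the timeout exists only so the demo terminates,
whereas the wait in the reported case has no such bound. Releasing the
lock before waiting is just one way to show that the cycle disappears;
it is not a claim about how the problem should actually be fixed.

```python
import threading


def drop_subscription(release_lock_before_wait, timeout=0.5):
    """Return True if the simulated DROP finishes, False if it would hang."""
    catalog_lock = threading.Lock()     # stand-in for the lock on pg_subscription
    stop_requested = threading.Event()  # stand-in for the stop signal to the worker
    lock_held = threading.Event()       # lets the worker start after DROP has the lock

    def apply_worker():
        lock_held.wait()
        # The worker needs the catalog lock before it can react to the stop request.
        with catalog_lock:
            pass
        stop_requested.wait()

    worker = threading.Thread(target=apply_worker, daemon=True)
    worker.start()

    catalog_lock.acquire()
    lock_held.set()
    stop_requested.set()

    if release_lock_before_wait:
        catalog_lock.release()
        worker.join(timeout)
        return not worker.is_alive()

    # Reported behaviour: wait for the worker while still holding the lock.
    worker.join(timeout)
    finished = not worker.is_alive()
    catalog_lock.release()  # only so the demo thread can exit afterwards
    return finished


if __name__ == "__main__":
    print("holding lock while waiting:", drop_subscription(False))
    print("lock released before waiting:", drop_subscription(True))
```

In the first call the worker never gets the lock, so the join times out;
in the second call the worker proceeds and exits normally.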

>
>> Also, even when DROP SUBSCRIPTION completes successfully, a table sync
>> worker can be orphaned, because I guess the apply worker can exit
>> before changing the status of the table sync worker.
>
> Well the tablesync worker should stop itself if the subscription got
> removed, but of course again the dependencies are an issue, so we should
> probably kill those explicitly as well.

Yeah, I think we should ensure that the apply worker exits only after
it has killed all the table sync workers involved.
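
As a rough sketch of that ordering, again in Python with threads rather
than real background workers: the apply worker signals every sync worker
to stop, waits for each of them to exit, and only then exits itself. The
function and worker names here are made up for illustration and do not
correspond to anything in the PostgreSQL source.

```python
import threading


def run_apply_worker(num_sync_workers=2):
    """Return the order in which the simulated workers exited."""
    exit_order = []
    order_lock = threading.Lock()
    stop = threading.Event()

    def sync_worker(name):
        stop.wait()  # run until told to stop
        with order_lock:
            exit_order.append(name)

    workers = [
        threading.Thread(target=sync_worker, args=(f"sync-{i}",))
        for i in range(num_sync_workers)
    ]
    for w in workers:
        w.start()

    # Shutdown: stop every sync worker first and wait for each one to exit.
    stop.set()
    for w in workers:
        w.join()

    # Only now does the apply worker itself exit, so no sync worker is orphaned.
    with order_lock:
        exit_order.append("apply")
    return exit_order


if __name__ == "__main__":
    print(run_apply_worker(3))
```

The sync workers may exit in any order among themselves; the only
guarantee this sketch provides is that "apply" comes last.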

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


