Re: [HACKERS] logical replication deranged sender - Mailing list pgsql-hackers

From Petr Jelinek
Subject Re: [HACKERS] logical replication deranged sender
Date
Msg-id bb4ea17b-82a4-6794-f2f1-02740c8e7bcb@2ndquadrant.com
In response to Re: [HACKERS] logical replication deranged sender  (Jeff Janes <jeff.janes@gmail.com>)
Responses Re: [HACKERS] logical replication deranged sender  (Michael Paquier <michael.paquier@gmail.com>)
List pgsql-hackers
On 09/05/17 19:13, Jeff Janes wrote:
> On Tue, May 9, 2017 at 9:18 AM, Petr Jelinek
> <petr.jelinek@2ndquadrant.com> wrote:
> 
>     On 08/05/17 13:47, Petr Jelinek wrote:
>     > On 08/05/17 01:17, Jeff Janes wrote:
>     >> After dropping a subscription, it says it succeeded and that it dropped
>     >> the slot on the publisher.
>     >>
>     >> But the publisher still has the slot, and a full-tilt process described
>     >> by ps as
>     >>
>     >> postgres: wal sender process jjanes [local] idle in transaction
>     >>
>     >> Strace shows that this process is doing nothing but opening, reading,
>     >> lseek, and closing from pg_wal, and calling sbrk.  It never sends anything.
>     >>
>     >> This is not how it should work, correct?
>     >>
>     >
>     > No, and I don't see how this happens though, we only report success if
>     > the publisher side said that DROP_REPLICATION_SLOT succeeded. So far I
>     > don't see anything in source that would explain this. I will need to
>     > reproduce it first to see what's happening (wasn't able to do that yet,
>     > but it might just need more time since you say it does not always happen).
>     >
> 
>     Hm, I wonder, are there any workers left on the subscriber when this happens?
> 
> 
> Yes.  Using ps, I get this:
> 
> postgres: bgworker: logical replication worker for subscription 16408 sync 16391
> postgres: bgworker: logical replication worker for subscription 16408 sync 16388
> 
> They seem to be permanently blocked on a socket to read from the publisher.
> 
> On the publisher side, I think it is very slowly assembling a snapshot. 
> It seems to be adding one xid at a time, and then re-sorting the entire
> list.  Over and over.
> 

Okay, then it's the same issue Masahiko Sawada reported in a nearby
thread, or at least it has the same cause.
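
For illustration only, the pattern Jeff describes (add one xid, re-sort
the entire list, repeat) can be sketched as below. This is a toy model,
not the actual snapshot-building code: the function names and the sample
xids are made up, and it only counts how many full sorts each approach
performs.

```python
def build_resorting(xids):
    """Append one xid at a time and re-sort the whole list after each append."""
    snapshot = []
    full_sorts = 0
    for xid in xids:
        snapshot.append(xid)
        snapshot.sort()          # full sort after every single insertion
        full_sorts += 1
    return snapshot, full_sorts


def build_sort_once(xids):
    """Collect all xids first, then sort a single time."""
    snapshot = list(xids)
    snapshot.sort()
    return snapshot, 1


if __name__ == "__main__":
    sample = [905, 901, 910, 903]   # hypothetical xids
    print(build_resorting(sample))
    print(build_sort_once(sample))
```

Both functions return the same sorted list; the first performs one full
sort per xid, the second performs one sort in total. With a large number
of xids, the first pattern would be consistent with a process that stays
busy for a long time without sending anything.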

-- 
  Petr Jelinek                  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


