Re: [HACKERS] logical replication deranged sender - Mailing list pgsql-hackers

From Jeff Janes
Subject Re: [HACKERS] logical replication deranged sender
Msg-id CAMkU=1zrXf8=TxpO+oXhT47UYVVyqHn6G2iq2QDbFr68FsYtkA@mail.gmail.com
In response to Re: [HACKERS] logical replication deranged sender  (Petr Jelinek <petr.jelinek@2ndquadrant.com>)
Responses Re: [HACKERS] logical replication deranged sender  (Petr Jelinek <petr.jelinek@2ndquadrant.com>)
List pgsql-hackers
On Tue, May 9, 2017 at 9:18 AM, Petr Jelinek <petr.jelinek@2ndquadrant.com> wrote:
On 08/05/17 13:47, Petr Jelinek wrote:
> On 08/05/17 01:17, Jeff Janes wrote:
>> After dropping a subscription, it says it succeeded and that it dropped
>> the slot on the publisher.
>>
>> But the publisher still has the slot, and a full-tilt process described
>> by ps as
>>
>> postgres: wal sender process jjanes [local] idle in transaction
>>
>> Strace shows that this process is doing nothing but opening, reading,
>> lseek, and closing from pg_wal, and calling sbrk.  It never sends anything.
>>
>> This is not how it should work, correct?
>>
>
> No, and I don't see how this happens though, we only report success if
> the publisher side said that DROP_REPLICATION_SLOT succeeded. So far I
> don't see anything in source that would explain this. I will need to
> reproduce it first to see what's happening (wasn't able to do that yet,
> but it might just need more time since you say it does not always happen).
>

Hm, I wonder: are there any workers left on the subscriber when this happens?

Yes.  Using ps, I get this:

postgres: bgworker: logical replication worker for subscription 16408 sync 16391
postgres: bgworker: logical replication worker for subscription 16408 sync 16388

They seem to be permanently blocked on a socket to read from the publisher.

On the publisher side, I think it is very slowly assembling a snapshot.  It seems to be adding one xid at a time, and then re-sorting the entire list.  Over and over.
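
To make the suspected pattern concrete, here is a rough Python sketch of "add one xid, then re-sort the whole list" next to "collect everything, then sort once". It is only an illustration of why the first approach can look like a busy loop on a large xid set; it is not the actual PostgreSQL code, and the function names and sample xids are made up for the example.

```python
def build_resorting_each_time(xids):
    """Append one xid at a time and re-sort the entire list after each append.

    Hypothetical model of the behaviour described above; with n xids this
    performs n full sorts of a growing list.
    """
    running = []
    sorts = 0
    for xid in xids:
        running.append(xid)
        running.sort()  # whole list is sorted again on every addition
        sorts += 1
    return running, sorts


def build_sorting_once(xids):
    """Collect all xids first, then sort a single time."""
    running = list(xids)
    running.sort()
    return running, 1


# Made-up xids, purely for illustration.
xids = [1005, 1001, 1004, 1002, 1003]
slow, slow_sorts = build_resorting_each_time(xids)
fast, fast_sorts = build_sorting_once(xids)
```

Both functions end with the same sorted list, but the first does one full sort per xid, so the total work grows much faster than the number of xids.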

Cheers,

Jeff
