Re: [HACKERS] Apparent walsender bug triggered by logical replication - Mailing list pgsql-hackers

From Tom Lane
Subject Re: [HACKERS] Apparent walsender bug triggered by logical replication
Date
Msg-id 20499.1498790790@sss.pgh.pa.us
In response to Re: [HACKERS] Apparent walsender bug triggered by logical replication  (Petr Jelinek <petr.jelinek@2ndquadrant.com>)
Responses Re: [HACKERS] Apparent walsender bug triggered by logical replication  (Petr Jelinek <petr.jelinek@2ndquadrant.com>)
List pgsql-hackers
Petr Jelinek <petr.jelinek@2ndquadrant.com> writes:
> On 30/06/17 02:07, Tom Lane wrote:
>> I'm also kind of wondering why the "behind the apply" path out of
>> LogicalRepSyncTableStart exists at all; as far as I can tell we'd be much
>> better off if we just let the sync worker exit always as soon as it's done
>> the initial sync, letting any extra catchup happen later.  The main thing
>> the current behavior seems to be accomplishing is to monopolize one of the
>> scarce max_sync_workers_per_subscription slots for the benefit of a single
>> table, for longer than necessary.  Plus it adds additional complicated
>> interprocess signaling.

> Hmm, I don't understand what you mean here. The "letting any extra
> catchup happen later" would never happen if the sync is behind apply as
> apply has already skipped relevant transactions.

Once the sync worker has exited, we have to have some other way of dealing
with that.  I'm wondering why we can't let that other way take over
immediately.  The existing approach is inefficient, according to the
traces I've been poring over all day, and frankly I am very far from
convinced that it's bug-free either.
        regards, tom lane


