Re: Logical replication timeout problem - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Logical replication timeout problem
Date
Msg-id CAA4eK1+4yXeUQ1E=5C8xHN5VpO=_+VKP-QnoKDLi0KWpEE8wSA@mail.gmail.com
In response to Re: Logical replication timeout problem  (Fabrice Chapuis <fabrice636861@gmail.com>)
Responses Re: Logical replication timeout problem
List pgsql-hackers
On Thu, Jan 13, 2022 at 3:43 PM Fabrice Chapuis <fabrice636861@gmail.com> wrote:
>
> first phase: postgres read WAL files and generate 1420 snap files.
> second phase: I guess, but on this point maybe you can clarify, postgres has to decode the snap files and remove them
> if no statement must be applied on a replicated table.
> It is from this point that the worker process exit after 1 minute timeout.
>

Okay, I think the problem could be that, because we are skipping all
the changes of the transaction, nothing is sent to the subscriber and
it eventually times out. Actually, we try to send keepalives at
transaction boundaries, such as when we call pgoutput_commit_txn,
which in turn calls OutputPluginWrite->WalSndWriteData. I think to
tackle the problem we need to send such keepalives via
WalSndUpdateProgress and invoke that in pgoutput_change when we skip
sending the change.
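
To illustrate the idea, here is a minimal standalone sketch, not
PostgreSQL code. All names in it (SimSender, sim_send_data,
sim_update_progress, sim_decode_txn) are hypothetical, and the choice
of half the timeout as the keepalive interval is an assumption made
for the example. It models a sender that decodes a transaction change
by change and checks whether the gap between two messages ever
reaches the subscriber's timeout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a walsender-like process; not PostgreSQL's API. */
typedef struct SimSender
{
    long last_send_ms;  /* simulated time of the last message of any kind */
    int  data_msgs;     /* number of data/commit messages sent */
    int  keepalives;    /* number of keepalive messages sent */
} SimSender;

/* Send a data message (stand-in for writing a change or a commit). */
void
sim_send_data(SimSender *s, long now_ms)
{
    s->last_send_ms = now_ms;
    s->data_msgs++;
}

/* Send a keepalive only if nothing was sent for at least interval_ms. */
void
sim_update_progress(SimSender *s, long now_ms, long interval_ms)
{
    if (now_ms - s->last_send_ms >= interval_ms)
    {
        s->last_send_ms = now_ms;
        s->keepalives++;
    }
}

/*
 * Decode n changes, each taking ms_per_change of simulated time.
 * published[i] tells whether change i belongs to a replicated table.
 * Returns true if the gap since the last message reached timeout_ms,
 * i.e. the simulated subscriber would have timed out.
 */
bool
sim_decode_txn(SimSender *s, const bool *published, size_t n,
               long ms_per_change, long timeout_ms, bool keepalive_on_skip)
{
    long now = s->last_send_ms;

    assert(timeout_ms > 0);

    for (size_t i = 0; i < n; i++)
    {
        now += ms_per_change;

        if (published[i])
            sim_send_data(s, now);
        else if (keepalive_on_skip)
            sim_update_progress(s, now, timeout_ms / 2); /* assumed interval */

        if (now - s->last_send_ms >= timeout_ms)
            return true;        /* subscriber gave up waiting */
    }

    sim_send_data(s, now);      /* message at the transaction boundary */
    return false;
}
```

With 100 skipped changes at one second each and a 60 second timeout,
the model times out when keepalive_on_skip is false and completes
when it is true, which matches the behaviour reported above.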

-- 
With Regards,
Amit Kapila.


