Re: Control flow in logical replication walsender - Mailing list pgsql-hackers

From Ashutosh Bapat
Subject Re: Control flow in logical replication walsender
Msg-id CAExHW5t-t6yM5OhPC0gmxLyiPd1ogfUVB_+-yeFHyLhNixyzqQ@mail.gmail.com
In response to Control flow in logical replication walsender  (Christophe Pettus <xof@thebuild.com>)
Responses Re: Control flow in logical replication walsender
List pgsql-hackers


On Tue, Apr 30, 2024 at 11:28 PM Christophe Pettus <xof@thebuild.com> wrote:

Hi,

I wanted to check my understanding of how control flows in a walsender doing logical replication.  My understanding is that the (single) thread in each walsender process, in the simplest case, loops on:

1. Pull a record out of the WAL.
2. Pass it to the reorder buffer code, which,
3. Sorts it out into the appropriate in-memory structure for that transaction (spilling to disk as required), and then continues with #1, or,
4. If it's a commit record, it iteratively passes the transaction data one change at a time to,
5. The logical decoding plugin, which returns the output format of that plugin, and then,
6. The walsender sends the output from the plugin to the client. It cycles on passing the data to the plugin and sending it to the client until it runs out of changes in that transaction, and then resumes reading the WAL in #1.


This is correct barring some details on master.
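
The loop described in steps 1-6 can be sketched as a toy model. This is only an illustration of the control flow, not PostgreSQL code: the function name `run_walsender`, the `(xid, kind, payload)` record tuples, and the `plugin` and `send` callbacks are all hypothetical, and spilling to disk and the details on master are left out.

```python
from collections import defaultdict

def run_walsender(wal, plugin, send):
    """Toy single-threaded decode loop (hypothetical names, not PostgreSQL APIs)."""
    reorder_buffer = defaultdict(list)      # xid -> changes collected so far
    for xid, kind, payload in wal:          # step 1: pull the next WAL record
        if kind == "change":
            # steps 2-3: file the change under its transaction, then read more WAL
            reorder_buffer[xid].append(payload)
        elif kind == "commit":
            # step 4: replay the committed transaction one change at a time
            for change in reorder_buffer.pop(xid, []):
                # steps 5-6: the plugin formats the change, the result goes to the client
                send(plugin(xid, change))
            # No new WAL is read until the inner loop above has finished.

# Two interleaved transactions; xid 2 commits first.
wal = [
    (1, "change", "INSERT a"),
    (2, "change", "INSERT b"),
    (2, "commit", None),
    (1, "change", "UPDATE a"),
    (1, "commit", None),
]
sent = []
run_walsender(wal, lambda xid, change: f"{xid}: {change}", sent.append)
```

In this model the output is grouped by transaction in commit order, and the outer loop is blocked while a committed transaction is being sent, which is the behaviour asked about in the next paragraph.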
 
In particular, I wanted to confirm that while it is pulling the reordered transaction and sending it to the plugin (and thence to the client), that particular walsender is *not* reading new WAL records or putting them in the reorder buffer.


This is correct.
 
The specific issue I'm trying to track down is an enormous pileup of spill files.  This is in a non-supported version of PostgreSQL (v11), so an upgrade may fix it, but at the moment, I'm trying to find a cause and a mitigation.


Is there a large transaction that repeatedly fails to replicate, for example because of timeouts or crashes on the upstream or downstream?

--
Best Wishes,
Ashutosh Bapat
