On 2020-Nov-26, Fujii Masao wrote:
> Yes, so the problem here is that walsender goes into a busy loop
> in that case. This seems to happen only in the logical replication
> walsender. In the physical replication walsender, WaitLatchOrSocket()
> in WalSndLoop() seems to work as expected and prevents it from
> entering a busy loop even in that case.
>
>     /*
>      * If postmaster asked us to stop, don't wait anymore.
>      *
>      * It's important to do this check after the recomputation of
>      * RecentFlushPtr, so we can send all remaining data before shutting
>      * down.
>      */
>     if (got_STOPPING)
>         break;
>
> The above code in WalSndWaitForWal() seems to cause this issue. But I've
> not come up with an idea about how to fix it yet.

With DEBUG1 I observe that walsender is getting a lot of 'r' messages
(standby reply) with all zeroes:

2020-12-01 21:01:24.100 -03 [15307] DEBUG: write 0/0 flush 0/0 apply 0/0

However, while doing that I also observed that if I do send some
activity to the logical replication stream with the provided program,
the 'write' pointer is *still* reported as 0/0, while the 'flush'
pointer moves forward to what was sent. I'm not clear on what causes
the write pointer to move forward in logical replication.
Still, the previously proposed patch does resolve the problem in either
case.
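
For anyone who wants to inspect those 'r' messages outside the server,
here is a minimal standalone sketch of decoding one. It assumes the
layout given in the streaming replication protocol documentation for a
standby status update: a 'r' byte, then the write, flush and apply
positions and the client timestamp as 8-byte big-endian integers, then a
one-byte "reply requested" flag. The struct and function names
(StandbyReply, decode_standby_reply, format_standby_reply) are made up
for illustration and are not the ones walsender.c uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* 1 type byte + four 8-byte fields + 1 flag byte (assumed layout). */
#define STANDBY_REPLY_LEN 34

/* Hypothetical container for the decoded fields; not a PostgreSQL type. */
typedef struct StandbyReply
{
	uint64_t	write_lsn;
	uint64_t	flush_lsn;
	uint64_t	apply_lsn;
	int64_t		send_time;		/* microseconds since 2000-01-01 */
	int			reply_requested;
} StandbyReply;

/* Read an 8-byte big-endian (network order) integer. */
static uint64_t
read_be64(const unsigned char *p)
{
	uint64_t	v = 0;

	for (int i = 0; i < 8; i++)
		v = (v << 8) | p[i];
	return v;
}

/* Decode a raw message; returns 0 on success, -1 if it is not a valid 'r'. */
int
decode_standby_reply(const unsigned char *buf, size_t len, StandbyReply *out)
{
	assert(buf != NULL && out != NULL);

	if (len < STANDBY_REPLY_LEN || buf[0] != 'r')
		return -1;

	out->write_lsn = read_be64(buf + 1);
	out->flush_lsn = read_be64(buf + 9);
	out->apply_lsn = read_be64(buf + 17);
	out->send_time = (int64_t) read_be64(buf + 25);
	out->reply_requested = (buf[33] != 0);
	return 0;
}

/* Render the positions in the same hi/lo hex style as the debug line. */
int
format_standby_reply(const StandbyReply *r, char *dst, size_t dstlen)
{
	return snprintf(dst, dstlen, "write %X/%X flush %X/%X apply %X/%X",
					(unsigned int) (r->write_lsn >> 32),
					(unsigned int) r->write_lsn,
					(unsigned int) (r->flush_lsn >> 32),
					(unsigned int) r->flush_lsn,
					(unsigned int) (r->apply_lsn >> 32),
					(unsigned int) r->apply_lsn);
}
```

Feeding it a 34-byte buffer that is all zeroes apart from the leading
'r' produces "write 0/0 flush 0/0 apply 0/0", matching the log line
above. A buffer in which only the flush field is nonzero would
correspond to the second case I described.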