Re: BUG #18155: Logical Apply Worker Timeout During TableSync Causes Either Stuckness or Data Loss - Mailing list pgsql-bugs

From Michael Paquier
Subject Re: BUG #18155: Logical Apply Worker Timeout During TableSync Causes Either Stuckness or Data Loss
Date
Msg-id ZS8zRgKfB7AcxJWv@paquier.xyz
In response to Re: BUG #18155: Logical Apply Worker Timeout During TableSync Causes Either Stuckness or Data Loss  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: BUG #18155: Logical Apply Worker Timeout During TableSync Causes Either Stuckness or Data Loss  (Michael Paquier <michael@paquier.xyz>)
List pgsql-bugs
On Tue, Oct 17, 2023 at 09:50:52AM +0530, Amit Kapila wrote:
> On Tue, Oct 17, 2023 at 4:46 AM Callahan, Drew <callaan@amazon.com> wrote:
>> On the server side, we did not see evidence of WALSenders being launched. As a result, the gap kept increasing
>> further and further, since the workers would not transition to the catchup state after several hours due to this.
>
> One possibility is that the system has reached
> 'max_logical_replication_workers' limit due to which it is not
> allowing to launch the apply worker. If so, then consider increasing
> the value of 'max_logical_replication_workers'. You can query
> 'pg_stat_subscription' to know more information about workers. See the
> description of subscriber-side parameters [1].

Hmm.  So you basically mean that not being able to launch new workers
prevents the existing workers from moving on with their individual sync,
which would free slots for other tables once their sync is done.  Then,
this causes all of the existing workers to remain in a syncwait state,
further increasing the gap in WAL replay.  Am I getting that right?
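
One way to check that premise would be to compare the number of running
workers reported by pg_stat_subscription against
max_logical_replication_workers.  The following is only a rough sketch:
it assumes the rows have already been fetched as dictionaries with the
view's subname, pid and relid columns, and the helper name
summarize_workers is made up for illustration.  It also treats every
running row with a NULL relid as an apply worker, which ignores parallel
apply workers.

```python
from collections import Counter


def summarize_workers(rows, max_workers):
    """Summarize logical replication worker usage on a subscriber.

    rows: dicts shaped like pg_stat_subscription rows, with at least
          'subname', 'pid' and 'relid' (assumed to be fetched elsewhere).
    max_workers: the value of max_logical_replication_workers.
    """
    # A row with a NULL pid means the subscription has no running worker.
    running = [r for r in rows if r["pid"] is not None]

    # relid is set for table synchronization workers and NULL otherwise.
    kinds = Counter(
        "tablesync" if r["relid"] is not None else "apply" for r in running
    )

    return {
        "apply": kinds["apply"],
        "tablesync": kinds["tablesync"],
        "free_slots": max_workers - len(running),
    }


# Hypothetical sample: one apply worker plus three tablesync workers.
sample_rows = [
    {"subname": "sub1", "pid": 1001, "relid": None},
    {"subname": "sub1", "pid": 1002, "relid": 16401},
    {"subname": "sub1", "pid": 1003, "relid": 16402},
    {"subname": "sub1", "pid": 1004, "relid": 16403},
    {"subname": "sub2", "pid": None, "relid": None},
]
summary = summarize_workers(sample_rows, max_workers=4)
```

If free_slots comes out as zero while some subscriptions have no running
worker at all, that would be consistent with the limit being the
bottleneck, though it would not prove it.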
--
Michael
