On Tue, Oct 17, 2023 at 09:50:52AM +0530, Amit Kapila wrote:
> On Tue, Oct 17, 2023 at 4:46 AM Callahan, Drew <callaan@amazon.com> wrote:
>> On the server side, we did not see evidence of WALSenders being
>> launched. As a result, the gap kept increasing further and further
>> since the workers would not transition to the catchup state after
>> several hours due to this.
>
> One possibility is that the system has reached
> 'max_logical_replication_workers' limit due to which it is not
> allowing to launch the apply worker. If so, then consider increasing
> the value of 'max_logical_replication_workers'. You can query
> 'pg_stat_subscription' to know more information about workers. See the
> description of subscriber-side parameters [1].

Hmm. So you basically mean that not being able to launch new workers
prevents the existing workers from moving on with their individual
sync and freeing slots for other tables once their sync is done. Then,
this causes all of the existing workers to remain in a syncwait state,
further increasing the gap in WAL replay. Am I getting that right?
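
For what it's worth, something like the following on the subscriber
might help confirm or rule this out. This is only a rough sketch,
assuming the stock pg_stat_subscription view and pg_subscription_rel
catalog; the column selection is my own choice and is not taken from
the report:

```sql
-- Configured worker limits on the subscriber.
SHOW max_logical_replication_workers;
SHOW max_sync_workers_per_subscription;

-- Workers currently running. relid is NULL for the main apply worker
-- and set for a table synchronization worker.
SELECT subname, pid, relid::regclass AS sync_table,
       received_lsn, latest_end_lsn, last_msg_receipt_time
FROM pg_stat_subscription;

-- Per-table sync state: i = initialize, d = data is being copied,
-- f = finished table copy, s = synchronized, r = ready.
SELECT srsubid, srrelid::regclass AS table_name, srsubstate, srsublsn
FROM pg_subscription_rel
ORDER BY srsubstate;
```

If the number of rows with a non-NULL pid in the first query matches
max_logical_replication_workers while many tables are still not in
state 'r', that would be consistent with the worker limit being hit.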
--
Michael