Re: Logical replication prefetch - Mailing list pgsql-hackers
From: Konstantin Knizhnik
Subject: Re: Logical replication prefetch
Msg-id: 6faa3037-609e-4cd2-a4f8-c97fc5cb390b@garret.ru
In response to: Re: Logical replication prefetch (Amit Kapila <amit.kapila16@gmail.com>)
List: pgsql-hackers
On 13/07/2025 9:28 am, Amit Kapila wrote:

> I didn't understand your scenario. pa_launch_parallel_worker() should
> spawn a new worker only if all the workers in the pool are busy, and
> then it will free the worker if the pool already has enough workers.
> So, do you mean to say that the workers in the pool are always busy in
> your workload which lead spawn/exit of new workers? Can you please
> explain your scenario in some more detail?

The current logical replication apply logic does not work well for small OLTP transactions. First of all, by default the reorder buffer at the publisher buffers them and so prevents parallel apply at the subscriber: the publisher switches to streaming mode only if a transaction is too large or `debug_logical_replication_streaming=immediate` is set. But even if we force the publisher to stream short transactions, the subscriber will try to launch a new parallel apply worker for each transaction (if all existing workers are busy). If there are 100 active backends at the publisher, the subscriber will try to launch 100 parallel apply workers. Most likely this fails because of the limit on the maximal number of workers, and in that case the leader serializes such transactions. So if there are 100 streamed transactions and 10 parallel apply workers, 10 transactions are started in parallel and 90 are serialized to disk. This is not efficient for short transactions; it would be better to wait for some time until one of the workers becomes vacant. But the worst thing happens when a parallel apply worker completes its transaction: if the number of parallel apply workers in the pool exceeds `max_parallel_apply_workers_per_subscription / 2`, that worker is terminated.
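To make the arithmetic above concrete, here is a toy model (plain Python, not PostgreSQL code) of the behavior described: all transactions arrive while the pool is saturated, so anything beyond the pool size is spilled to disk by the leader. The function name and the assumption that no worker frees up in time are mine, for illustration only.

```python
def apply_outcome(n_txns, max_workers):
    """Illustrative model of the current subscriber behavior for
    concurrently streamed short transactions: transactions that find
    no vacant parallel apply worker are serialized to disk.
    Assumes the worst case where no worker becomes vacant in time."""
    parallel = min(n_txns, max_workers)   # one busy worker per transaction
    serialized = n_txns - parallel        # the rest spill to disk
    return parallel, serialized

# The scenario from the text: 100 streamed transactions, pool of 10.
parallel, serialized = apply_outcome(100, 10)
print(parallel, serialized)  # 10 applied in parallel, 90 serialized
```

The point is that for short OLTP transactions the serialized fraction dominates, and each spilled transaction is written to disk and then read back, on top of the worker spawn/exit churn.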
So instead of having `max_parallel_apply_workers_per_subscription` workers applying transactions at maximal possible speed, with a leader that distributes transactions between them and stops receiving new data from the publisher when there is no vacant worker, we have a leader that serializes transactions to disk (and then necessarily reads them back from disk) while permanently starting and terminating parallel apply worker processes. This leads to awful performance.

Certainly, the originally intended use case was different: parallel apply is performed only for large transactions. The number of such transactions is not so big, so there should be enough parallel apply workers in the pool to process them. And if there are not enough workers, it is not a problem to spawn a new one and terminate it after the transaction completes, because the transaction is long and the overhead of spawning a process is small compared with redoing a large transaction. But if we want to replicate an OLTP workload efficiently, we definitely need some other approach. Prefetch is actually more compatible with the current implementation, because prefetch operations don't need to be grouped by transaction and can be executed by any prefetch worker.
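The last point, that prefetch operations need no per-transaction affinity, can be sketched with a shared work queue drained by a fixed pool of workers. This is a hedged illustration in plain Python threads, not the proposed patch: the worker function and queue layout are assumptions of mine.

```python
import queue
import threading

def prefetch_worker(q, results, lock):
    """Drain prefetch requests from a shared queue. In a real
    implementation each request would warm the buffer cache by
    reading the target page; here we only record it."""
    while True:
        op = q.get()
        if op is None:          # sentinel: shut down
            break
        with lock:
            results.append(op)
        q.task_done()

q = queue.Queue()
results, lock = [], threading.Lock()
workers = [threading.Thread(target=prefetch_worker, args=(q, results, lock))
           for _ in range(4)]   # fixed pool, no spawn/exit churn
for w in workers:
    w.start()

# Operations from many different transactions interleave freely:
# unlike apply, no ordering or grouping constraint applies.
for txn in range(10):
    for change in range(3):
        q.put((txn, change))
q.join()

for _ in workers:               # stop the pool
    q.put(None)
for w in workers:
    w.join()
print(len(results))  # all 30 operations handled by whichever worker was free
```

Because any idle worker can take any request, the pool size can stay fixed regardless of how many publisher backends produced the changes.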