Re: Parallel Apply - Mailing list pgsql-hackers
From | Nisha Moond |
---|---|
Subject | Re: Parallel Apply |
Date | |
Msg-id | CABdArM7z8Pi9bYYSFEzz9Li6+ONSnspXaU0CxVhDmCUZoSagPw@mail.gmail.com |
In response to | Re: Parallel Apply (Amit Kapila <amit.kapila16@gmail.com>) |
List | pgsql-hackers |
Hi,

I ran tests to compare the performance of logical synchronous replication with parallel-apply against physical synchronous replication.

Highlights
===============
On pgHead (current behavior):
- With synchronous physical replication set to remote_apply, the Primary's TPS drops by ~60% (≈2.5x slower than asynchronous).
- With synchronous logical replication set to remote_apply, the Publisher's TPS drops drastically by ~94% (≈16x slower than asynchronous).

With the proposed Parallel-Apply patch (v1):
- Parallel apply significantly improves logical synchronous replication performance, by 5-6×.
- With 40 parallel workers on the subscriber, the Publisher achieves 30045.82 TPS, which is 5.5× faster than the no-patch case (5435.46 TPS).
- With the patch, the Publisher's performance is only ~3x slower than asynchronous, bringing it much closer to the physical replication case.

Machine details
===============
Intel(R) Xeon(R) CPU E7-4890 v2 @ 2.80GHz
- CPU(s): 88 cores
- 503 GiB RAM

Source code:
===============
- pgHead (e9a31c0cc60) and v1 patch

Test-01: Physical replication:
======================
- To measure the physical synchronous replication performance on pgHead.

Setup & Workload:
-----------------
Primary --> Standby
- Two nodes were created in a physical (primary-standby) replication setup.
- Default pgbench (read-write) was run on the Primary with scale=300, #clients=40, run duration=20 minutes.
- The TPS was measured with synchronous_commit set to "off" vs "remote_apply" on pgHead.

Results:
---------
synchronous_commit    Primary_TPS    regression
OFF                   90466.57743    -
remote_apply(run1)    35848.6558     -60%
remote_apply(run2)    35306.25479    -61%

- On pgHead, when synchronous_commit is set to "remote_apply" during physical replication, the Primary experiences a 60–61% reduction in TPS, which is ~2.5 times slower than the asynchronous case.

~~~

Test-02: Logical replication:
=====================
- To measure the logical synchronous replication performance on pgHead and with the parallel-apply patch.

Setup & Workload:
-----------------
Publisher --> Subscriber
- Two nodes were created in a logical (publisher-subscriber) replication setup.
- Default pgbench (read-write) was run on the Publisher with scale=300, #clients=40, run duration=20 minutes.
- The TPS was measured on pgHead and with the parallel-apply v1 patch.
- The number of parallel workers was varied as 2, 4, 8, 16, 32, 40.

case-01: pgHead
-------------------
Results:
synchronous_commit      Publisher_TPS    regression
pgHead(OFF)             89138.14626      --
pgHead(remote_apply)    5435.464525      -94%

- By default (pgHead), synchronous logical replication sees a 94% drop in TPS, which is:
  a) 16.4 times slower than the logical async case, and
  b) 6.6 times slower than the physical sync replication case.

case-02: patched
---------------------
- synchronous_commit = 'remote_apply'
- Measured the performance by varying #parallel workers as 2, 4, 8, 16, 32, 40.

Results:
#workers    Publisher_TPS    Improvement_with_patch    faster_than_no-patch
2           9679.077736      78%                       1.78x
4           14329.64073      164%                      2.64x
8           21832.04285      302%                      4.02x
16          27676.47085      409%                      5.09x
32          29718.40090      447%                      5.47x
40          30045.82365      453%                      5.53x

- The TPS on the publisher improves significantly as the number of parallel workers increases.
- At 40 workers, the TPS reaches 30045.82, which is about 5.5x higher than the no-patch case.
- With 40 parallel workers, logical sync replication is only about 1.2x slower than physical sync replication.

~~~

The scripts used for the tests are attached. We'll do tests with larger data sets later and share results.

--
Thanks,
Nisha
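
For anyone who wants to re-derive the percentages and ratios quoted above from the raw TPS figures, here is a minimal Python sketch. It is not one of the attached scripts; the function and variable names are made up for illustration, and it assumes "regression" means the relative TPS change against the synchronous_commit=off run and "faster_than_no-patch" means patched TPS divided by the pgHead remote_apply TPS.

```python
# Illustrative only: helper names are hypothetical, TPS values are copied
# from the results tables in this mail.

def regression_pct(baseline_tps, tps):
    """Relative TPS change versus a baseline, in percent (negative = slower)."""
    return (tps - baseline_tps) / baseline_tps * 100.0


def speedup(tps, reference_tps):
    """How many times faster `tps` is than `reference_tps`."""
    return tps / reference_tps


PHYSICAL_ASYNC = 90466.57743      # Test-01, synchronous_commit = off
PHYSICAL_SYNC_RUN2 = 35306.25479  # Test-01, remote_apply (run2)
LOGICAL_ASYNC = 89138.14626       # Test-02, pgHead, synchronous_commit = off
LOGICAL_SYNC_HEAD = 5435.464525   # Test-02, pgHead, remote_apply

# Test-02 case-02: #workers -> publisher TPS with the v1 patch
PATCHED_TPS = {
    2: 9679.077736,
    4: 14329.64073,
    8: 21832.04285,
    16: 27676.47085,
    32: 29718.40090,
    40: 30045.82365,
}

print(f"physical sync regression: {regression_pct(PHYSICAL_ASYNC, PHYSICAL_SYNC_RUN2):.0f}%")
print(f"logical sync regression:  {regression_pct(LOGICAL_ASYNC, LOGICAL_SYNC_HEAD):.0f}%")

for workers, tps in PATCHED_TPS.items():
    ratio = speedup(tps, LOGICAL_SYNC_HEAD)
    improvement = (ratio - 1.0) * 100.0
    print(f"{workers:>2} workers: {improvement:.0f}% improvement, {ratio:.2f}x faster than no-patch")

# Remaining gap between patched logical sync (40 workers) and physical sync
gap = speedup(PHYSICAL_SYNC_RUN2, PATCHED_TPS[40])
print(f"physical sync vs. patched logical sync (40 workers): {gap:.1f}x")
```

The physical-vs-logical gap here uses the run2 figure for physical sync; using run1 instead gives roughly the same ~1.2x.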