From: Nisha Moond
Subject: Re: Conflict detection for update_deleted in logical replication
Msg-id: CABdArM60vgergEwe-dT1CMwmi9Yr-8=nnU1XQ9btkFveLWm0sw@mail.gmail.com
In response to: Re: Conflict detection for update_deleted in logical replication (Amit Kapila <amit.kapila16@gmail.com>)
List: pgsql-hackers
Hi,

As per the test results in [1], the TPS drop observed on the subscriber
with update_deleted enabled was mainly because a single apply worker was
handling the replication workload from multiple concurrent publisher
clients. The following performance benchmarks were conducted to evaluate
the improvements from parallel apply when update_deleted
(retain_dead_tuples) is enabled, under heavy workloads, without leveraging
row filters or multiple subscriptions to distribute the load.

Note: The earlier tests from [1] are repeated with a few workload
modifications to see the improvements using parallel apply.

Highlights
===============
- No regression was observed when running pgbench individually on either
  the Pub or Sub node.
- When pgbench was run on both Pub and Sub, performance improved
  significantly with the parallel apply patch. With just 4 workers, Sub
  was able to catch up with Pub without regression.
- With max_conflict_retention_duration=60s, retention on Sub was not
  stopped when using 4 or more parallel workers.

Machine details
===============
Intel(R) Xeon(R) CPU E7-4890 v2 @ 2.80GHz
CPU(s): 88 cores
RAM: 503 GiB

Source code
===============
- pgHead (e9a31c0cc60) and parallel-apply v1 patch [2]
- Additionally used v64-0001 of update_deleted for the
  max_conflict_retention_duration related tests.

Test-01: pgbench on publisher
============================
Setup:
---------
Pub --> Sub
- Two nodes created in a pub-sub logical replication setup.
- Both nodes have the same set of pgbench tables, created with scale=60.
- The Sub node is subscribed to all the changes from the Pub's pgbench
  tables, and the subscription has retain_dead_tuples = on.
  (A sketch of this setup appears after the Test-02 results below.)

Workload:
--------------
- Run default pgbench (read-write) only on Pub with #clients=40 and run
  duration=10 minutes.

Results:
------------
#run    pgHead_TPS     pgHead+v1_patch_TPS
1       41135.71241    39922.7163
2       40466.23865    39980.29034
3       40578.16374    39867.44929
median  40578.16374    39922.7163

- No regression.

~~~

Test-02: pgbench on subscriber
========================
Setup: same as Test-01

Workload:
--------------
- Run default pgbench (read-write) only on the Sub node with #clients=40
  and run duration=10 minutes.

Results:
-----------
#run    pgHead_TPS     pgHead+v1_patch_TPS
1       42173.90504    42071.18058
2       41836.10027    41839.84837
3       41921.81233    41494.9918
median  41921.81233    41839.84837

- No regression.

~~~
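For reference, a minimal sketch of the Test-01 setup (not taken verbatim
from the attached scripts): it assumes a publisher on port 5432 and a
subscriber on port 5433, database "postgres" on both, and illustrative
publication/subscription names "pub"/"sub"; the table list is just
pgbench's default tables.

  # Create identical pgbench tables on both nodes (scale=60).
  pgbench -i -s 60 -p 5432 postgres
  pgbench -i -s 60 -p 5433 postgres

  # Publish the pgbench tables from the Pub node.
  psql -p 5432 -d postgres -c "CREATE PUBLICATION pub FOR TABLE
      pgbench_accounts, pgbench_branches, pgbench_history, pgbench_tellers"

  # Subscribe with dead-tuple retention enabled; copy_data = off since the
  # tables were pre-populated on both sides.
  psql -p 5433 -d postgres -c "CREATE SUBSCRIPTION sub
      CONNECTION 'port=5432 dbname=postgres'
      PUBLICATION pub WITH (retain_dead_tuples = on, copy_data = off)"

  # Test-01 workload: default read-write pgbench against the Pub only.
  pgbench -c 40 -j 40 -T 600 -p 5432 postgres

~~~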
Test-03: pgbench on both sides
========================
Setup:
------
Pub --> Sub
- Two nodes created in a pub-sub logical replication setup.
- Both nodes have different sets of pgbench tables, created with scale=60.
- The Sub node also has Pub's pgbench tables and is subscribed to all the
  changes.

Workload:
--------------
- Run default pgbench (read-write) on both Pub and Sub nodes against their
  respective pgbench tables.
- Both pgbench runs are with #clients=15 and duration=10 minutes.

Observations:
--------------
- On pgHead with retain_dead_tuples=ON, the Sub's TPS reduced by ~76%.
- With the parallel apply patch, performance improves significantly as the
  number of parallel workers increases, since conflict_slot.xmin advances
  more quickly.
- With just 4 parallel workers, subscription TPS matches the baseline (no
  regression).
- Performance remains consistent at 8 and 16 workers.

Detailed results:
------------------
case-1: the base case, pgHead (e9a31c0cc60) with retain_dead_tuples=OFF

#run    Pub_TPS        Sub_TPS
1       17140.08647    16994.63269
2       17421.28513    17445.62828
3       17167.57801    17070.86979
median  17167.57801    17070.86979

case-2: pgHead (e9a31c0cc60) with retain_dead_tuples=ON

#run    Pub_TPS        Sub_TPS
1       18667.29343    4135.884924
2       18200.90297    4178.713784
3       18309.87093    4227.330234
median  18309.87093    4178.713784

- The Sub sees a ~76% TPS reduction by default on head.

case-3: pgHead (e9a31c0cc60) + v1 parallel-apply patch with
retain_dead_tuples=ON; the number of parallel apply workers was varied as
2, 4, 8, 16.

3a) #workers=2
#run    Pub_TPS        Sub_TPS
1       18336.98565    4244.072357
2       18629.96658    4231.887288
3       18152.92036    4253.293648
median  18336.98565    4244.072357

- No significant TPS improvement with 2 parallel workers; still a ~76% TPS
  reduction.

3b) #workers=4
#run    Pub_TPS        Sub_TPS
1       16796.49468    16850.05362
2       16834.06057    16757.73115
3       16647.78486    16762.9107
median  16796.49468    16762.9107

- No regression.

3c) #workers=8
#run    Pub_TPS        Sub_TPS
1       17105.38794    16778.38209
2       16783.5085     16780.20492
3       16806.97569    16642.87521
median  16806.97569    16778.38209

- No regression.

3d) #workers=16
#run    Pub_TPS        Sub_TPS
1       16827.20615    16770.92293
2       16860.10188    16745.2968
3       16808.2148     16668.47974
median  16827.20615    16745.2968

- No regression.

~~~

Test-04: pgbench on both sides, with max_conflict_retention_duration tuned
========================================================================
Setup:
-------
Pub --> Sub
- Setup is the same as Test-03 (above).
- Additionally, the subscription option max_conflict_retention_duration=60s
  is set.

Workload:
-------------
- Run default pgbench (read-write) on both Pub and Sub nodes against their
  respective pgbench tables.
- Started with 15 clients on both sides.
- When conflict_slot.xmin became NULL on Sub, pgbench was paused to let the
  subscription catch up; then the publisher clients were reduced by half
  and pgbench was resumed. Here, slot.xmin becoming NULL indicates that
  conflict retention was stopped under high publisher load, while it stays
  non-NULL when Sub is able to catch up with Pub's load. (See the
  monitoring sketch at the end of this mail.)
- Total duration of the pgbench run is 10 minutes (600s).

Observations:
------------------
- Without the parallel apply patch, publisher clients were reduced from
  15 -> 7 -> 3, and finally retention was not stopped at 3 clients, with
  slot.xmin remaining non-NULL.
- With the parallel apply patch, using 2 workers the subscription handled
  up to 7 publisher clients without stopping conflict retention.
- With 4+ workers, retention continued for the full 10 minutes and Sub TPS
  showed no regression.

Detailed results:
-----------------
case-1: pgHead (e9a31c0cc60) + v64-0001 with retain_dead_tuples=ON

On publisher:
#clients  duration[s]  TPS
15        73           17953.52
7         100          9141.9
3         426          4286.381132

On subscriber:
#clients  duration[s]  TPS
15        73           10626.67
15        99           10271.35
15        431          19467.07612

~~~

case-2: pgHead (e9a31c0cc60) + v64-0001 + v1 parallel-apply patch [2] with
retain_dead_tuples=ON; the number of parallel apply workers was varied as
2, 4, 8.

2a) #workers=2
On publisher:
#clients  duration[s]  TPS
15        87           17318.3
7         512          9063.506025

On subscriber:
#clients  duration[s]  TPS
15        87           10299.66
15        512          18426.44818

2b) #workers=4
On publisher:
#clients  duration[s]  TPS
15        600          16953.40302

On subscriber:
#clients  duration[s]  TPS
15        600          16812.15289

2c) #workers=8
On publisher:
#clients  duration[s]  TPS
15        600          16946.91636

On subscriber:
#clients  duration[s]  TPS
15        600          16708.12774

~~~~

The scripts used for all the tests are attached.
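In case it helps anyone reproducing Test-04 without the attachments, below
is roughly how the retention state was watched on the subscriber. It
assumes the conflict-detection slot created for retain_dead_tuples is
named "pg_conflict_detection" and that the subscriber runs on port 5433;
the ALTER SUBSCRIPTION line uses the option from the v64 patch, so its
exact syntax may differ from what finally gets committed.

  # Cap conflict retention at 60 seconds (option from the v64-0001 patch;
  # value syntax is an assumption).
  psql -p 5433 -d postgres -c "ALTER SUBSCRIPTION sub
      SET (max_conflict_retention_duration = '60s')"

  # Poll the conflict-detection slot once per second; xmin reported as
  # NULL means retention has been stopped, i.e. the signal used in Test-04
  # to pause pgbench and halve the publisher clients.
  while sleep 1; do
      psql -p 5433 -d postgres -Atc \
          "SELECT now(), coalesce(xmin::text, 'NULL')
             FROM pg_replication_slots
            WHERE slot_name = 'pg_conflict_detection'"
  done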
[1] https://www.postgresql.org/message-id/OSCPR01MB1496663AED8EEC566074DFBC9F54CA%40OSCPR01MB14966.jpnprd01.prod.outlook.com
[2] https://www.postgresql.org/message-id/OS0PR01MB5716D43CB68DB8FFE73BF65D942AA%40OS0PR01MB5716.jpnprd01.prod.outlook.com

--
Thanks,
Nisha