Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication - Mailing list pgsql-hackers
From: vignesh C
Subject: Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication
Date:
Msg-id: CALDaNm1SoOn_7gGTTq0L-t-eJehjRTXh2io+BYEY=SOSDAq6jQ@mail.gmail.com
In response to: Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication (Peter Smith <smithpb2250@gmail.com>)
List: pgsql-hackers
On Tue, 11 Jul 2023 at 08:30, Peter Smith <smithpb2250@gmail.com> wrote:
>
> On Tue, Jul 11, 2023 at 12:31 AM Melih Mutlu <m.melihmutlu@gmail.com> wrote:
> >
> > Hi,
> >
> > Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com> wrote on Thu, 6 Jul
> > 2023 at 12:47:
> > >
> > > Dear Melih,
> > >
> > > > Thanks for the 0003 patch. But it did not work for me. Can you create
> > > > a subscription successfully with patch 0003 applied?
> > > > I get the following error: "ERROR: table copy could not start
> > > > transaction on publisher: another command is already in progress".
> > >
> > > You got the ERROR when all the patches (0001-0005) were applied, right?
> > > I have focused on 0001 and 0002 only, so I may have missed something.
> > > If that is not correct, please attach the logfile and the test script you used.
> >
> > Yes, I did get an error with all patches applied. But with only 0001
> > and 0002, your version seems to work and mine does not.
> > What do you think about combining 0002 and 0003? Or should those stay separate?
>
> Even if patches 0003 and 0002 are to be combined, I think that should
> not happen until after it is confirmed which "reuse" design is best.
>
> e.g. IMO it might be easier to compare the different PoC designs for
> patch 0002 if there is no extra logic involved.
>
> PoC design#1 -- each tablesync worker decides for itself what to do next
> after it finishes
> PoC design#2 -- reuse tablesync workers from a "pool" of available workers

I did a POC for design#2, implementing a worker pool to synchronize the tables for a subscriber. The core design is the same as what Melih had implemented at [1]. I had already started this POC based on one of the earlier e-mails [2] that Peter had shared.

The POC works like this:
a) The apply worker checks the tablesync pool to see whether any tablesync worker is free:
   i) If there are no free workers in the pool, it starts a new tablesync worker and adds it to the pool.
   ii) If there are free workers in the pool, it reuses one of them to synchronize another table.
b) The apply worker checks whether the tables are synchronized; once all the tables are synchronized, it releases all the workers from the tablesync pool.
c) The apply worker and the tablesync workers use shared memory to share relation data and execution state between them.
d) The pids of the apply worker and the tablesync workers are also stored in this shared memory, so that we need not take a lock on LogicalRepWorkerLock and loop over max_logical_replication_workers every time; the pid stored in shared memory is used to wake up the apply worker or a tablesync worker whenever needed.

While implementing the POC I found one issue in the POC patch (there is no problem with the HEAD code; the issue was only with the POC):
1) The apply worker was waiting for the table to be set to SYNCDONE.
2) Meanwhile the tablesync worker set the table to SYNCDONE and set the apply worker's latch.
3) The apply worker reset the latch set by the tablesync worker, went back to the main loop, and waited on the main loop latch (since the tablesync worker's wakeup had already been consumed by the reset, the apply worker waits for the full 1 second timeout).

To fix this I had to set the apply worker's latch once every 1 ms in this case alone, which is not a good solution as it consumes a lot of CPU cycles. A better fix would be to introduce a new subscription relation state. The attached patch has the changes for the same. 0001, 0002 and 0003 are the patches shared by Melih and Kuroda-san earlier; the 0004 patch has the changes for the POC of the tablesync worker pool implementation.

POC design #1: each tablesync worker identifies the tables that should be synced and reuses the connection.
POC design #2: a tablesync worker pool, with the apply worker scheduling the work to the tablesync workers in the pool and reusing the connection.
Performance results for 10 empty tables:
+--------------+----------------+----------------+----------------+-----------------+
|              | 2 sync workers | 4 sync workers | 8 sync workers | 16 sync workers |
+--------------+----------------+----------------+----------------+-----------------+
| HEAD         | 128.4685 ms    | 121.271 ms     | 136.5455 ms    | N/A             |
| POC design#1 | 70.7095 ms     | 80.9805 ms     | 102.773 ms     | N/A             |
| POC design#2 | 70.858 ms      | 83.0845 ms     | 112.505 ms     | N/A             |
+--------------+----------------+----------------+----------------+-----------------+

Performance results for 100 empty tables:
+--------------+----------------+----------------+----------------+-----------------+
|              | 2 sync workers | 4 sync workers | 8 sync workers | 16 sync workers |
+--------------+----------------+----------------+----------------+-----------------+
| HEAD         | 1039.89 ms     | 860.88 ms      | 1112.312 ms    | 1122.52 ms      |
| POC design#1 | 310.920 ms     | 293.14 ms      | 385.698 ms     | 456.64 ms       |
| POC design#2 | 318.464 ms     | 313.98 ms      | 352.316 ms     | 441.53 ms       |
+--------------+----------------+----------------+----------------+-----------------+

Performance results for 1000 empty tables:
+--------------+----------------+----------------+----------------+-----------------+
|              | 2 sync workers | 4 sync workers | 8 sync workers | 16 sync workers |
+--------------+----------------+----------------+----------------+-----------------+
| HEAD         | 16327.96 ms    | 10253.65 ms    | 9741.986 ms    | 10278.98 ms     |
| POC design#1 | 3598.21 ms     | 3099.54 ms     | 2944.386 ms    | 2588.20 ms      |
| POC design#2 | 4131.72 ms     | 2840.36 ms     | 3001.159 ms    | 5461.82 ms      |
+--------------+----------------+----------------+----------------+-----------------+

Performance results for 2000 empty tables:
+--------------+----------------+----------------+----------------+-----------------+
|              | 2 sync workers | 4 sync workers | 8 sync workers | 16 sync workers |
+--------------+----------------+----------------+----------------+-----------------+
| HEAD         | 47210.92 ms    | 25239.90 ms    | 19171.48 ms    | 19556.46 ms     |
| POC design#1 | 10598.32 ms    | 6995.61 ms     | 6507.53 ms     | 5295.72 ms      |
| POC design#2 | 11121.00 ms    | 6659.74 ms     | 6253.66 ms     | 15433.81 ms     |
+--------------+----------------+----------------+----------------+-----------------+

The detailed execution results are attached in Perftest_Results.xlsx. Testing with a) tables having data and b) the apply worker applying changes while table sync is in progress has not been done yet; one of us will do that and try to share the results for these too.

The performance of both POC design #1 and POC design #2 is good, but POC design #2's performance degrades when there are a greater number of workers and more tables. In POC design #2, with more workers and tables, the apply worker becomes a bottleneck, as it must allocate work to all the workers. Based on the test results, POC design #1 is better.
Thanks to Kuroda-san for helping me run the performance tests.

[1] - https://www.postgresql.org/message-id/CAGPVpCSk4v-V1WbFDy8a5dL7Es5z8da6hoQbuVyrqP5s3Yh6Cg%40mail.gmail.com
[2] - https://www.postgresql.org/message-id/CAHut%2BPs8gWP9tCPK9gdMnxyshRKgVP3pJnAnaJto_T07uR9xUA%40mail.gmail.com

Regards,
Vignesh