Re: undetected deadlock in ALTER SUBSCRIPTION ... REFRESH PUBLICATION - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: undetected deadlock in ALTER SUBSCRIPTION ... REFRESH PUBLICATION |
Date | |
Msg-id | CAA4eK1Ka5W2u-8-X-w8PrwzUf3NUC1Bdvf3DY3Y6LU6V4jaZ1w@mail.gmail.com Whole thread Raw |
In response to | Re: undetected deadlock in ALTER SUBSCRIPTION ... REFRESH PUBLICATION (Tomas Vondra <tomas.vondra@enterprisedb.com>) |
Responses |
Re: undetected deadlock in ALTER SUBSCRIPTION ... REFRESH PUBLICATION
|
List | pgsql-hackers |
On Mon, Dec 4, 2023 at 5:30 PM Tomas Vondra <tomas.vondra@enterprisedb.com> wrote: > > On 12/4/23 12:37, Amit Kapila wrote: > > On Sat, Dec 2, 2023 at 9:52 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote: > >> > >>> thread. I think you can compare the timing of regression tests in > >>> subscription, with and without the patch to show there is no > >>> regression. And probably some tests with a large number of tables for > >>> sync with very little data. > >> > >> I have tested the regression test timings for subscription with and > >> without patch. I also did the timing test for sync of subscription > >> with the publisher for 100 and 1000 tables respectively. > >> I have attached the test script and results of the timing test are as follows: > >> > >> Time taken for test to run in Linux VM > >> Summary | Subscription Test (sec) > >> | 100 tables in pub and Sub (sec) | 1000 tables in pub and Sub > >> (sec) > >> Without patch Release | 95.564 > >> | 7.877 | 58.919 > >> With patch Release | 96.513 > >> | 6.533 | 45.807 > >> > >> Time Taken for test to run in another Linux VM > >> Summary | Subscription Test (sec) > >> | 100 tables in pub and Sub (sec) | 1000 tables in pub and Sub > >> (sec) > >> Without patch Release | 109.8145 > >> | 6.4675 | 83.001 > >> With patch Release | 113.162 > >> | 7.947 | 87.113 > >> > > > > So, on some machines, it may increase the test timing although not too > > much. I think the reason is probably doing the work in multiple > > transactions for multiple relations. I am wondering that instead of > > committing and starting a new transaction before > > wait_for_relation_state_change(), what if we do it inside that > > function just before we decide to wait? It is quite possible that in > > many cases we don't need any wait at all. > > > > I'm not sure what you mean by "do it". What should the function do? > I mean to commit the open transaction at the below place in wait_for_relation_state_change() wait_for_relation_state_change() { ... -- commit the xact WaitLatch(); ... } Then start after the wait is over. This is just to test whether it improves the difference in regression test timing. > As for the test results, I very much doubt the differences are not > caused simply by random timing variations, or something like that. And I > don't understand what "Performance Machine Linux" is, considering those > timings are slower than the other two machines. > > Also, even if it was a bit slower, does it really matter? I mean, the > current code is wrong, can lead to infinite duration if it happens to > hit the deadlock. And it's a one-time action, I don't think it's a very > sensitive in terms of performance. > Yeah, I see that point but trying to evaluate if we can avoid an increase in regression test timing at least for the HEAD. The tests are done in release mode, so, I suspect there could be a slightly bigger gap in debug mode (we can check that once though) which may hit developers running regressions quite often in their development environments. Now, if we there is no easy way to avoid the increase in regression test timing, we have to still fix the problem, so we have to take the hit of some increase in time. -- With Regards, Amit Kapila.
pgsql-hackers by date: