Re: Perform streaming logical transactions by background workers and parallel apply - Mailing list pgsql-hackers
From | Peter Smith |
---|---|
Subject | Re: Perform streaming logical transactions by background workers and parallel apply |
Date | |
Msg-id | CAHut+PvPwTQ_GpVMw3JgZzhWcwTrAiT-_5M-XzEO7GUnHH8hGA@mail.gmail.com Whole thread Raw |
In response to | Re: Perform streaming logical transactions by background workers and parallel apply (Amit Kapila <amit.kapila16@gmail.com>) |
Responses |
Re: Perform streaming logical transactions by background workers and parallel apply
|
List | pgsql-hackers |
On Tue, May 3, 2022 at 2:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Mon, May 2, 2022 at 5:06 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > On Mon, May 2, 2022 at 6:09 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > On Mon, May 2, 2022 at 11:47 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > > > > > > > > Are you planning to support "Transaction dependency" Amit mentioned in > > > > his first mail in this patch? IIUC since the background apply worker > > > > applies the streamed changes as soon as receiving them from the main > > > > apply worker, a conflict that doesn't happen in the current streaming > > > > logical replication could happen. > > > > > > > > > > This patch seems to be waiting for stream_stop to finish, so I don't > > > see how the issues related to "Transaction dependency" can arise? What > > > type of conflict/issues you have in mind? > > > > Suppose we set both publisher and subscriber: > > > > On publisher: > > create table test (i int); > > insert into test values (0); > > create publication test_pub for table test; > > > > On subscriber: > > create table test (i int primary key); > > create subscription test_sub connection '...' publication test_pub; -- > > value 0 is replicated via initial sync > > > > Now, both 'test' tables have value 0. > > > > And suppose two concurrent transactions are executed on the publisher > > in following order: > > > > TX-1: > > begin; > > insert into test select generate_series(0, 10000); -- changes will be streamed; > > > > TX-2: > > begin; > > delete from test where c = 0; > > commit; > > > > TX-1: > > commit; > > > > With the current streaming logical replication, these changes will be > > applied successfully since the deletion is applied before the > > (streamed) insertion. Whereas with the apply bgworker, it fails due to > > an unique constraint violation since the insertion is applied first. > > I've confirmed that it happens with v5 patch. > > > > Good point but I am not completely sure if doing transaction > dependency tracking for such cases is really worth it. I feel for such > concurrent cases users can anyway now also get conflicts, it is just a > matter of timing. One more thing to check transaction dependency, we > might need to spill the data for streaming transactions in which case > we might lose all the benefits of doing it via a background worker. Do > we see any simple way to avoid this? > Avoiding unexpected differences like this is why I suggested the option should have to be explicitly enabled instead of being on by default as it is in the current patch. See my review comment #14 [1]. It means the user won't have to change their existing code as a workaround. ------ [1] https://www.postgresql.org/message-id/CAHut%2BPuqYP5eD5wcSCtk%3Da6KuMjat2UCzqyGoE7sieCaBsVskQ%40mail.gmail.com Kind Regards, Peter Smith. Fujitsu Australia
pgsql-hackers by date: