Re: [HACKERS] logical decoding of two-phase transactions - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: [HACKERS] logical decoding of two-phase transactions |
Date | |
Msg-id | CAA4eK1LfYo_3SminY2obsauNEtxOGf42ypBz+8pXJ9CaWh1gnw@mail.gmail.com |
In response to | Re: [HACKERS] logical decoding of two-phase transactions (Ajin Cherian <itsajin@gmail.com>) |
Responses |
Re: [HACKERS] logical decoding of two-phase transactions
|
List | pgsql-hackers |
On Wed, Sep 9, 2020 at 3:33 PM Ajin Cherian <itsajin@gmail.com> wrote:
>
> On Mon, Sep 7, 2020 at 11:17 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>> Nikhil has a test for the same
>> (0004-Teach-test_decoding-plugin-to-work-with-2PC.Jan4) in his last
>> email [1]. You might want to use it to test this behavior. I think you
>> can also keep the tests as a separate patch as Nikhil had.
>>
> Done. I've added the tests and also tweaked code to make sure that the aborts during 2 phase commits are also handled.
>

Okay, I'll look into your changes, but before that, today I went through this entire thread to check whether there are any design problems. I found that there were two major issues in the original proposal:

(a) One was handling concurrent aborts, which I think we should be able to deal with in a way similar to what we have done for decoding of in-progress transactions.

(b) The other is what happens if someone explicitly locks pg_class or pg_attribute in exclusive mode (say, by LOCK pg_attribute ...); it seems a deadlock can happen in that case [0]. AFAIU, people seem to think that if there is no realistic scenario where a deadlock can happen apart from the user explicitly locking a system catalog, then we might be able to get away with simply not decoding such xacts at prepare time, or with blocking them in some other way, since such a lock will block the entire system anyway. I am not sure what the right thing is, but something has to be done to avoid any sort of deadlock here.

Another thing I noticed is that originally we had subscriber-side support as well, see [1] (the *pgoutput* patch), but it was later dropped for some reasons [2]. I think we should have pgoutput support as well, so we should see what is required to get that incorporated.

I would also like to summarize my thinking on the usefulness of this feature. One of the authors of this patch, Stas, wants this for conflict-free logical replication; see more details in [3]. Craig seems to suggest [3] that this will allow us to avoid conflicting schema changes at different nodes, though it is not clear to me whether that is possible without some external code support, because we don't send schema changes in logical replication. Maybe Craig can shed some light on this.

Another use case I am thinking of is whether this can be used for scaling out reads as well. Because of 2PC, we can ensure that on subscribers we have all the data committed on the master. Now, we can design a system where different nodes are owners of some set of tables and we can always get the data of those tables reliably from those nodes, and then one can have some external process that will route the reads accordingly. I know that the last idea is a bit hand-wavy, but it seems to be possible after this feature.

[0] - https://www.postgresql.org/message-id/20170328012546.473psm6546bgsi2c%40alap3.anarazel.de
[1] - https://www.postgresql.org/message-id/CAMGcDxchx%3D0PeQBVLzrgYG2AQ49QSRxHj5DCp7yy0QrJR0S0nA%40mail.gmail.com
[2] - https://www.postgresql.org/message-id/CAMGcDxc-kuO9uq0zRCRwbHWBj_rePY9%3DraR7M9pZGWoj9EOGdg%40mail.gmail.com
[3] - https://www.postgresql.org/message-id/CAMsr%2BYHQzGxnR-peT4SbX2-xiG2uApJMTgZ4a3TiRBM6COyfqg%40mail.gmail.com

--
With Regards,
Amit Kapila.
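
Editor's sketch: the read scale-out idea in the message above (different nodes owning sets of tables, with an external process routing reads to the owner) could look roughly like the following. This is only an illustration under stated assumptions and not part of any patch in this thread. The `ReadRouter` class, the node names, and the table names are all hypothetical, and nothing here talks to PostgreSQL; it only shows the table-to-owner lookup such an external router would need.

```python
from dataclasses import dataclass, field


@dataclass
class ReadRouter:
    """Hypothetical external router: maps each table to the node that owns it."""

    owners: dict[str, str] = field(default_factory=dict)

    def assign(self, node: str, tables: list[str]) -> None:
        # Record `node` as the owner of every table in `tables`.
        # A table may have only one owner, so conflicting assignments fail.
        for table in tables:
            current = self.owners.get(table)
            if current is not None and current != node:
                raise ValueError(f"{table} is already owned by {current}")
            self.owners[table] = node

    def route(self, table: str) -> str:
        # Return the name of the node that a read of `table` should go to.
        try:
            return self.owners[table]
        except KeyError:
            raise LookupError(f"no owner registered for table {table!r}") from None


# Assumed example topology: two nodes, each owning a disjoint set of tables.
router = ReadRouter()
router.assign("node_a", ["orders", "order_items"])
router.assign("node_b", ["customers"])
print(router.route("orders"))
```

A real router would additionally have to decide what to do when the owning node is unreachable; the sketch deliberately leaves that open.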