Multi-Master Logical Replication - Mailing list pgsql-hackers
From | Peter Smith |
---|---|
Subject | Multi-Master Logical Replication |
Date | |
Msg-id | CAHut+PuwRAoWY9pz=Eubps3ooQCOBFiYPU9Yi=VB-U+yORU7OA@mail.gmail.com Whole thread Raw |
Responses |
Re: Multi-Master Logical Replication
Re: Multi-Master Logical Replication Re: Multi-Master Logical Replication |
List | pgsql-hackers |
MULTI-MASTER LOGICAL REPLICATION 1.0 BACKGROUND Let’s assume that a user wishes to set up a multi-master environment so that a set of PostgreSQL instances (nodes) use logical replication to share tables with every other node in the set. We define this as a multi-master logical replication (MMLR) node-set. <please refer to the attached node-set diagram> 1.1 ADVANTAGES OF MMLR - Increases write scalability (e.g., all nodes can write arbitrary data). - Allows load balancing - Allows rolling updates of nodes (e.g., logical replication works between different major versions of PostgreSQL). - Improves the availability of the system (e.g., no single point of failure) - Improves performance (e.g., lower latencies for geographically local nodes) 2.0 MMLR AND POSTGRESQL It is already possible to configure a kind of MMLR set in PostgreSQL 15 using PUB/SUB, but it is very restrictive because it can only work when no two nodes operate on the same table. This is because when two nodes try to share the same table then there becomes a circular recursive problem where Node1 replicates data to Node2 which is then replicated back to Node1 and so on. To prevent the circular recursive problem Vignesh is developing a patch [1] that introduces new SUBSCRIPTION options "local_only" (for publishing only data originating at the publisher node) and "copy_data=force". Using this patch, we have created a script [2] demonstrating how to set up all the above multi-node examples. An overview of the necessary steps is given in the next section. 2.1 STEPS – Adding a new node N to an existing node-set step 1. Prerequisites – Apply Vignesh’s patch [1]. All nodes in the set must be visible to each other by a known CONNECTION. All shared tables must already be defined on all nodes. step 2. On node N do CREATE PUBLICATION pub_N FOR ALL TABLES step 3. All other nodes then CREATE SUBSCRIPTION to PUBLICATION pub_N with "local_only=on, copy_data=on" (this will replicate initial data from the node N tables to every other node). step 4. On node N, temporarily ALTER PUBLICATION pub_N to prevent replication of 'truncate', then TRUNCATE all tables of node N, then re-allow replication of 'truncate'. step 5. On node N do CREATE SUBSCRIPTION to the publications of all other nodes in the set 5a. Specify "local_only=on, copy_data=force" for exactly one of the subscriptions (this will make the node N tables now have the same data as the other nodes) 5b. Specify "local_only=on, copy_data=off" for all other subscriptions. step 6. Result - Now changes to any table on any node should be replicated to every other node in the set. Note: Steps 4 and 5 need to be done within the same transaction to avoid loss of data in case of some command failure. (Because we can't perform create subscription in a transaction, we need to create the subscription in a disabled mode first and then enable it in the transaction). 2.2 DIFFICULTIES Notice that it becomes increasingly complex to configure MMLR manually as the number of nodes in the set increases. There are also some difficulties such as - dealing with initial table data - coordinating the timing to avoid concurrent updates - getting the SUBSCRIPTION options for copy_data exactly right. 3.0 PROPOSAL To make the MMLR setup simpler, we propose to create a new API that will hide all the step details and remove the burden on the user to get it right without mistakes. 3.1 MOTIVATION - MMLR (sharing the same tables) is not currently possible - Vignesh's patch [1] makes MMLR possible, but the manual setup is still quite difficult - An MMLR implementation can solve the timing problems (e.g., using Database Locking) 3.2 API Preferably the API would be implemented as new SQL functions in PostgreSQL core, however, implementation using a contrib module or some new SQL syntax may also be possible. SQL functions will be like below: - pg_mmlr_set_create = create a new set, and give it a name - pg_mmlr_node_attach = attach the current node to a specified set - pg_mmlr_node_detach = detach a specified node from a specified set - pg_mmlr_set_delete = delete a specified set For example, internally the pg_mmlr_node_attach API function would execute the equivalent of all the CREATE PUBLICATION, CREATE SUBSCRIPTION, and TRUNCATE steps described above. Notice this proposal has some external API similarities with the BDR extension [3] (which also provides multi-master logical replication), although we plan to implement it entirely using PostgreSQL’s PUB/SUB. 4.0 ACKNOWLEDGEMENTS The following people have contributed to this proposal – Hayato Kuroda, Vignesh C, Peter Smith, Amit Kapila. 5.0 REFERENCES [1] https://www.postgresql.org/message-id/flat/CALDaNm0gwjY_4HFxvvty01BOT01q_fJLKQ3pWP9%3D9orqubhjcQ%40mail.gmail.com [2] https://www.postgresql.org/message-id/CAHut%2BPvY2P%3DUL-X6maMA5QxFKdcdciRRCKDH3j%3D_hO8u2OyRYg%40mail.gmail.com [3] https://www.enterprisedb.com/docs/bdr/latest/ [END] ~~~ One of my colleagues will post more detailed information later. ------ Kind Regards, Peter Smith. Fujitsu Australia
Attachment
pgsql-hackers by date: