Re: RFC: logical publication via inheritance root? - Mailing list pgsql-hackers
From | Jacob Champion |
---|---|
Subject | Re: RFC: logical publication via inheritance root? |
Date | |
Msg-id | dc57f088-039b-7a71-8f4c-082ef106246e@timescale.com Whole thread Raw |
In response to | Re: RFC: logical publication via inheritance root? (Jacob Champion <jchampion@timescale.com>) |
Responses |
Re: RFC: logical publication via inheritance root?
Re: RFC: logical publication via inheritance root? Re: RFC: logical publication via inheritance root? |
List | pgsql-hackers |
Hi, I'm going to register this in CF for feedback. Summary for potential reviewers: we don't use declarative partitions in the Timescale partitioning scheme, but it'd be really nice to be able to replicate between our tables and standard tables, or between two Timescale-partitioned tables with different layouts. This patch lets extensions (or savvy users) upgrade an existing inheritance relationship between two tables into a "logical partition" relationship, so that they can be handled with the publish_via_partition_root machinery. I hope this might also help pg_partman users migrate between old- and new-style partition schemes, but that's speculation. On 1/20/23 09:53, Jacob Champion wrote: >> 2) While this strategy works well for ongoing replication, it's not >> enough to get the initial synchronization correct. The subscriber >> still does a COPY of the root table directly, missing out on all the >> logical descendant data. The publisher will have to tell the >> subscriber about the relationship somehow, and older subscriber >> versions won't understand how to use that (similar to how old >> subscribers can't correctly handle row filters). > > I partially solved this by having the subscriber pull the logical > hierarchy from the publisher to figure out which tables to COPY. This > works when publish_via_partition_root=true, but it doesn't correctly > return to the previous behavior when the setting is false. I need to > check the publication setting from the subscriber, too, but that opens > up the question of what to do if two different publications conflict. Second draft attached, which fixes that bug. I kept thinking to myself that this would be much easier if the publisher told the subscriber what data to copy rather than having the subscriber hardcode the initial sync process... and then I realized that I could, sort of, move in that direction. This version adds a SQL function to determine the list of source tables to COPY into a subscriber's target table. Now the publisher can make use of whatever catalogs it needs to make that list and the subscriber doesn't need to couple to them. (This could also provide a way for publishers to provide more generic "table indirection" in the future, but I'm wary of selling genericism as a feature here.) I haven't solved the problem where two publications of the same table have different settings for publish_via_partition_root. I was curious to see how the existing partition code prevented problems, but I'm not really sure that it does... Here are some situations where the existing implementation duplicates data on the initial sync: 1) A single subscription to two publications, one with publish_via_partition_root on and the other off, which publish the same partitioned table 2) A single subscription to two publications with publish_via_partition_root on, one of which publishes a root partition and the other of which publishes a descendant/leaf 3) A single subscription to two publications with publish_via_partition_root on, one of which publishes FOR ALL TABLES and the other of which publishes a descendant/leaf Is it expected that DBAs should avoid these cases, or are they worth pursuing with a bug fix? Thanks, --Jacob
Attachment
pgsql-hackers by date: