Re: Skipping schema changes in publication - Mailing list pgsql-hackers
| From | vignesh C |
|---|---|
| Subject | Re: Skipping schema changes in publication |
| Date | |
| Msg-id | CALDaNm3kX=16L-72m13CqXL9uAiHURNZ+BLo-HfTEYHDFejj-A@mail.gmail.com Whole thread Raw |
| In response to | Re: Skipping schema changes in publication (vignesh C <vignesh21@gmail.com>) |
| Responses |
Re: Skipping schema changes in publication
|
| List | pgsql-hackers |
On Fri, 23 Jan 2026 at 18:41, vignesh C <vignesh21@gmail.com> wrote: > > On Wed, 21 Jan 2026 at 11:35, Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Mon, Jan 19, 2026 at 3:08 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > Approaches for Supporting EXCEPT in Partitioned Tables > > > ------------------------------------------------------------------------ > > > > > > In an offline discussion with Peter Smith, Amit, and Shlok, we > > > identified several approaches for supporting EXCEPT with partitioned > > > tables and their partitions. I’d like to hear others’ opinions on > > > these approaches. > > > > > > Consider the following partition hierarchy: > > > tab_root > > > ├─ tab_part_1 > > > │ ├─ tab_part_1_p1 > > > │ └─ tab_part_1_p2 > > > └─ tab_part_2 > > > ├─ tab_part_2_p1 > > > └─ tab_part_2_p2 > > > > > > > > > Approach 1: > > > --------------------------------- > > > If we exclude a table, then the data in that table and all of its > > > partitions (i.e., the entire subtree under that table) should not be > > > replicated. > > > > > > For example EXCEPT (tab_part_1) skips replication of tab_part_1 and > > > all of its partitions. > > > > > > This behaviour remains the same with or without > > > publish_via_partition_root. The publish_via_partition_root flag only > > > affects publish_via_relid, i.e., the relation through which data is > > > published. > > > > > > This approach involves certain implementation challenges. For brevity, > > > these are documented in the attached 'Approach1_challenges' document. > > > > > > Approach 2: > > > --------------------------------------------------- > > > Assign meaning to ONLY and '*' for partition tables in the EXCEPT > > > list. In HEAD, ONLY and '*' do not have any meaning for partitioned > > > tables or partitions, and these keywords are currently ignored. > > > > > > Examples: > > > 1. EXCEPT (ONLY tab_part_1) skips replication of only the table > > > tab_part_1. Changes for tab_root, tab_part_1_p1, and tab_part_1_p2 are > > > still replicated. > > > > > > ii. EXCEPT (tab_part_1*) skips replication of tables tab_part_1, > > > tab_part_1_p1, and tab_part_1_p2 > > > > > > The challenges described in Approach 1, particularly around tablesync > > > handling and COPY behaviour, would still need to be addressed under > > > this approach as well. ONLY or '*' with partitioned tables is not > > > supported in HEAD, supporting it specifically for ALL TABLES EXCEPT > > > may introduce additional confusion for users. > > > > > > Approach 3: > > > ---------------- > > > Do not allow partitions to be specified in the EXCEPT clause. > > > > > > Only EXCEPT (tab_root) is supported, which excludes tab_root and all > > > of its partitions. Specifying EXCEPT (tab_part_1) or EXCEPT > > > (tab_part_1_p1) will result in an error. > > > > > > ~~ > > > > > > While Approach 1 and Approach 2 offer more flexibility to the user > > > compared to Approach 3, they also introduce additional design > > > complexity which does not seem simpler to address. > > > > Thanks for explaining this, overall I like the Approach 1, and I also > > see the problem when publish via root is given in that case COPY FROM > > is executed on the root and it would be hard to exclude specific > > partitions. > > Regarding the above issue which is also mentioned in > Approach1_challenges at [1]: > When a publication is created with publish_via_partition_root = true > and a specific partition(tab_part_1_1) is excluded, the expected > behavior is that changes from non-excluded partitions (for example, > tab_part_2 and tab_part_1_2 and their descendants) are replicated, > while changes from the excluded partition (tab_part_1_1 and its > subtree) are not. > tab_root > ├── tab_part_1 > │ ├── tab_part_1_1 (except) > │ │ ├── tab_part_1_1_1 > │ │ │ └── tab_part_1_1_1_1 > │ │ └── tab_part_1_1_2 > │ └── tab_part_1_2 > │ ├── tab_part_1_2_1 > │ └── tab_part_1_2_2 > └── tab_part_2 > > In this situation, replication cannot be performed purely via the > partition root (tab_root), because doing so would implicitly include > data from the excluded child partitions. > > To address this, the publication creation should explicitly record the > excluded partition(tab_part_1_1) in pg_publication_rel with an > excluded = true flag. The publish_via_partition_root setting remains > stored at the publication level, as it is today. With > publish_via_partition_root = true, the publisher–subscriber mapping is > not partition-to-partition. Instead, all eligible data is mapped to > the subscriber’s partition root. Therefore, > pg_get_publication_tables() should return only the top-level root > table (tab_root) to the subscriber for table synchronization. During > initial table sync, when the tablesync worker prepares the COPY > command, it can query the publisher to determine the effective set of > tables that belong to the publication after applying the exclusion > rules. Based on this resolved table list, the tablesync worker can > construct a COPY query that unions data only from the non-excluded > partitions, for example: > COPY ( > SELECT * FROM tab_part_1_2_1 > UNION ALL > SELECT * FROM tab_part_1_2_2 > UNION ALL > SELECT * FROM tab_part_2 > ) > > This ensures that only non-excluded data is copied and applied to > tab_root on the subscriber, while preserving the semantics of > publish_via_partition_root = true. Here is a patch which has the changes to handle the same. Regards, Vignesh
Attachment
pgsql-hackers by date: