RE: Data is copied twice when specifying both child and parent table in publication - Mailing list pgsql-hackers

From houzj.fnst@fujitsu.com
Subject RE: Data is copied twice when specifying both child and parent table in publication
Date
Msg-id OS0PR01MB5716A30DDEECC59132E1084F94799@OS0PR01MB5716.jpnprd01.prod.outlook.com
In response to Re: Data is copied twice when specifying both child and parent table in publication  (Peter Smith <smithpb2250@gmail.com>)
List pgsql-hackers
On Wednesday, August 10, 2022 7:45 AM Peter Smith <smithpb2250@gmail.com> wrote:
> 
> Here are some more review comments for the HEAD_v8 patch:
> 
> ======
> 
> 1. Commit message
> 
> If there are two publications, one of them publish a parent table with
> (publish_via_partition_root = true) and another publish child table, subscribing
> to both publications from one subscription results in two initial replications. It
> should only be copied once.
> 
> ~
> 
> I took a 2nd look at that commit message and it seemed slightly backwards to
> me - e.g. don't you really mean for the 'publish_via_partition_root' parameter
> to be used when publishing the
> *child* table?

I'm not sure about this. I think the bug we are trying to fix is the one that
occurs when 'publish_via_partition_root' is used while publishing the parent table.

For this case (via_root used when publishing the parent):

CREATE PUBLICATION pub1 FOR TABLE parent WITH (publish_via_partition_root);
CREATE PUBLICATION pub2 FOR TABLE child;
CREATE SUBSCRIPTION sub CONNECTION 'xxx' PUBLICATION pub1, pub2;

The expected behavior is that only the parent table is published, and all
changes are replicated using the parent table's identity. So we should do the
initial sync only once, for the parent table. Currently, however, we do table
sync for both the parent and the child, which we think is a bug.
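
A minimal sketch for reproducing this first case. The table definitions
(column "a", the range bounds) are assumptions for illustration only and are
not taken from the patch; 'xxx' stands for the connection string and is left
as a placeholder.

```sql
-- On the publisher. Hypothetical table definitions, for illustration only.
CREATE TABLE parent (a int) PARTITION BY RANGE (a);
CREATE TABLE child PARTITION OF parent FOR VALUES FROM (0) TO (100);
INSERT INTO parent VALUES (1);

CREATE PUBLICATION pub1 FOR TABLE parent WITH (publish_via_partition_root);
CREATE PUBLICATION pub2 FOR TABLE child;

-- List the tables the two publications expose. With the behavior described
-- above, both parent and child would be expected to show up here.
SELECT pubname, schemaname, tablename
FROM pg_publication_tables
WHERE pubname IN ('pub1', 'pub2');

-- On the subscriber, after creating tables with the same definitions.
CREATE SUBSCRIPTION sub CONNECTION 'xxx' PUBLICATION pub1, pub2;

-- Show which relations were registered for table sync.
SELECT srrelid::regclass AS tbl, srsubstate
FROM pg_subscription_rel;
```

If the row inserted on the publisher appears twice in the subscriber's parent
table, the data was copied once through each relation.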

For the other case you mentioned (via_root used when publishing the child):

CREATE PUBLICATION pub1 FOR TABLE parent;
CREATE PUBLICATION pub2 FOR TABLE child WITH (publish_via_partition_root);
CREATE SUBSCRIPTION sub CONNECTION 'xxx' PUBLICATION pub1, pub2;

The expected behavior is that only the child table is published, and all
changes are replicated using the child table's identity. We should do table
sync only for child tables, which is the same as the current behavior on HEAD.
So I think there is no bug in this case.

Best regards,
Hou zj

