Re: Data is copied twice when specifying both child and parent table in publication - Mailing list pgsql-hackers

From Peter Smith
Subject Re: Data is copied twice when specifying both child and parent table in publication
Date
Msg-id CAHut+PtvCPuz21aZ2HxE5+4tbjuEmT2=_ZJmEXpuT3b274sKuw@mail.gmail.com
Whole thread Raw
In response to RE: Data is copied twice when specifying both child and parent table in publication  ("wangw.fnst@fujitsu.com" <wangw.fnst@fujitsu.com>)
Responses RE: Data is copied twice when specifying both child and parent table in publication
List pgsql-hackers
Here are my review comments for HEAD patches v13*

//////

Patch HEAD_v13-0001

I already posted some follow-up questions. See [1]

/////

Patch HEAD_v13-0002

1. Commit message

The following usage scenarios are not described in detail in the manual:
If one subscription subscribes multiple publications, and these publications
publish a partitioned table and its partitions respectively. When we specify
this parameter on one or more of these publications, which identity and schema
should be used to publish the changes?

In these cases, I think the parameter publish_via_partition_root behave as
follows:

~

It seemed worded a bit strangely. Also, you said "on one or more of
these publications" but the examples only show only one publication
having 'publish_via_partition_root'.

SUGGESTION (I've modified the wording slightly but the examples are unchanged).

Assume a subscription is subscribing to multiple publications, and
these publications publish a partitioned table and its partitions
respectively:

[publisher-side]
create table parent (a int primary key) partition by range (a);
create table child partition of parent default;

create publication pub1 for table parent;
create publication pub2 for table child;

[subscriber-side]
create subscription sub connection 'xxxx' publication pub1, pub2;

The manual does not clearly describe the behaviour when the user had
specified the parameter 'publish_via_partition_root' on just one of
the publications. This patch modifies documentation to clarify the
following rules:

- If the parameter publish_via_partition_root is specified only in pub1,
changes will be published using the identity and schema of the table 'parent'.

- If the parameter publish_via_partition_root is specified only in pub2,
changes will be published using the identity and schema of the table 'child'.

~~~

2.

- If the parameter publish_via_partition_root is specified only in pub2,
changes will be published using the identity and schema of the table child.

~

Is that right though? This rule seems 100% contrary to the meaning of
'publish_via_partition_root=true'.

------

3. doc/src/sgml/ref/create_publication.sgml

+         <para>
+          If a root partitioned table is published by any subscribed
publications which
+          set publish_via_partition_root = true, changes on this root
partitioned table
+          (or on its partitions) will be published using the identity
and schema of this
+          root partitioned table rather than that of the individual partitions.
+         </para>

This seems to only describe the first example from the commit message.
What about some description to explain the second example?

------
[1] https://www.postgresql.org/message-id/CAHut%2BPt%2B1PNx6VsZ-xKzAU-18HmNXhjCC1TGakKX46Wg7YNT1Q%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia



pgsql-hackers by date:

Previous
From: Peter Smith
Date:
Subject: Re: Data is copied twice when specifying both child and parent table in publication
Next
From: "houzj.fnst@fujitsu.com"
Date:
Subject: RE: Perform streaming logical transactions by background workers and parallel apply