Re: Data is copied twice when specifying both child and parent table in publication - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Data is copied twice when specifying both child and parent table in publication
Date
Msg-id CAA4eK1Jpj2Yj5ZWwt4z-qM2fS0c3ZF3_xqAgor-Ziiru00LKHg@mail.gmail.com
Whole thread Raw
In response to Re: Data is copied twice when specifying both child and parent table in publication  (Peter Smith <smithpb2250@gmail.com>)
Responses RE: Data is copied twice when specifying both child and parent table in publication
List pgsql-hackers
On Mon, Mar 20, 2023 at 1:02 PM Peter Smith <smithpb2250@gmail.com> wrote:
>
>
> ======
> src/include/catalog/pg_proc.dat
>
> 4.
> +{ oid => '6119',
> +  descr => 'get information of the tables in the given publication array',
>
> Should that be worded in a way to make it more clear that the
> "publication array" is really an "array of publication names"?
>

I don't know how important it is to tell that the array is an array of
publication names but the current description can be improved. How
about something like: "get information of the tables that are part of
the specified publications"

Few other comments:
=================
1.
  foreach(lc2, ancestors)
  {
  Oid ancestor = lfirst_oid(lc2);
+ ListCell   *lc3;

  /* Check if the parent table exists in the published table list. */
- if (list_member_oid(relids, ancestor))
+ foreach(lc3, table_infos)
  {
- skip = true;
- break;
+ Oid relid = ((published_rel *) lfirst(lc3))->relid;
+
+ if (relid == ancestor)
+ {
+ skip = true;
+ break;
+ }
  }
+
+ if (skip)
+ break;
  }

- if (!skip)
- result = lappend_oid(result, relid);
+ if (skip)
+ table_infos = foreach_delete_current(table_infos, lc);

The usage of skip looks a bit ugly to me. Can we move the code for the
inner loop to a separate function (like
is_ancestor_member_tableinfos()) and remove the current cell if it
returns true?

2.
  * Filter out the partitions whose parent tables were also specified in
  * the publication.
  */
-static List *
-filter_partitions(List *relids)
+static void
+filter_partitions(List *table_infos)

The comment atop filter_partitions is no longer valid. Can we slightly
change it to: "Filter out the partitions whose parent tables are also
present in the list."?

3.
-# Note: We create two separate tables, not a partitioned one, so that we can
-# easily identity through which relation were the changes replicated.
+# Note: We only create one table (tab4) here. We specified
+# publish_via_partition_root = true (see pub_all and pub_lower_level above), so
+# all data will be replicated to that table.
 $node_subscriber2->safe_psql('postgres',
  "CREATE TABLE tab4 (a int PRIMARY KEY)");
-$node_subscriber2->safe_psql('postgres',
- "CREATE TABLE tab4_1 (a int PRIMARY KEY)");

I am not sure if it is a good idea to remove tab4_1 here. It is
testing something different as mentioned in the comments. Also, I
don't see any data in tab4 for the initial sync, so not sure if this
tests the behavior changed by this patch.

4.
--- a/src/test/subscription/t/031_column_list.pl
+++ b/src/test/subscription/t/031_column_list.pl
@@ -959,7 +959,8 @@ $node_publisher->safe_psql(
  CREATE TABLE test_root_1 PARTITION OF test_root FOR VALUES FROM (1) TO (10);
  CREATE TABLE test_root_2 PARTITION OF test_root FOR VALUES FROM (10) TO (20);

- CREATE PUBLICATION pub_root_true FOR TABLE test_root (a) WITH
(publish_via_partition_root = true);
+ CREATE PUBLICATION pub_root_true_1 FOR TABLE test_root (a) WITH
(publish_via_partition_root = true);
+ CREATE PUBLICATION pub_root_true_2 FOR TABLE test_root_1 (a, b) WITH
(publish_via_partition_root = true);

  -- initial data
  INSERT INTO test_root VALUES (1, 2, 3);
@@ -968,7 +969,7 @@ $node_publisher->safe_psql(

 $node_subscriber->safe_psql(
  'postgres', qq(
- CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION
pub_root_true;
+ CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION
pub_root_true_1, pub_root_true_2;

It is not clear to me what exactly you want to test here. Please add
some comments.

5. I think you can merge the 0001 and 0003 patches.

Apart from the above, attached is a patch to change some of the
comments in the patch.

--
With Regards,
Amit Kapila.

Attachment

pgsql-hackers by date:

Previous
From: Tomas Vondra
Date:
Subject: Re: Ignoring BRIN for HOT updates (was: -udpates seems broken)
Next
From: Matthias van de Meent
Date:
Subject: Re: Ignoring BRIN for HOT updates (was: -udpates seems broken)