Re: PG Docs for ALTER SUBSCRIPTION REFRESH PUBLICATION - copy_data option - Mailing list pgsql-hackers

From Peter Smith
Subject Re: PG Docs for ALTER SUBSCRIPTION REFRESH PUBLICATION - copy_data option
Date
Msg-id CAHut+PvJMRB-ZyC80we2kiUFv4cVjmA6jxXpEMhm1rmz=1ryeA@mail.gmail.com
Whole thread Raw
In response to Re: PG Docs for ALTER SUBSCRIPTION REFRESH PUBLICATION - copy_data option  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: PG Docs for ALTER SUBSCRIPTION REFRESH PUBLICATION - copy_data option
List pgsql-hackers
On Tue, Sep 14, 2021 at 8:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Jun 25, 2021 at 9:20 AM Peter Smith <smithpb2250@gmail.com> wrote:
> >
> > But I recently learned that when there are partitions in the
> > publication, then toggling the value of the PUBLICATION option
> > "publish_via_partition_root" [3] can also *implicitly* change the list
> > published tables, and therefore that too might cause any ASRP to make
> > use of the copy_data value for those implicitly added
> > partitions/tables.
> >
>
> I have tried the below example in this context but didn't see any
> effect on changing via_root option.

Thanks for trying to reproduce. I also thought your steps were the
same as what I'd previously done but it seems like it was a bit
different. Below are my steps to observe some unexpected COPY
happening. Actually, now I am no longer sure if this is just a
documentation issue; perhaps it is a bug.

STEP 1 - create partition tables on both sides
===================================

[PUB and SUB]

postgres=# create table troot (a int) partition by range(a);
CREATE TABLE
postgres=# create table tless10 partition of troot for values from (1) to (9);
CREATE TABLE
postgres=# create table tmore10 partition of troot for values from (10) to (99);
CREATE TABLE

STEP 2 - insert some data on pub-side
==============================

[PUB]

postgres=# insert into troot values (1),(2),(3);
INSERT 0 3
postgres=# insert into troot values (11),(12),(13);
INSERT 0 3

postgres=# select * from troot;
 a
----
  1
  2
  3
 11
 12
 13
(6 rows)

STEP 3 - create a publication on the partition root
======================================

[PUB]

postgres=# CREATE PUBLICATION pub1 FOR TABLE troot;
CREATE PUBLICATION
postgres=# \dRp+ pub1;
                              Publication pub1
  Owner   | All tables | Inserts | Updates | Deletes | Truncates | Via root
----------+------------+---------+---------+---------+-----------+----------
 postgres | f          | t       | t       | t       | t         | f
Tables:
    "public.troot"


STEP 4 - create the subscriber
=======================

[SUB]

postgres=# CREATE SUBSCRIPTION sub1 CONNECTION 'host=127.0.0.1
port=5432 dbname=postgres' PUBLICATION pub1;
NOTICE:  created replication slot "sub1" on publisher
CREATE SUBSCRIPTION
postgres=# 2021-09-15 12:45:12.224 AEST [30592] LOG:  logical
replication apply worker for subscription "sub1" has started
2021-09-15 12:45:12.236 AEST [30595] LOG:  logical replication table
synchronization worker for subscription "sub1", table "tless10" has
started
2021-09-15 12:45:12.247 AEST [30598] LOG:  logical replication table
synchronization worker for subscription "sub1", table "tmore10" has
started
2021-09-15 12:45:12.326 AEST [30595] LOG:  logical replication table
synchronization worker for subscription "sub1", table "tless10" has
finished
2021-09-15 12:45:12.332 AEST [30598] LOG:  logical replication table
synchronization worker for subscription "sub1", table "tmore10" has
finished

postgres=# select * from troot;
 a
----
  1
  2
  3
 11
 12
 13
(6 rows)

// To this point, everything looks OK...

STEP 5 - toggle the publish_via_partition_root flag
======================================

[PUB]

postgres=# alter publication pub1 set (publish_via_partition_root = true);
ALTER PUBLICATION
postgres=# \dRp+ pub1;
                              Publication pub1
  Owner   | All tables | Inserts | Updates | Deletes | Truncates | Via root
----------+------------+---------+---------+---------+-----------+----------
 postgres | f          | t       | t       | t       | t         | t
Tables:
    "public.troot"

// And then refresh the subscriber

[SUB]

postgres=# alter subscription sub1 refresh PUBLICATION;
ALTER SUBSCRIPTION
postgres=# 2021-09-15 12:48:37.927 AEST [3861] LOG:  logical
replication table synchronization worker for subscription "sub1",
table "troot" has started
2021-09-15 12:48:37.977 AEST [3861] LOG:  logical replication table
synchronization worker for subscription "sub1", table "troot" has
finished

// Notice above that another tablesync worker has launched and copied
everything again - BUG??

[SUB]

postgres=# select * from troot;
 a
----
  1
  2
  3
  1
  2
  3
 11
 12
 13
 11
 12
 13
(12 rows)

// At this point if I would keep toggling the
publish_via_partition_root then each time I do subscription REFRESH
PUBLICATION it will copy the data yet again. For example,

[PUB]

postgres=# alter publication pub1 set (publish_via_partition_root = false);
ALTER PUBLICATION

[SUB]

postgres=# alter subscription sub1 refresh PUBLICATION;
ALTER SUBSCRIPTION
postgres=# 2021-09-15 12:59:02.106 AEST [21709] LOG:  logical
replication table synchronization worker for subscription "sub1",
table "tless10" has started
2021-09-15 12:59:02.120 AEST [21711] LOG:  logical replication table
synchronization worker for subscription "sub1", table "tmore10" has
started
2021-09-15 12:59:02.189 AEST [21709] LOG:  logical replication table
synchronization worker for subscription "sub1", table "tless10" has
finished
2021-09-15 12:59:02.207 AEST [21711] LOG:  logical replication table
synchronization worker for subscription "sub1", table "tmore10" has
finished

By now the pub/sub data on each side is quite different
==========================================

[PUB]

postgres=# select count(*) from troot;
 count
-------
     6
(1 row)

[SUB]

postgres=# select count(*) from troot;
 count
-------
    18
(1 row)


------
Kind Regards,
Peter Smith.
Fujitsu Australia



pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: Estimating HugePages Requirements?
Next
From: Amit Kapila
Date:
Subject: Re: Column Filtering in Logical Replication