Re: Added schema level support for publication. - Mailing list pgsql-hackers

From vignesh C
Subject Re: Added schema level support for publication.
Date
Msg-id CALDaNm3tifDjb3jum=cvBjtzzK3mJdboMVdbPZcz-wH+Xu=cEQ@mail.gmail.com
Whole thread Raw
In response to Re: Added schema level support for publication.  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
List pgsql-hackers
On Mon, Jan 11, 2021 at 11:45 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Sun, Jan 10, 2021 at 11:21 PM vignesh C <vignesh21@gmail.com> wrote:
> > On Sat, Jan 9, 2021 at 8:08 PM Bharath Rupireddy
> > <bharath.rupireddyforpostgres@gmail.com> wrote:
> > > I think this feature can be useful, in case a user has a lot of tables
> > > to publish inside a schema. Having said that, I wonder if this feature
> > > mandates users to create the same schema with same
> > > permissions/authorizations manually on the subscriber, because logical
> > > replication doesn't propagate any ddl's so are the schema or schema
> > > changes? Or is it that the list of tables from the publisher can go
> > > into a different schema on the subscriber?
> > >
> >
> > DDL's will not be propagated to the subscriber. Users have to create
> > the schema & tables in the subscriber. No change in
> > Permissions/authorizations handling, it will be the same as the
> > existing behavior for relations.
>
> Looks like the existing behaviour already requires users to create the
> schema on the subscriber when publishing the tables from that schema.
> Otherwise, an error is thrown on the subscriber [1].
>
> [1] on publisher:
> CREATE SCHEMA myschema;
> CREATE TABLE myschema.t1(a1 int, b1 int);
> INSERT INTO myschema.t1_myschema SELECT i, i+10 FROM generate_series(1,10) i;
> CREATE PUBLICATION testpub FOR TABLE myschema.t1;
>
> on subscriber:
> postgres=# CREATE SUBSCRIPTION testsub CONNECTION 'host=localhost
> dbname=postgres user=bharath port=5432' PUBLICATION testpub;
> ERROR:  schema "myschema" does not exist
> CREATE SCHEMA myschema;
> CREATE TABLE myschema.t1(a1 int, b1 int);
> postgres=# CREATE SUBSCRIPTION testsub CONNECTION 'host=localhost
> dbname=postgres user=bharath port=5432' PUBLICATION testpub;
> NOTICE:  created replication slot "testsub" on publisher
> CREATE SUBSCRIPTION
>

Yes this feature will also have the same behavior, DDL creation should
be taken care of by DBA similar to how it is handled may be using
pg_dump or use sql scripts/statements to update.

> > > Since the schema can have other objects such as data types, functions,
> > > operators, I'm sure with your feature, non-table objects will be
> > > skipped.
> > >
> >
> > Yes, only table data will be sent to subscribers, non-table objects
> > will be skipped.
>
> Looks like the existing CREATE PUBLICATION FOR ALL TABLES, which is
> for all the tables in the database, does this i.e. skips non-table
> objects and temporary tables, foreign tables and so on. So, your
> feature also can behave the same way, but within the scope of the
> given schema/s.
>

Yes, it will support only normal tables. Non table objects, foreign
tables & temporary tables will not be supported.

> > > As Amit pointed out earlier, the behaviour when schema dropped, I
> > > think we should also consider when schema is altered, say altered to a
> > > different name, maybe we should change that in the publication too.
> > >
> >
> > I agree that when schema is altered the renamed schema should be
> > reflected in the publication.
>
> I think, it's not only making sure that the publisher side has the new
> altered schema, but also the subscriber needs those alters. Having
> said that, since these alters come under DDL changes and in logical
> replication we don't publish the scheme changes to the subscriber, we
> may not need to anything extra for informing the schema alters to the
> subscriber from the publisher, the users might have to do the same
> schema alter on the subscriber and then a ALTER SUBSCRIPTION testsub
> REFRESH PUBLICATION;  should work for them? If this understanding is
> correct, then we should document this.
>

Yes, alter schema changes will be reflected in the publication, the
corresponding change needs to be done by the user on the subscriber
side. Once a user does ALTER SUBSCRIPTION testsub REFRESH PUBLICATION,
the new altered schema changes will be reflected in the subscriber. I
will update the documentation that user need to take care for
subscription refresh.

> > > In general, what happens if we have some temporary tables or foreign
> > > tables inside the schema, will they be allowed to send the data to
> > > subscribers?
> > >
> >
> > Temporary tables & foreign tables will not be added to the publications.
>
> Yes the existing logical replication framework doesn't allow
> replication of temporary, unlogged, foreign tables and other non-table
> relations such as materialized views, indexes etc [1]. The CREATE
> PUBLICATION statement either fails in check_publication_add_relation
> or before that.
>
> CREATE PUBLICATION testpub FOR TABLE tab1, throwing the error if the
> single table tab1 is any of the above restricted tables, seems fine.
> But, if there's a list of tables with CREATE PUBLICATION testpub FOR
> TABLE normal_tab1, temp_tab2, normal_tab3, foreign_tab4,
> unlogged_tab5, normal_tab6, normal_tab7 ......; This query fails on
> first encounter of the restricted table, say at temp_tab2. Whereas,
> CREATE PUBLICATION testpub FOR ALL TABLES; would skip the restricted
> tables and continue to add the accepted tables into the publication
> within the database.
>
> IMHO, if there's a list of tables specified with FOR TABLE, then
> instead of throwing an error in case of any restricted table, we can
> issue a warning and continue with the addition of accepted tables into
> the publication. If done, this behaviour will be in sync with FOR ALL
> TABLES;
>
> Thoughts? If okay, I can work on a patch.
>

I feel we can start a new thread for this to seek opinion if this base
change is required and reach consensus.

> Related to error messages: when foreign table is specified in CREATE
> PUBLICATION statement, then "ERROR:  "f1" is not a table", is thrown
> [1], how about the error message "ERROR: foerign table "f1" cannot be
> replicated". In general, it would be good if we could have the error
> messages something like in [2] instead of the existing [1].
>
> Thoughts? If okay, I can work on a patch.
>

I feel we can start a new thread for this to seek opinion and reach consensus.

> [1]
> t1 is a temporary table:
> postgres=# CREATE PUBLICATION testpub FOR TABLE t1;
> ERROR:  table "t1" cannot be replicated
> DETAIL:  Temporary and unlogged relations cannot be replicated.
>
> t1 is an unlogged table:
> postgres=# CREATE PUBLICATION testpub FOR TABLE t1;
> ERROR:  table "t1" cannot be replicated
> DETAIL:  Temporary and unlogged relations cannot be replicated.
>
> f1 is a foreign table:
> postgres=# CREATE PUBLICATION testpub FOR TABLE f1;
> ERROR:  "f1" is not a table
> DETAIL:  Only tables can be added to publications.
>
> mv1 is a mat view:
> postgres=# CREATE PUBLICATION testpub FOR TABLE mv1;
> ERROR:  "mv1" is not a table
>
> idx1 is an index:
> postgres=# CREATE PUBLICATION testpub FOR TABLE idx1;
> ERROR:  "idx1" is an index
>
> [2]
> t1 is a temporary table:
> postgres=# CREATE PUBLICATION testpub FOR TABLE t1;
> ERROR:  temporary table "t1" cannot be replicated
> DETAIL:  Temporary, unlogged and foreign relations cannot be replicated.
>
> t1 is an unlogged table:
> postgres=# CREATE PUBLICATION testpub FOR TABLE t1;
> ERROR:  unlogged table "t1" cannot be replicated
> DETAIL:  Temporary, unlogged and foreign relations cannot be replicated.
>
> f1 is a foreign table:
> postgres=# CREATE PUBLICATION testpub FOR TABLE f1;
> ERROR:  foreign table "f1" cannot be replicated
> DETAIL:  Temporary, unlogged and foreign relations cannot be replicated.
>
> > > Say a user has created a publication for a schema with hundreds of
> > > tables in it, at some point later, can he stop replicating a single or
> > > some tables from that schema?
> > >
> >
> > There is no provision for this currently.
>
> The documentation [1] says, we can ALTER PUBLICATION testpub DROP
> TABLE t1; which removes the table from the list of published tables,
> but looks like it requires ALTER SUBSCRIPTION testsub REFRESH
> PUBLICATION; for the changes to become effective on the subscriber. I
> have done some testing for this case:
> 1) created publication for table t1, see \d+ t1, the associated
> publication is visible in the output
> 2) created subscription on the subscriber, initial available data from
> the publisher for table t1 is received
> 3) insert into table t1 on the publisher
> 4) inserted data in (3) is received in the subscriber table t1
> 5) alter publication to drop the table t1 on the publisher, see \d+
> t1, there will not be any associated publication in the output
> 6) execute alter subscription refresh publication on the subscriber,
> with the expectation that it should not receive the data from the
> publisher for the table t1 since it's dropped from the publication in
> (5)
> 7) insert into table t1 on the publisher
> 8) still the newly inserted data in (7) from the publisher, will be
> received into the table t1 in the subscriber
>
> IIUC, the behaviour of ALTER PUBLICATION DROP TABLE from the docs and
> the above use case, it looks like a bug to me. If I'm wrong, can
> someone correct me?
>
> Thoughts?

I think you started a new thread for this, let's conclude on this there.

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Konstantin Knizhnik
Date:
Subject: Re: libpq compression
Next
From: Tom Lane
Date:
Subject: Re: Executing inet_server_addr/port() in parallel workers