Re: Skipping schema changes in publication - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Skipping schema changes in publication
Msg-id CAA4eK1KikfJMVggZ_aNWPio7zrdObVdidG3SZVWEZ_LOU=vXkg@mail.gmail.com
In response to Re: Skipping schema changes in publication  ("Euler Taveira" <euler@eulerto.com>)
Responses Re: Skipping schema changes in publication  (vignesh C <vignesh21@gmail.com>)
List pgsql-hackers
On Fri, Apr 15, 2022 at 1:26 AM Euler Taveira <euler@eulerto.com> wrote:
>
> On Thu, Apr 14, 2022, at 10:47 AM, Peter Eisentraut wrote:
>
> On 12.04.22 08:23, vignesh C wrote:
> > I have also included the implementation for skipping a few tables from
> > all tables publication, the 0002 patch has the implementation for the
> > same.
> > This feature is helpful for use cases where the user wants to
> > subscribe to all the changes except for the changes present in a few
> > tables.
> > Ex:
> > CREATE PUBLICATION pub1 FOR ALL TABLES SKIP TABLE t1,t2;
> > OR
> > ALTER PUBLICATION pub1 ADD SKIP  TABLE t1,t2;
>
> We have already allocated the "skip" terminology for skipping
> transactions, which is a dynamic run-time action.  We are also using the
> term "skip" elsewhere to skip locked rows, which is similarly a run-time
> action.  I think it would be confusing to use the term SKIP for DDL
> construction.
>
> I didn't like the SKIP choice too. We already have EXCEPT for IMPORT FOREIGN
> SCHEMA and if I were to suggest a keyword, it would be EXCEPT.
>

+1 for EXCEPT.

> I would also think about this in broader terms.  For example, sometimes
> people want features like "all columns except these" in certain places.
> The syntax for those things should be similar.
>
> The questions are:
> What kind of issues does it solve?

As far as I understand, it is for usability. Otherwise, users need to
list all required columns' names even when they want to hide only a
few columns in the table. Consider a user who doesn't want to publish
the 'salary' or other sensitive information of executives/employees
but would like to publish all other columns. I feel that in such cases
it will be a lot of work for the user, especially when the table has
many columns. I see that Oracle has a similar feature [1]. I think
that without this, it will be difficult for users to use this feature
in some cases.

> Do we have a workaround for it?
>

I can't think of any, other than the user manually listing all the
required columns. Can you think of any other workaround?
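
As a rough sketch of that manual workaround, assuming the column-list
syntax for publications in PostgreSQL 15 and a made-up table
(employees, its columns, and pub_emp are illustrative names only):

```sql
-- Hypothetical table; the names are assumptions for illustration.
CREATE TABLE employees (
    id     int PRIMARY KEY,
    name   text,
    dept   text,
    salary numeric
);

-- Workaround: name every column to publish, leaving out 'salary'.
-- The list has to be kept up to date by hand as columns are added.
CREATE PUBLICATION pub_emp FOR TABLE employees (id, name, dept);
```

With only four columns this is manageable, but the list grows with the
table, which is the usability concern described above.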

> That said, I'm not sure this feature is worth the trouble.  If this is
> useful, what about "whole database except these schemas"?  What about
> "create this database from this template except these schemas".  This
> could get out of hand.  I think we should encourage users to group their
> object the way they want and not offer these complicated negative
> selection mechanisms.
>
> I have the same impression too. We already provide a way to:
>
> * include individual tables;
> * include all tables;
> * include all tables in a certain schema.
>
> Doesn't it cover the majority of the use cases?
>

Similar to columns, the same applies to tables. Users need to manually
add all tables for a database even when they want to avoid only a
handful of tables, say because those tables contain sensitive
information or are not required. I think we don't need to cover all
possible exceptions, but a few where users can avoid some tables would
be useful. If not, what kind of alternative do users have except for
listing all columns or all tables that are required?
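
For tables, the current alternative looks roughly like this, assuming
hypothetical tables t1 through t5 where only t4 should stay
unpublished (the table and publication names are made up):

```sql
-- Publish everything except t4 by listing each wanted table.
CREATE PUBLICATION pub_most FOR TABLE t1, t2, t3;

-- A table created later is not picked up automatically and has to
-- be added by hand.
ALTER PUBLICATION pub_most ADD TABLE t5;
```

Unlike FOR ALL TABLES, such a publication does not include newly
created tables on its own, so the list needs ongoing maintenance.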


[1] -
https://docs.oracle.com/en/cloud/paas/goldengate-cloud/gwuad/selecting-columns.html#GUID-9A851C8B-48F7-43DF-8D98-D086BE069E20

-- 
With Regards,
Amit Kapila.


