Re: Column Filtering in Logical Replication - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Column Filtering in Logical Replication
Date
Msg-id CAA4eK1L+=1aP96nCS_mAnWU6b=hj_L7X8OUMQ=ghmnzQFrjbGQ@mail.gmail.com
In response to Re: Column Filtering in Logical Replication  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
Responses Re: Column Filtering in Logical Replication
List pgsql-hackers
On Mon, Sep 6, 2021 at 11:21 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> On 2021-Sep-06, Rahila Syed wrote:
>
> > > > ... ugh.  Since CASCADE is already defined to be a
> > > > potentially-data-loss operation, then that may be acceptable
> > > > behavior.  For sure the default RESTRICT behavior shouldn't do it,
> > > > though.
> > >
> > > That makes sense to me.
> >
> > However, the default (RESTRICT) behaviour of DROP TABLE allows
> > removing the table from the publication. I have implemented the
> > removal of table from publication on drop column (RESTRICT)  on the
> > same lines.
>
> But dropping the table is quite a different action from dropping a
> column, isn't it?  If you drop a table, it seems perfectly reasonable
> that it has to be removed from the publication -- essentially, when the
> user drops a table, she is saying "I don't care about this table
> anymore".  However, if you drop just one column, that doesn't
> necessarily mean that the user wants to stop publishing the whole table.
> Removing the table from the publication in ALTER TABLE DROP COLUMN seems
> like an overreaction.  (Except perhaps in the special case where the
> column being dropped is the only one that was being published.)
>
> So let's discuss what should happen.  If you drop a column, and the
> column is filtered out, then it seems to me that the publication should
> continue to have the table, and it should continue to filter out the
> other columns that were being filtered out, regardless of CASCADE/RESTRICT.
>

Yeah, for this case we don't need to do anything, and I am not sure
whether the patch is dropping tables in this case.

> However, if the column is *included* in the publication, and you drop
> it, ISTM there are two cases:
>
> 1. If it's DROP CASCADE, then the list of columns to replicate should
> continue to have all columns it previously had, so just remove the
> column that is being dropped.
>

Note that in a somewhat similar case for indexes (where the index
has an expression), we drop the index if one of the columns used in the
index expression is dropped. So here we might want to either remove the
entire filter, instead of removing just the particular column, or
remove the entire table from the publication as Rahila is proposing.

I think removing just a particular column can break replication
of UPDATEs and DELETEs if the removed column is part of the replica
identity. If the entire filter is removed, replication can break
as well, so I think Rahila's proposal is worth considering.
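
To illustrate the replica identity concern, here is a toy Python check.
It is only a model, not PostgreSQL code: the function name and the
set-based representation of a filter are made up for illustration.
The assumption is that UPDATE and DELETE can be replicated only if
every replica identity column is among the published columns.

```python
def can_replicate_update_delete(column_filter, replica_identity):
    """Hypothetical check: True if every replica identity column is published.

    column_filter: set of published column names, or None for "no filter".
    replica_identity: set of column names forming the replica identity.
    """
    if not replica_identity:
        # Without any identity columns there is nothing to locate rows by.
        return False
    if column_filter is None:
        # No filter means all columns are published.
        return True
    return set(replica_identity) <= set(column_filter)


# Filter (id, name) with replica identity (id): the identity is covered.
before = can_replicate_update_delete({"id", "name"}, {"id"})

# Taking "id" out of the filter leaves the identity column unpublished.
after = can_replicate_update_delete({"name"}, {"id"})
```

In this model `before` is true and `after` is false, which is the
breakage described above.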

> 2. If it's DROP RESTRICT, then an error should be raised so that the
> user can make a conscious decision to remove the column from the filter
> before dropping the column.
>

I think one can make a similar argument for indexes. If we allow
an index to be dropped even with RESTRICT, then why not a column filter?
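
To compare the options on the table, here is a rough Python model of
what each one would do when a column is dropped. This is a sketch only:
the Policy names, DependencyError, and drop_column() are invented for
illustration and do not correspond to anything in the patch or in
PostgreSQL.

```python
from enum import Enum


class Policy(Enum):
    ERROR_ON_RESTRICT = "error unless CASCADE, then drop column from filter"
    DROP_FILTER = "remove the whole filter, as an index would be dropped"
    DROP_TABLE = "remove the table from the publication"


class DependencyError(Exception):
    """Stand-in for the error a RESTRICT drop would raise."""


def drop_column(column_filter, column, cascade, policy):
    """Return (still_published, new_filter) after dropping `column`.

    column_filter is a set of published column names. A new_filter of
    None means there is no filter left.
    """
    if column not in column_filter:
        # The column was already filtered out: nothing changes.
        return True, set(column_filter)
    if policy is Policy.ERROR_ON_RESTRICT:
        if not cascade:
            raise DependencyError(f"column {column!r} is used in a column filter")
        remaining = set(column_filter) - {column}
        # Dropping the only published column leaves nothing to publish.
        return bool(remaining), remaining
    if policy is Policy.DROP_FILTER:
        # The table stays in the publication, but unfiltered.
        return True, None
    # Policy.DROP_TABLE: the table leaves the publication entirely.
    return False, None
```

Only the first policy looks at CASCADE/RESTRICT at all; the other two
behave the same either way, which is how dropping a column treats an
index on it today.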

> > Did you give any thoughts to my earlier suggestion related to syntax [1]?
> >
> > [1] https://www.postgresql.org/message-id/CAA4eK1J9b_0_PMnJ2jq9E55bcbmTKdUmy6jPnkf1Zwy2jxah_g%40mail.gmail.com
>
> This is a great followup idea, after the current feature is committed.
>

As mentioned in my response to Rahila, I was just thinking of using an
optional keyword Column for column filter so that we can extend it
later.

-- 
With Regards,
Amit Kapila.


