Thread: Restore replication settings when modifying a field type

Restore replication settings when modifying a field type

From
Quan Zongliang
Date:
When the user modifies the REPLICA IDENTIFY field type, the logical 
replication settings are lost.

For example:

postgres=# \d+ t1
                                     Table "public.t1"
  Column |  Type   | Collation | Nullable | Default | Storage | Stats 
target | Description
--------+---------+-----------+----------+---------+---------+--------------+-------------
  col1   | integer |           |          |         | plain   | 
     |
  col2   | integer |           | not null |         | plain   | 
     |
Indexes:
     "t1_col2_key" UNIQUE CONSTRAINT, btree (col2) REPLICA IDENTITY


postgres=# alter table t1 alter col2 type smallint;
ALTER TABLE
postgres=# \d+ t1
                                      Table "public.t1"
  Column |   Type   | Collation | Nullable | Default | Storage | Stats 
target | Description
--------+----------+-----------+----------+---------+---------+--------------+-------------
  col1   | integer  |           |          |         | plain   | 
      |
  col2   | smallint |           | not null |         | plain   | 
      |
Indexes:
     "t1_col2_key" UNIQUE CONSTRAINT, btree (col2)

In fact, the replication property of the table has not been modified, 
and it is still 'i'(REPLICA_IDENTITY_INDEX). But the previously 
specified index property 'indisreplident' is set to false because of the 
rebuild.

So I developed a patch. If the user modifies the field type. The 
associated index is REPLICA IDENTITY. Rebuild and restore replication 
settings.

Regards,
Quan Zongliang

Attachment

Re: Restore replication settings when modifying a field type

From
Kyotaro Horiguchi
Date:
Hello.

# The patch no longer applies on the current master. Needs a rebasing.

At Sat, 26 Oct 2019 16:50:48 +0800, Quan Zongliang <quanzongliang@gmail.com> wrote in 
> In fact, the replication property of the table has not been modified,
> and it is still 'i'(REPLICA_IDENTITY_INDEX). But the previously
> specified index property 'indisreplident' is set to false because of
> the rebuild.

I suppose that the behavior is intended. Change of column types on the
publisher side can break the agreement on replica identity with
subscribers. Thus replica identity setting cannot be restored
unconditionally. For (somewhat artifitial :p) example:

P=# create table t (c1 integer, c2 text unique not null);
P=# alter table t replica identity using index t_c2_key;
P=# create publication p1 for table t;
P=# insert into t values (0, '00'), (1, '01');
S=# create table t (c1 integer, c2 text unique not null);
S=# alter table t replica identity using index t_c2_key;
S=# create subscription s1 connection '...' publication p1;

Your patch allows change of the type of c2 into integer.

P=# alter table t alter column c2 type integer using c2::integer;
P=# update t set c1 = c1 + 1 where c2 = '01';

This change doesn't affect perhaps as expected.

S=# select * from t;
 c1 | c2 
----+----
  0 | 00
  1 | 01
(2 rows)


> So I developed a patch. If the user modifies the field type. The
> associated index is REPLICA IDENTITY. Rebuild and restore replication
> settings.

Explicit setting of replica identity premises that they are sure that
the setting works correctly. Implicit rebuilding after a type change
can silently break it.

At least we need to guarantee that the restored replica identity
setting is truly compatible with all existing subscribers. I'm not
sure about potential subscribers..

Anyway I think it is a problem that replica identity setting is
dropped silently. Perhaps a message something like "REPLICA IDENTITY
setting is lost, please redefine after confirmation of compatibility
with subscribers." is needed.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Restore replication settings when modifying a field type

From
Quan Zongliang
Date:
On 2019/10/28 12:39, Kyotaro Horiguchi wrote:
> Hello.
> 
> # The patch no longer applies on the current master. Needs a rebasing.
> 
> At Sat, 26 Oct 2019 16:50:48 +0800, Quan Zongliang <quanzongliang@gmail.com> wrote in
>> In fact, the replication property of the table has not been modified,
>> and it is still 'i'(REPLICA_IDENTITY_INDEX). But the previously
>> specified index property 'indisreplident' is set to false because of
>> the rebuild.
> 
> I suppose that the behavior is intended. Change of column types on the
> publisher side can break the agreement on replica identity with
> subscribers. Thus replica identity setting cannot be restored
> unconditionally. For (somewhat artifitial :p) example:
> 
> P=# create table t (c1 integer, c2 text unique not null);
> P=# alter table t replica identity using index t_c2_key;
> P=# create publication p1 for table t;
> P=# insert into t values (0, '00'), (1, '01');
> S=# create table t (c1 integer, c2 text unique not null);
> S=# alter table t replica identity using index t_c2_key;
> S=# create subscription s1 connection '...' publication p1;
> 
> Your patch allows change of the type of c2 into integer.
> 
> P=# alter table t alter column c2 type integer using c2::integer;
> P=# update t set c1 = c1 + 1 where c2 = '01';
> 
> This change doesn't affect perhaps as expected.
> 
> S=# select * from t;
>   c1 | c2
> ----+----
>    0 | 00
>    1 | 01
> (2 rows)
> 
> 
>> So I developed a patch. If the user modifies the field type. The
>> associated index is REPLICA IDENTITY. Rebuild and restore replication
>> settings.
> 
> Explicit setting of replica identity premises that they are sure that
> the setting works correctly. Implicit rebuilding after a type change
> can silently break it.
> 
> At least we need to guarantee that the restored replica identity
> setting is truly compatible with all existing subscribers. I'm not
> sure about potential subscribers..
> 
> Anyway I think it is a problem that replica identity setting is
> dropped silently. Perhaps a message something like "REPLICA IDENTITY
> setting is lost, please redefine after confirmation of compatibility
> with subscribers." is needed.
> 
In fact, the scene we encountered is like this. The field of a user's 
table is of type "smallint", and it turns out that this range is not 
sufficient. So change it to "int". At this point, the REPLICA IDENTITY 
is lost and the user does not realize it. When they found out, the 
logical replication for this period of time did not output normally. 
Users have to find other ways to get the data back.
The logical replication of this user is to issue standard SQL statements 
to other relational databases using the plugin developed by himself. And 
they have thousands of tables to replicate.
So I think this patch is appropriate in this scenario. As for the 
matching problem between publishers and subscribers, I'm afraid it's 
hard to solve here. If this is not a suitable modification, I can 
withdraw it. And see if there's a better way.

If necessary, I'll modify it again. Rebase to the master branch.

> regards.
> 




Re: Restore replication settings when modifying a field type

From
Euler Taveira
Date:
Em seg, 28 de out de 2019 às 01:41, Kyotaro Horiguchi
<horikyota.ntt@gmail.com> escreveu:
>
> At Sat, 26 Oct 2019 16:50:48 +0800, Quan Zongliang <quanzongliang@gmail.com> wrote in
> > In fact, the replication property of the table has not been modified,
> > and it is still 'i'(REPLICA_IDENTITY_INDEX). But the previously
> > specified index property 'indisreplident' is set to false because of
> > the rebuild.
>
> I suppose that the behavior is intended. Change of column types on the
> publisher side can break the agreement on replica identity with
> subscribers. Thus replica identity setting cannot be restored
> unconditionally. For (somewhat artifitial :p) example:
>
I don't think so. The actual logical replication behavior is that DDL
will always break replication. If you add a new column or drop a
column, you will stop replication for that table while you don't
execute the same DDL in the subscriber. What happens in the OP case is
that a DDL is *silently* breaking the logical replication. IMHO it is
a bug. If the behavior was intended it should clean
pg_class.relreplident but it is not.

> Explicit setting of replica identity premises that they are sure that
> the setting works correctly. Implicit rebuilding after a type change
> can silently break it.
>
The current behavior forces the OP to execute 2 DDLs in the same
transaction to ensure that it won't "loose" transactions for logical
replication.

> At least we need to guarantee that the restored replica identity
> setting is truly compatible with all existing subscribers. I'm not
> sure about potential subscribers..
>
Why? Replication will break and to fix it you should apply the same
DDL you apply in publisher. It is the right thing to do.

[poking the code...]

ATExecAlterColumnType records everything that depends on the column
and for indexes it saves the definition (via pg_get_indexdef_string).
Definition is not sufficient for reconstructing the replica identity
information because there is not such keyword for replica identity in
CREATE INDEX. The new index should call relation_mark_replica_identity
to fix pg_index.indisreplident.


--
   Euler Taveira                                   Timbira -
http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento



Re: Restore replication settings when modifying a field type

From
Quan Zongliang
Date:
On 2019/10/28 12:39, Kyotaro Horiguchi wrote:
> Hello.
> 
> # The patch no longer applies on the current master. Needs a rebasing.
> 
New patch, rebased on master branch.

> At Sat, 26 Oct 2019 16:50:48 +0800, Quan Zongliang <quanzongliang@gmail.com> wrote in
>> In fact, the replication property of the table has not been modified,
>> and it is still 'i'(REPLICA_IDENTITY_INDEX). But the previously
>> specified index property 'indisreplident' is set to false because of
>> the rebuild.
> 
> I suppose that the behavior is intended. Change of column types on the
> publisher side can break the agreement on replica identity with
> subscribers. Thus replica identity setting cannot be restored
> unconditionally. For (somewhat artifitial :p) example:
> 
> P=# create table t (c1 integer, c2 text unique not null);
> P=# alter table t replica identity using index t_c2_key;
> P=# create publication p1 for table t;
> P=# insert into t values (0, '00'), (1, '01');
> S=# create table t (c1 integer, c2 text unique not null);
> S=# alter table t replica identity using index t_c2_key;
> S=# create subscription s1 connection '...' publication p1;
> 
> Your patch allows change of the type of c2 into integer.
> 
> P=# alter table t alter column c2 type integer using c2::integer;
> P=# update t set c1 = c1 + 1 where c2 = '01';
> 
> This change doesn't affect perhaps as expected.
> 
> S=# select * from t;
>   c1 | c2
> ----+----
>    0 | 00
>    1 | 01
> (2 rows)
> 
> 
>> So I developed a patch. If the user modifies the field type. The
>> associated index is REPLICA IDENTITY. Rebuild and restore replication
>> settings.
> 
> Explicit setting of replica identity premises that they are sure that
> the setting works correctly. Implicit rebuilding after a type change
> can silently break it.
> 
> At least we need to guarantee that the restored replica identity
> setting is truly compatible with all existing subscribers. I'm not
> sure about potential subscribers..
> 
> Anyway I think it is a problem that replica identity setting is
> dropped silently. Perhaps a message something like "REPLICA IDENTITY
> setting is lost, please redefine after confirmation of compatibility
> with subscribers." is needed.
> 
> regards.
> 


Attachment

Re: Restore replication settings when modifying a field type

From
Peter Eisentraut
Date:
On 2019-11-01 04:39, Euler Taveira wrote:
> ATExecAlterColumnType records everything that depends on the column
> and for indexes it saves the definition (via pg_get_indexdef_string).
> Definition is not sufficient for reconstructing the replica identity
> information because there is not such keyword for replica identity in
> CREATE INDEX. The new index should call relation_mark_replica_identity
> to fix pg_index.indisreplident.

Yeah, I don't think we need to do the full dance of reverse compiling 
the SQL command and reexecuting it, as the patch currently does.  That's 
only necessary for rebuilding the index itself.  For re-setting the 
replica identity, we can just use the internal API as you say.

Also, a few test cases would be nice for this patch.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Restore replication settings when modifying a field type

From
Quan Zongliang
Date:
On 2020/1/3 17:14, Peter Eisentraut wrote:
> On 2019-11-01 04:39, Euler Taveira wrote:
>> ATExecAlterColumnType records everything that depends on the column
>> and for indexes it saves the definition (via pg_get_indexdef_string).
>> Definition is not sufficient for reconstructing the replica identity
>> information because there is not such keyword for replica identity in
>> CREATE INDEX. The new index should call relation_mark_replica_identity
>> to fix pg_index.indisreplident.
> 
> Yeah, I don't think we need to do the full dance of reverse compiling 
> the SQL command and reexecuting it, as the patch currently does.  That's 
> only necessary for rebuilding the index itself.  For re-setting the 
> replica identity, we can just use the internal API as you say.
> 
> Also, a few test cases would be nice for this patch.
> 

I'm a little busy. I'll write a new patch in a few days.



Re: Restore replication settings when modifying a field type

From
Quan Zongliang
Date:
On 2020/1/3 17:14, Peter Eisentraut wrote:
> On 2019-11-01 04:39, Euler Taveira wrote:
>> ATExecAlterColumnType records everything that depends on the column
>> and for indexes it saves the definition (via pg_get_indexdef_string).
>> Definition is not sufficient for reconstructing the replica identity
>> information because there is not such keyword for replica identity in
>> CREATE INDEX. The new index should call relation_mark_replica_identity
>> to fix pg_index.indisreplident.
> 
> Yeah, I don't think we need to do the full dance of reverse compiling 
> the SQL command and reexecuting it, as the patch currently does.  That's 
> only necessary for rebuilding the index itself.  For re-setting the 
> replica identity, we can just use the internal API as you say.
> 
> Also, a few test cases would be nice for this patch.
> 

I'm a little busy. I'll write a new patch in a few days.



Re: Restore replication settings when modifying a field type

From
Quan Zongliang
Date:
On 2020/1/15 08:30, Quan Zongliang wrote:
> On 2020/1/3 17:14, Peter Eisentraut wrote:
>> On 2019-11-01 04:39, Euler Taveira wrote:
>>> ATExecAlterColumnType records everything that depends on the column
>>> and for indexes it saves the definition (via pg_get_indexdef_string).
>>> Definition is not sufficient for reconstructing the replica identity
>>> information because there is not such keyword for replica identity in
>>> CREATE INDEX. The new index should call relation_mark_replica_identity
>>> to fix pg_index.indisreplident.
>>
>> Yeah, I don't think we need to do the full dance of reverse compiling 
>> the SQL command and reexecuting it, as the patch currently does.  
>> That's only necessary for rebuilding the index itself.  For re-setting 
>> the replica identity, we can just use the internal API as you say.
>>
>> Also, a few test cases would be nice for this patch.
>>
> 
> I'm a little busy. I'll write a new patch in a few days.

new patch attached.


test case 1:
create table t1 (col1 int,col2 int not null,
   col3 int not null,unique(col2,col3));
alter table t1 replica identity using index t1_col2_col3_key;
alter table t1 alter col3 type text;
alter table t1 alter col3 type smallint using col3::int;

test case 2:
create table t2 (col1 varchar(10), col2 text not null,
   col3 timestamp not null,unique(col2,col3),
   col4 int not null unique);
alter table t2 replica identity using index t2_col2_col3_key;
alter table t2 alter col3 type text;
alter table t2 replica identity using index t2_col4_key;
alter table t2 alter col4 type timestamp using '2020-02-11'::timestamp;


Attachment

Re: Restore replication settings when modifying a field type

From
Peter Eisentraut
Date:
On 2020-02-11 00:38, Quan Zongliang wrote:
> new patch attached.

I didn't like so much how the updating of the replica identity was 
hacked into the middle of ATRewriteCatalogs().  I have an alternative 
proposal in the attached patch that queues up an ALTER TABLE ... REPLICA 
IDENTITY command into the normal ALTER TABLE processing.  I have also 
added tests to the test suite.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: Restore replication settings when modifying a field type

From
Euler Taveira
Date:
On Thu, 5 Mar 2020 at 09:45, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:
On 2020-02-11 00:38, Quan Zongliang wrote:
> new patch attached.

I didn't like so much how the updating of the replica identity was
hacked into the middle of ATRewriteCatalogs().  I have an alternative
proposal in the attached patch that queues up an ALTER TABLE ... REPLICA
IDENTITY command into the normal ALTER TABLE processing.  I have also
added tests to the test suite.

LGTM. Tests are ok. I've rebased it (because 61d7c7bce3686ec02bd64abac742dd35ed9b9b01). Are you planning to backpatch it? IMHO you should because it is a bug (since REPLICA IDENTITY was introduced in 9.4). This patch can be applied as-is in 12 but not to other older branches. I attached new patches.

Regards,


--
Euler Taveira                 http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment

Re: Restore replication settings when modifying a field type

From
Peter Eisentraut
Date:
On 2020-03-10 14:16, Euler Taveira wrote:
> On Thu, 5 Mar 2020 at 09:45, Peter Eisentraut 
> <peter.eisentraut@2ndquadrant.com 
> <mailto:peter.eisentraut@2ndquadrant.com>> wrote:
> 
>     On 2020-02-11 00:38, Quan Zongliang wrote:
>      > new patch attached.
> 
>     I didn't like so much how the updating of the replica identity was
>     hacked into the middle of ATRewriteCatalogs().  I have an alternative
>     proposal in the attached patch that queues up an ALTER TABLE ...
>     REPLICA
>     IDENTITY command into the normal ALTER TABLE processing.  I have also
>     added tests to the test suite.
> 
> LGTM. Tests are ok. I've rebased it (because 
> 61d7c7bce3686ec02bd64abac742dd35ed9b9b01). Are you planning to backpatch 
> it? IMHO you should because it is a bug (since REPLICA IDENTITY was 
> introduced in 9.4). This patch can be applied as-is in 12 but not to 
> other older branches. I attached new patches.

Thanks.  This has been committed and backpatched to 9.5.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services