Thread: Add a GUC variable that control logical replication

Add a GUC variable that control logical replication

From
Quan Zongliang
Date:

Sybase has a feature to turn off replication at the session level: set 
replication = off, which can be temporarily turned off when there is a 
maintenance action on the table. Our users also want this feature.
I add a new flag bit in xinfo, control it with a session-level variable, 
when set to true, this flag is written when the transaction is 
committed, and when the logic is decoded it abandons the transaction 
like aborted transactions. Since PostgreSQL has two types of 
replication, I call the variable "logical_replication" to avoid 
confusion and default value is true.

Sample SQL

insert into a values(100);
set logical_replication to off;
insert into a values(200);
reset logical_replication;
insert into a values(300);

pg_recvlogical output(the second is not output.)
BEGIN 492
table public.a: INSERT: col1[integer]:100
COMMIT 492
BEGIN 494
table public.a: INSERT: col1[integer]:300
COMMIT 494

I'm not sure this is the most appropriate way. What do you think?

Regards,
Quan Zongliang

Attachment

Re: Add a GUC variable that control logical replication

From
Peter Eisentraut
Date:
On 2019-09-18 10:39, Quan Zongliang wrote:
> Sybase has a feature to turn off replication at the session level: set 
> replication = off, which can be temporarily turned off when there is a 
> maintenance action on the table. Our users also want this feature.

These kinds of feature requests are always dubious because just because
Sybase behaves this way for some implementation or architectural reason
doesn't necessarily mean it makes sense for PostgreSQL too.

Why do you need to turn off replication when there is "maintenance" on a
table?  What does that even mean?

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Add a GUC variable that control logical replication

From
Quan Zongliang
Date:
On 2019/9/18 17:11, Peter Eisentraut wrote:
> On 2019-09-18 10:39, Quan Zongliang wrote:
>> Sybase has a feature to turn off replication at the session level: set
>> replication = off, which can be temporarily turned off when there is a
>> maintenance action on the table. Our users also want this feature.
> 
> These kinds of feature requests are always dubious because just because
> Sybase behaves this way for some implementation or architectural reason
> doesn't necessarily mean it makes sense for PostgreSQL too.
> 
Agree
> Why do you need to turn off replication when there is "maintenance" on a
> table?  What does that even mean?
> 
In a table, the user only keep data for a period of time and delete 
expired records every day, involving about 10 million to 20 million 
records at a time. They want to not pass similar bulk operations in 
logical replication.
My English is bad that I use the wrong word “maintenance” in my description.




Re: Add a GUC variable that control logical replication

From
Peter Eisentraut
Date:
On 2019-09-18 11:33, Quan Zongliang wrote:
> On 2019/9/18 17:11, Peter Eisentraut wrote:
>> Why do you need to turn off replication when there is "maintenance" on a
>> table?  What does that even mean?
>>
> In a table, the user only keep data for a period of time and delete 
> expired records every day, involving about 10 million to 20 million 
> records at a time. They want to not pass similar bulk operations in 
> logical replication.

You can probably achieve that using ALTER PUBLICATION to disable
publication of deletes or truncates, as the case may be, either
permanently or just for the duration of the operations you want to skip.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Add a GUC variable that control logical replication

From
Euler Taveira
Date:
Em sáb, 19 de out de 2019 às 14:11, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> escreveu:
>
> On 2019-09-18 11:33, Quan Zongliang wrote:
> > On 2019/9/18 17:11, Peter Eisentraut wrote:
> >> Why do you need to turn off replication when there is "maintenance" on a
> >> table?  What does that even mean?
> >>
> > In a table, the user only keep data for a period of time and delete
> > expired records every day, involving about 10 million to 20 million
> > records at a time. They want to not pass similar bulk operations in
> > logical replication.
>
> You can probably achieve that using ALTER PUBLICATION to disable
> publication of deletes or truncates, as the case may be, either
> permanently or just for the duration of the operations you want to skip.
>
... then you are skipping all tables in the publication. I think this
feature is not essential for unidirectional logical replication.
However, it is important for multi-master replication. Data
synchronization tool will generate transactions with rows that are
already in the other node(s) so those transactions can't be replicated
to avoid (expensive) conflicts.


--
   Euler Taveira                                   Timbira -
http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento



Re: Add a GUC variable that control logical replication

From
Peter Eisentraut
Date:
On 2019-10-20 00:23, Euler Taveira wrote:
>> You can probably achieve that using ALTER PUBLICATION to disable
>> publication of deletes or truncates, as the case may be, either
>> permanently or just for the duration of the operations you want to skip.
>>
> ... then you are skipping all tables in the publication.

You can group tables into different publications and set the 
subscription to subscribe to multiple publications if you need this kind 
of granularity.

In any case, this kind of thing needs to be handled by the decoding 
plugin based on its configuration policies and depending on its needs. 
For example, let's say you have two decoding plugins running: one for a 
replication system and one for writing an audit log.  It would not be 
appropriate to disable logging for both of them because of some 
performance optimization for one of them.  And it would also not be 
appropriate to do this with a USERSET setting.

If we need different hooks or more DDL commands do this better, then 
that can be considered.  But this seems to be the wrong way to do it.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Add a GUC variable that control logical replication

From
Quan Zongliang
Date:
On 2019/11/1 20:49, Peter Eisentraut wrote:
> On 2019-10-20 00:23, Euler Taveira wrote:
>>> You can probably achieve that using ALTER PUBLICATION to disable
>>> publication of deletes or truncates, as the case may be, either
>>> permanently or just for the duration of the operations you want to skip.
>>>
>> ... then you are skipping all tables in the publication.
> 
> You can group tables into different publications and set the 
> subscription to subscribe to multiple publications if you need this kind 
> of granularity.
> 
> In any case, this kind of thing needs to be handled by the decoding 
> plugin based on its configuration policies and depending on its needs. 
> For example, let's say you have two decoding plugins running: one for a 
> replication system and one for writing an audit log.  It would not be 
> appropriate to disable logging for both of them because of some 
> performance optimization for one of them.  And it would also not be 
> appropriate to do this with a USERSET setting.
> 
> If we need different hooks or more DDL commands do this better, then 
> that can be considered.  But this seems to be the wrong way to do it.
> 

What the user needs is the same replication link that selectively skips 
some transactions. And this choice only affects transactions that are 
doing bulk delete sessions. The operations of other sessions are not 
affected and can continue to output replication messages.
For example, session 1 wants to bulk delete 1 million old data from the 
T1 table, which can be done without replication. At the same time, 
session 2 deletes 10 records from T1, which is expected to be passed on 
through replication.
Therefore, the two decoders can not meet this requirement. It is also 
inappropriate to temporarily disable subscriptions because it skips all 
transactions for a certain period of time.

-- 
权宗亮
神州飞象(北京)数据科技有限公司
我们的力量源自最先进的开源数据库PostgreSQL
zongliang.quan@postgresdata.com




Re: Add a GUC variable that control logical replication

From
Craig Ringer
Date:
On Wed, 18 Sep 2019 at 16:39, Quan Zongliang <zongliang.quan@postgresdata.com> wrote:

Sybase has a feature to turn off replication at the session level: set
replication = off, which can be temporarily turned off when there is a
maintenance action on the table. Our users also want this feature.
I add a new flag bit in xinfo, control it with a session-level variable,
when set to true, this flag is written when the transaction is
committed, and when the logic is decoded it abandons the transaction
like aborted transactions. Since PostgreSQL has two types of
replication, I call the variable "logical_replication" to avoid
confusion and default value is true.

There's something related to this already. You can set the replication origin for the transaction to the special value DoNotReplicateId (replication origin id 65535). This will suppress replication of the transaction, at least for output plugins that're aware of replication origins.

This isn't presently exposed to SQL, it's there for the use of logical replication extensions. It's possible to expose it with a pretty trivial C function in an extension.

I think it's a bit of a hack TBH, it's something I perpetrated sometime in the 9.4 series when we needed a way to suppress replication of individual transactions. It originated out of core, so the original design was constrained in how it worked, and maybe it would've actually made more sense to use an xlinfo flag. Probably not worth changing now though.

Be extremely careful though. If you're hiding things from logical replication you can get all sorts of confusing and exciting results. I very strongly suggest you make anything like this superuser-only.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 2ndQuadrant - PostgreSQL Solutions for the Enterprise

Re: Add a GUC variable that control logical replication

From
Michael Paquier
Date:
On Wed, Nov 06, 2019 at 10:01:43PM +0800, Quan Zongliang wrote:
> What the user needs is the same replication link that selectively skips some
> transactions. And this choice only affects transactions that are doing bulk
> delete sessions. The operations of other sessions are not affected and can
> continue to output replication messages.
> For example, session 1 wants to bulk delete 1 million old data from the T1
> table, which can be done without replication. At the same time, session 2
> deletes 10 records from T1, which is expected to be passed on through
> replication.
> Therefore, the two decoders can not meet this requirement. It is also
> inappropriate to temporarily disable subscriptions because it skips all
> transactions for a certain period of time.

Hmm.  The patch discussed on this thread does not have much support
from Peter and Craig, so I am marking it as RwF.
--
Michael

Attachment

Re: Add a GUC variable that control logical replication

From
Craig Ringer
Date:
On Thu, 28 Nov 2019 at 11:53, Michael Paquier <michael@paquier.xyz> wrote:
On Wed, Nov 06, 2019 at 10:01:43PM +0800, Quan Zongliang wrote:
> What the user needs is the same replication link that selectively skips some
> transactions. And this choice only affects transactions that are doing bulk
> delete sessions. The operations of other sessions are not affected and can
> continue to output replication messages.
> For example, session 1 wants to bulk delete 1 million old data from the T1
> table, which can be done without replication. At the same time, session 2
> deletes 10 records from T1, which is expected to be passed on through
> replication.
> Therefore, the two decoders can not meet this requirement. It is also
> inappropriate to temporarily disable subscriptions because it skips all
> transactions for a certain period of time.

Hmm.  The patch discussed on this thread does not have much support
from Peter and Craig, so I am marking it as RwF.


Yeah. I'm not against it as such. But I'd like to either see it work by exposing the ability to use DoNotReplicateId to SQL or if that's not satisfactory, potentially replace that mechanism with the newly added one and emulate DoNotReplicateId for BC.

I don't want two orthogonal ways to say "don't consider this for logical replication".
--
 Craig Ringer                   http://www.2ndQuadrant.com/
 2ndQuadrant - PostgreSQL Solutions for the Enterprise