Thread: Reducing bandwidth usage of database replication

Reducing bandwidth usage of database replication

From
Sascha Zenglein
Date:
Hi all,

I want to use the postgres-native logical replication to have multiple clients receive and send data to a central database.
Real-time is far less important than network usage, and with my current test setup it appears both instances communicate frequently if a subscription is active, even if nothing is happening.

Is there a good way to reduce data usage, for example by limiting the amount of keep-alive messages? One database will likely be idle most of the time.

I estimated the current solution to idle at around 1.4MiB per day. Ideally it would use less than 100KiB a day.

I'm also open for other solutions if anything comes to mind!

Thanks for the help!

Sascha Zenglein

Produktentwicklung

 
 

Gessler GmbH
Gutenbergring 14
63110 Rodgau
Deutschland
Tel.:  +49 6106 8709 693
Fax:  +49 6106 8709 50
 
E-Mail: zenglein@gessler.de
Web: www.gessler.de

 
Geschaeftsfuehrer: Helmut Gessler, Dipl.-Ing. Marcus Gessler
Gerichtsstand: Offenbach/Main, Amtsgericht Offenbach HRB 20857
USt.-IdNr.: DE 113 551 141
Sparkasse Dieburg, IBAN DE94 5085 2651 0057 0025 03, SWIFT HELADEF1DIE


Diese E-Mail enthaelt vertrauliche und/oder rechtlich geschuetzte Informationen.
Wenn Sie nicht der richtige Adressat sind oder diese E-Mail irrtuemlich erhalten haben,
informieren Sie bitte sofort den Absender und vernichten Sie diese Mail.
Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Mail
und/oder der Inhalte dieser Mail ist nicht gestattet.

This e-mail may contain confidential and/or privileged information.
If you are not the intended recipient (or have received this e-mail in error)
please notify the sender immediately and destroy this e-mail.
Any unauthorised copying, disclosure or distribution of the material of this e-mail is strictly forbidden.
Attachment

Re: Reducing bandwidth usage of database replication

From
Ron
Date:
On 11/2/22 09:56, Sascha Zenglein wrote:
P {margin-top:0;margin-bottom:0;}
Hi all,

I want to use the postgres-native logical replication to have multiple clients receive and send data to a central database.
Real-time is far less important than network usage, and with my current test setup it appears both instances communicate frequently if a subscription is active, even if nothing is happening.

Is there a good way to reduce data usage, for example by limiting the amount of keep-alive messages? One database will likely be idle most of the time.

I estimated the current solution to idle at around 1.4MiB per day. Ideally it would use less than 100KiB a day.

1.4MiB/day is 17 bytes per second.  That's not too much.

--
Angular momentum makes the world go 'round.

AW: Reducing bandwidth usage of database replication

From
Sascha Zenglein
Date:
It is, unfortunately, too much for our requirements.
I was able to increase wal_receiver_status_interval so that the publisher only contacts the subscriber every 30 seconds instead of every 10.
But I am not able to increase it further.

Sascha Zenglein

Produktentwicklung

 
 

Gessler GmbH
Gutenbergring 14
63110 Rodgau
Deutschland
Tel.:  +49 6106 8709 693
Fax:  +49 6106 8709 50
 
E-Mail: zenglein@gessler.de
Web: www.gessler.de

 
Geschaeftsfuehrer: Helmut Gessler, Dipl.-Ing. Marcus Gessler
Gerichtsstand: Offenbach/Main, Amtsgericht Offenbach HRB 20857
USt.-IdNr.: DE 113 551 141
Sparkasse Dieburg, IBAN DE94 5085 2651 0057 0025 03, SWIFT HELADEF1DIE


Diese E-Mail enthaelt vertrauliche und/oder rechtlich geschuetzte Informationen.
Wenn Sie nicht der richtige Adressat sind oder diese E-Mail irrtuemlich erhalten haben,
informieren Sie bitte sofort den Absender und vernichten Sie diese Mail.
Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Mail
und/oder der Inhalte dieser Mail ist nicht gestattet.

This e-mail may contain confidential and/or privileged information.
If you are not the intended recipient (or have received this e-mail in error)
please notify the sender immediately and destroy this e-mail.
Any unauthorised copying, disclosure or distribution of the material of this e-mail is strictly forbidden.

Von: Ron <ronljohnsonjr@gmail.com>
Gesendet: Mittwoch, 2. November 2022 17:29
An: pgsql-general@lists.postgresql.org <pgsql-general@lists.postgresql.org>
Betreff: Re: Reducing bandwidth usage of database replication
 
On 11/2/22 09:56, Sascha Zenglein wrote:
Hi all,

I want to use the postgres-native logical replication to have multiple clients receive and send data to a central database.
Real-time is far less important than network usage, and with my current test setup it appears both instances communicate frequently if a subscription is active, even if nothing is happening.

Is there a good way to reduce data usage, for example by limiting the amount of keep-alive messages? One database will likely be idle most of the time.

I estimated the current solution to idle at around 1.4MiB per day. Ideally it would use less than 100KiB a day.

1.4MiB/day is 17 bytes per second.  That's not too much.

--
Angular momentum makes the world go 'round.
Attachment

Re: Reducing bandwidth usage of database replication

From
Ben Chobot
Date:
Sascha Zenglein wrote on 11/2/22 7:56 AM:
Hi all,

I want to use the postgres-native logical replication to have multiple clients receive and send data to a central database.
Real-time is far less important than network usage, and with my current test setup it appears both instances communicate frequently if a subscription is active, even if nothing is happening.

Is there a good way to reduce data usage, for example by limiting the amount of keep-alive messages? One database will likely be idle most of the time.

I estimated the current solution to idle at around 1.4MiB per day. Ideally it would use less than 100KiB a day.

I'm also open for other solutions if anything comes to mind!

It sounds like you are trying to use logical replication to give yourself a multi-master database setup, and that you've squeezed as much optimization as you can from logical replication and found it to be unworkable. If that's a fair assessment, you might look into something like bucardo instead. I haven't done the network comparison but it is a different solution that might meet your goals.

Re: Reducing bandwidth usage of database replication

From
Simon Riggs
Date:
On Thu, 3 Nov 2022 at 08:40, Sascha Zenglein <zenglein@gessler.de> wrote:
It is, unfortunately, too much for our requirements.
I was able to increase wal_receiver_status_interval so that the publisher only contacts the subscriber every 30 seconds instead of every 10.
But I am not able to increase it further.

The max value is much higher than 30. Why can you not increase it further?

--
Simon Riggs                http://www.EnterpriseDB.com/

Re: Reducing bandwidth usage of database replication

From
Simon Riggs
Date:
On Thu, 3 Nov 2022 at 15:28, Ben Chobot <bench@silentmedia.com> wrote:

> I want to use the postgres-native logical replication to have multiple clients receive and send data to a central
database.
> Real-time is far less important than network usage, and with my current test setup it appears both instances
communicatefrequently if a subscription is active, even if nothing is happening. 
>
> Is there a good way to reduce data usage, for example by limiting the amount of keep-alive messages? One database
willlikely be idle most of the time. 

Keep-alive messages are the largest concern on an otherwise quiet link.

Probably makes sense to let the connection time out, then set a large
wal_retrieve_retry_interval, so it doesn't immediately reconnect.


> It sounds like you are trying to use logical replication to give yourself a multi-master database setup, and that
you'vesqueezed as much optimization as you can from logical replication and found it to be unworkable. If that's a fair
assessment,you might look into something like bucardo instead. I haven't done the network comparison but it is a
differentsolution that might meet your goals. 

The overhead of Bucardo has been measured as being more than 3x that
of logical replication.

--
Simon Riggs                http://www.EnterpriseDB.com/



AW: Reducing bandwidth usage of database replication

From
Sascha Zenglein
Date:
I have set the status interval to 600s and it still sends a status message every 30s.
I also tried setting every available option higher, with no success:

checkpoint_warning = 93s
max_standby_streaming_delay = 130s
max_standby_archive_delay = 45s
wal_receiver_status_interval = 600s
wal_receiver_timeout = 1200s
wal_receiver_timeout = 65s
recovery_min_apply_delay = 600s

The random values were to see which setting is limiting if I got above the 30s limit.

Sascha Zenglein

Produktentwicklung

 
 

Gessler GmbH
Gutenbergring 14
63110 Rodgau
Deutschland
Tel.:  +49 6106 8709 693
Fax:  +49 6106 8709 50
 
E-Mail: zenglein@gessler.de
Web: www.gessler.de

 
Geschaeftsfuehrer: Helmut Gessler, Dipl.-Ing. Marcus Gessler
Gerichtsstand: Offenbach/Main, Amtsgericht Offenbach HRB 20857
USt.-IdNr.: DE 113 551 141
Sparkasse Dieburg, IBAN DE94 5085 2651 0057 0025 03, SWIFT HELADEF1DIE


Diese E-Mail enthaelt vertrauliche und/oder rechtlich geschuetzte Informationen.
Wenn Sie nicht der richtige Adressat sind oder diese E-Mail irrtuemlich erhalten haben,
informieren Sie bitte sofort den Absender und vernichten Sie diese Mail.
Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Mail
und/oder der Inhalte dieser Mail ist nicht gestattet.

This e-mail may contain confidential and/or privileged information.
If you are not the intended recipient (or have received this e-mail in error)
please notify the sender immediately and destroy this e-mail.
Any unauthorised copying, disclosure or distribution of the material of this e-mail is strictly forbidden.

Von: Simon Riggs <simon.riggs@enterprisedb.com>
Gesendet: Freitag, 4. November 2022 10:16
An: Sascha Zenglein <zenglein@gessler.de>
Cc: pgsql-general@lists.postgresql.org <pgsql-general@lists.postgresql.org>
Betreff: Re: Reducing bandwidth usage of database replication
 
On Thu, 3 Nov 2022 at 08:40, Sascha Zenglein <zenglein@gessler.de> wrote:
It is, unfortunately, too much for our requirements.
I was able to increase wal_receiver_status_interval so that the publisher only contacts the subscriber every 30 seconds instead of every 10.
But I am not able to increase it further.

The max value is much higher than 30. Why can you not increase it further?

--
Simon Riggs                http://www.EnterpriseDB.com/
Attachment

Re: AW: Reducing bandwidth usage of database replication

From
Kyotaro Horiguchi
Date:
At Fri, 4 Nov 2022 09:25:44 +0000, Sascha Zenglein <zenglein@gessler.de> wrote in 
> I have set the status interval to 600s and it still sends a status message every 30s.
> I also tried setting every available option higher, with no success:

I guess you are not setting wal_sender_timeout on the primary side.

Status messages are also sent in response to sender pings that is
controlled by that variable.  Wal sender sends a ping after a half of
that setting since the last status message to request walreceiver to
send a response.

> checkpoint_warning = 93s
> max_standby_streaming_delay = 130s
> max_standby_archive_delay = 45s
> wal_receiver_status_interval = 600s
> wal_receiver_timeout = 1200s
> wal_receiver_timeout = 65s
> recovery_min_apply_delay = 600s
> 
> The random values were to see which setting is limiting if I got above the 30s limit.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



AW: AW: Reducing bandwidth usage of database replication

From
Sascha Zenglein
Date:
Thanks, the wal_sender_timeout setting was the culprit!
A bit hard to find that it influences a ping at half its interval time though.

Sascha Zenglein

Produktentwicklung

 
 

Gessler GmbH
Gutenbergring 14
63110 Rodgau
Deutschland
Tel.:  +49 6106 8709 693
Fax:  +49 6106 8709 50
 
E-Mail: zenglein@gessler.de
Web: www.gessler.de

 
Geschaeftsfuehrer: Helmut Gessler, Dipl.-Ing. Marcus Gessler
Gerichtsstand: Offenbach/Main, Amtsgericht Offenbach HRB 20857
USt.-IdNr.: DE 113 551 141
Sparkasse Dieburg, IBAN DE94 5085 2651 0057 0025 03, SWIFT HELADEF1DIE


Diese E-Mail enthaelt vertrauliche und/oder rechtlich geschuetzte Informationen.
Wenn Sie nicht der richtige Adressat sind oder diese E-Mail irrtuemlich erhalten haben,
informieren Sie bitte sofort den Absender und vernichten Sie diese Mail.
Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Mail
und/oder der Inhalte dieser Mail ist nicht gestattet.

This e-mail may contain confidential and/or privileged information.
If you are not the intended recipient (or have received this e-mail in error)
please notify the sender immediately and destroy this e-mail.
Any unauthorised copying, disclosure or distribution of the material of this e-mail is strictly forbidden.

Von: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Gesendet: Montag, 7. November 2022 06:21
An: Sascha Zenglein <zenglein@gessler.de>
Cc: simon.riggs@enterprisedb.com <simon.riggs@enterprisedb.com>; pgsql-general@lists.postgresql.org <pgsql-general@lists.postgresql.org>
Betreff: Re: AW: Reducing bandwidth usage of database replication
 
At Fri, 4 Nov 2022 09:25:44 +0000, Sascha Zenglein <zenglein@gessler.de> wrote in
> I have set the status interval to 600s and it still sends a status message every 30s.
> I also tried setting every available option higher, with no success:

I guess you are not setting wal_sender_timeout on the primary side.

Status messages are also sent in response to sender pings that is
controlled by that variable.  Wal sender sends a ping after a half of
that setting since the last status message to request walreceiver to
send a response.

> checkpoint_warning = 93s
> max_standby_streaming_delay = 130s
> max_standby_archive_delay = 45s
> wal_receiver_status_interval = 600s
> wal_receiver_timeout = 1200s
> wal_receiver_timeout = 65s
> recovery_min_apply_delay = 600s
>
> The random values were to see which setting is limiting if I got above the 30s limit.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
Attachment