Thread: rsync and streaming replication
Hi all,
I am using Postgres 9.0.3 in production with Slony 2.0.6 for replication.
I am currently testing the streaming replication with 9.0.3. My database contains 100 GBytes of data (6000 tables).
I have 1 master node and 3 slave nodes. The 1st slave node is in the same datacenter as the master node. The 2 other slave nodes are in a 2nd distant datacenter.
Say node 1 is the master node, node 2 is the local slave node, and nodes 3 and 4 are the slave nodes in the distant datacenter.
After installation of the streaming replication, I have checked that all the files have been copied from the master to the slaves and that files have the same size and modification date on each node.
When I send update requests, the streaming replication works perfectly; I am able to replicate many thousands of updates per second.
I have updated, inserted and deleted nearly 70% of the rows in all the tables (mainly update requests), and replication is working perfectly.
However; when I want to promote node 3 as the master, I need to rsync all the logs and databases from node 3 to node 4, node 1 and node 2.
Even if I use rsync -a (as suggested in the wiki http://wiki.postgresql.org/wiki/Streaming_Replication), it takes a long time (nearly 2 hours for a distant slave).
It looks that all database files do not have the same modification date in the master node and in the slave nodes, so the rsync copies quite all the database from the new master to the slaves.
At first, I was suspecting vacuum process for modifying the files not simultaneously in the master and the slaves. When I check the dates last autoanalyze and last_autovacuum fields in pg_stat_user_tables, they are diffent than the mtime of the files; so it looks autovacuum is not responsible of that.
Please, could you help me to understand why it is so long to rsync the data from the new master to the other slaves ?
Did I miss anything ?
Any idea ?
Jean-Armel
I am using Postgres 9.0.3 in production with Slony 2.0.6 for replication.
I am currently testing the streaming replication with 9.0.3. My database contains 100 GBytes of data (6000 tables).
I have 1 master node and 3 slave nodes. The 1st slave node is in the same datacenter as the master node. The 2 other slave nodes are in a 2nd distant datacenter.
Say node 1 is the master node, node 2 is the local slave node, and nodes 3 and 4 are the slave nodes in the distant datacenter.
After installation of the streaming replication, I have checked that all the files have been copied from the master to the slaves and that files have the same size and modification date on each node.
When I send update requests, the streaming replication works perfectly; I am able to replicate many thousands of updates per second.
I have updated, inserted and deleted nearly 70% of the rows in all the tables (mainly update requests), and replication is working perfectly.
However; when I want to promote node 3 as the master, I need to rsync all the logs and databases from node 3 to node 4, node 1 and node 2.
Even if I use rsync -a (as suggested in the wiki http://wiki.postgresql.org/wiki/Streaming_Replication), it takes a long time (nearly 2 hours for a distant slave).
It looks that all database files do not have the same modification date in the master node and in the slave nodes, so the rsync copies quite all the database from the new master to the slaves.
At first, I was suspecting vacuum process for modifying the files not simultaneously in the master and the slaves. When I check the dates last autoanalyze and last_autovacuum fields in pg_stat_user_tables, they are diffent than the mtime of the files; so it looks autovacuum is not responsible of that.
Please, could you help me to understand why it is so long to rsync the data from the new master to the other slaves ?
Did I miss anything ?
Any idea ?
Jean-Armel
Jean-Armel Luce wrote: > Please, could you help me to understand why it is so long to rsync > the data from the new master to the other slaves ? Your post is a little light on details. I think the most useful information would be the output of: rysnc --version and the exact rsync command you are using. If you are using any sort of remote mount point, that would be important to know. -Kevin
Jean-Armel Luce <jaluce06@gmail.com> writes: > Hi all, > > I am using Postgres 9.0.3 in production with Slony 2.0.6 for replication. > I am currently testing the streaming replication with 9.0.3.? My database contains 100 GBytes of data (6000 tables). > > I have 1 master node and 3 slave nodes. The 1st slave node is in the same datacenter as the master node. The 2 other > slave nodes are in a 2nd distant datacenter. > > Say node 1 is the master node, node 2 is the local slave node, and nodes 3 and 4 are the slave nodes in the distant > datacenter. > > After installation of the streaming replication, I have checked that all the files have been copied from the master to > the slaves and that files have the same size and modification date on each node. > When I send update requests, the streaming replication works perfectly; I am able to replicate many thousands of > updates per second. > I have updated, inserted and deleted nearly 70% of the rows in all the tables (mainly update requests), and > replication is working perfectly. > > However; when I want to promote node 3 as the master, I need to rsync all the logs and databases from node 3 to node > 4, node 1 and node 2. That is a popular misconception. Assuming you've got your WAL repositories well organized and your standbys are configure to follow latest timeline... You should be able to repoint the existing standbys and even the demoted master just by creating or changing recovery.conf as needed and restarting all standbys. I did a talk about that at Pg-East 2010 based on Pg 8.4. It all still seems to work now with streaming replication and hot-standby as well. HTH > Even if I use rsync -a (as suggested in the wiki http://wiki.postgresql.org/wiki/Streaming_Replication), it takes a > long time (nearly 2 hours for a distant slave). > It looks that all database files do not have the same modification date in the master node and in the slave nodes, so > the rsync copies quite all the database from the new master to the slaves. > > At first, I was suspecting vacuum process for modifying the files not simultaneously in the master and the slaves. > When I check the dates? last autoanalyze and last_autovacuum fields in pg_stat_user_tables, they are diffent than the > mtime of the files; so it looks autovacuum is not responsible of that. > > Please, could you help me to understand why it is so long to rsync the data from the new master to the other slaves ? > Did I miss anything ? > Any idea ? > > Jean-Armel > -- Jerry Sievers Postgres DBA/Development Consulting e: postgres.consulting@comcast.net p: 305.321.1144
Hi Jerry and Kevin,
Thanks for your answers.
Jerry, I tried as you said with the parameter recovery_target_timeline = 'latest' and it works.
I tried on a smaller test database (only 15MB) with PG9.1.1 and only 1 slave.
My switchover procedure was :
Step 1 : stop the old master
/usr/local/pgsql/bin/pg_ctl stop -m immediate -D /usr/local/pgsql91/server1/data
Step 2 : promote slave as master :
touch /usr/local/pgsql91/server2/data/trigger_file
Step 3 : declare the old master as a standby server
Step 3.1 : vi /usr/local/pgsql91/server1/data/postgresql.conf
Add hot_standby = on in the postgresql.conf
Step 3.2 Set recovery.conf for old master server (including recovery_target_timeline = 'latest')
cp /usr/local/pgsql91/server1/data/recovery.bkp /usr/local/pgsql91/server1/data/recovery.conf
Step 4 : start old master
/usr/local/pgsql/bin/pg_ctl start -D /usr/local/pgsql91/server1/data &
The old master is now a hot_standby of the new master. Replication works without rsyncing all data from new master to new slave.
Tomorrow, I shall try with PG9.0.3, 3 slaves and a primary database with 100 GB.
Thanks.
Jal
Thanks for your answers.
Jerry, I tried as you said with the parameter recovery_target_timeline = 'latest' and it works.
I tried on a smaller test database (only 15MB) with PG9.1.1 and only 1 slave.
My switchover procedure was :
Step 1 : stop the old master
/usr/local/pgsql/bin/pg_ctl stop -m immediate -D /usr/local/pgsql91/server1/data
Step 2 : promote slave as master :
touch /usr/local/pgsql91/server2/data/trigger_file
Step 3 : declare the old master as a standby server
Step 3.1 : vi /usr/local/pgsql91/server1/data/postgresql.conf
Add hot_standby = on in the postgresql.conf
Step 3.2 Set recovery.conf for old master server (including recovery_target_timeline = 'latest')
cp /usr/local/pgsql91/server1/data/recovery.bkp /usr/local/pgsql91/server1/data/recovery.conf
Step 4 : start old master
/usr/local/pgsql/bin/pg_ctl start -D /usr/local/pgsql91/server1/data &
The old master is now a hot_standby of the new master. Replication works without rsyncing all data from new master to new slave.
Tomorrow, I shall try with PG9.0.3, 3 slaves and a primary database with 100 GB.
Thanks.
Jal
2011/11/13 Jean-Armel Luce <jaluce06@gmail.com>: > Hi Jerry and Kevin, > > Thanks for your answers. > > Jerry, I tried as you said with the parameter recovery_target_timeline = > 'latest' and it works. > > I tried on a smaller test database (only 15MB) with PG9.1.1 and only 1 > slave. > > My switchover procedure was : > > Step 1 : stop the old master > /usr/local/pgsql/bin/pg_ctl stop -m immediate -D > /usr/local/pgsql91/server1/data > > Step 2 : promote slave as master : > touch /usr/local/pgsql91/server2/data/trigger_file > > Step 3 : declare the old master as a standby server > Step 3.1 : vi /usr/local/pgsql91/server1/data/postgresql.conf > Add hot_standby = on in the postgresql.conf > > Step 3.2 Set recovery.conf for old master server (including > recovery_target_timeline = 'latest') > cp /usr/local/pgsql91/server1/data/recovery.bkp > /usr/local/pgsql91/server1/data/recovery.conf > > Step 4 : start old master > /usr/local/pgsql/bin/pg_ctl start -D /usr/local/pgsql91/server1/data & > > > The old master is now a hot_standby of the new master. Replication works > without rsyncing all data from new master to new slave. > > > Tomorrow, I shall try with PG9.0.3, 3 slaves and a primary database with 100 > GB. > > Thanks. > > Jal just for the value : rsync --checksum is the option to use to prevent copying of identical files (it computes checksum on both side before sending) -- Cédric Villemain +33 (0)6 20 30 22 52 http://2ndQuadrant.fr/ PostgreSQL: Support 24x7 - Développement, Expertise et Formation
Hi,
I tried many times with different options of rsync :
with rsync -a as explained in the wiki (http://www.postgresql.org/docs/9.0/interactive/warm-standby.html#STANDBY-SERVER-SETUP), it takes 1h40 for each distant slave and quite all data files are transferred
with -c (checksum) or -z (compress) , it takes more time, probably due to checksum time or compress/uncompress time
For example, the rsync commands I am using are for data and logs are (I have 2 databases, 1 tablespace per database):
rsync -a /var/opt/hosting/db/slony/pg_xlog/* slave-db01.profiles.bench.pns-si.s1.p.fti.net:/var/opt/hosting/db/slony/pg_xlog/
rsync -a /var/opt/hosting/db/profiles/bench/* slave-db01.profiles.bench.pns-si.s1.p.fti.net:/var/opt/hosting/db/profiles/bench/
rsync -a /var/opt/hosting/db/profiles/profiles/* slave-db01.profiles.bench.pns-si.s1.p.fti.net:/var/opt/hosting/db/profiles/profiles/
More information about version of rsync :
jaluce@master-db01:~$ rsync --version
rsync version 3.0.7 protocol version 30
Jal
I tried many times with different options of rsync :
with rsync -a as explained in the wiki (http://www.postgresql.org/docs/9.0/interactive/warm-standby.html#STANDBY-SERVER-SETUP), it takes 1h40 for each distant slave and quite all data files are transferred
with -c (checksum) or -z (compress) , it takes more time, probably due to checksum time or compress/uncompress time
For example, the rsync commands I am using are for data and logs are (I have 2 databases, 1 tablespace per database):
rsync -a /var/opt/hosting/db/slony/pg_xlog/* slave-db01.profiles.bench.pns-si.s1.p.fti.net:/var/opt/hosting/db/slony/pg_xlog/
rsync -a /var/opt/hosting/db/profiles/bench/* slave-db01.profiles.bench.pns-si.s1.p.fti.net:/var/opt/hosting/db/profiles/bench/
rsync -a /var/opt/hosting/db/profiles/profiles/* slave-db01.profiles.bench.pns-si.s1.p.fti.net:/var/opt/hosting/db/profiles/profiles/
More information about version of rsync :
jaluce@master-db01:~$ rsync --version
rsync version 3.0.7 protocol version 30
Jal
2011/11/14 Cédric Villemain <cedric.villemain.debian@gmail.com>
2011/11/13 Jean-Armel Luce <jaluce06@gmail.com>:just for the value : rsync --checksum is the option to use to prevent> Hi Jerry and Kevin,
>
> Thanks for your answers.
>
> Jerry, I tried as you said with the parameter recovery_target_timeline =
> 'latest' and it works.
>
> I tried on a smaller test database (only 15MB) with PG9.1.1 and only 1
> slave.
>
> My switchover procedure was :
>
> Step 1 : stop the old master
> /usr/local/pgsql/bin/pg_ctl stop -m immediate -D
> /usr/local/pgsql91/server1/data
>
> Step 2 : promote slave as master :
> touch /usr/local/pgsql91/server2/data/trigger_file
>
> Step 3 : declare the old master as a standby server
> Step 3.1 : vi /usr/local/pgsql91/server1/data/postgresql.conf
> Add hot_standby = on in the postgresql.conf
>
> Step 3.2 Set recovery.conf for old master server (including
> recovery_target_timeline = 'latest')
> cp /usr/local/pgsql91/server1/data/recovery.bkp
> /usr/local/pgsql91/server1/data/recovery.conf
>
> Step 4 : start old master
> /usr/local/pgsql/bin/pg_ctl start -D /usr/local/pgsql91/server1/data &
>
>
> The old master is now a hot_standby of the new master. Replication works
> without rsyncing all data from new master to new slave.
>
>
> Tomorrow, I shall try with PG9.0.3, 3 slaves and a primary database with 100
> GB.
>
> Thanks.
>
> Jal
copying of identical files (it computes checksum on both side before
sending)
--
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Développement, Expertise et Formation
On Nov 14, 2011, at 10:59 AM, Jean-Armel Luce wrote: > just for the value : rsync --checksum is the option to use to prevent > copying of identical files No, that's not what it's used for. It already avoids sending identical blocks by using checksums. --checksum forces a checksumon files that have identical sizes & mod times, thus catching files that have different contents despite having thesame mod times & sizes. -- Scott Ribe scott_ribe@elevated-dev.com http://www.elevated-dev.com/ (303) 722-0567 voice
I think there have been two similar threads recently, and I want to be sure I'm not confusing them. So: - How large is the db? By which I mean how much disk space does the data directory occupy? - What's the bandwidth of the network connection to the distant slave? - What's the CPU & disk on each end? On Nov 14, 2011, at 10:59 AM, Jean-Armel Luce wrote: > For example, the rsync commands I am using are for data and logs are (I have 2 databases, 1 tablespace per database): > rsync -a /var/opt/hosting/db/slony/pg_xlog/* slave-db01.profiles.bench.pns-si.s1.p.fti.net:/var/opt/hosting/db/slony/pg_xlog/ > rsync -a /var/opt/hosting/db/profiles/bench/* slave-db01.profiles.bench.pns-si.s1.p.fti.net:/var/opt/hosting/db/profiles/bench/ > rsync -a /var/opt/hosting/db/profiles/profiles/* slave-db01.profiles.bench.pns-si.s1.p.fti.net:/var/opt/hosting/db/profiles/profiles/ Well, there's one error. Your command is rsync'ing each file individually, so of course each file is sync'd. Sync the directoriesinstead--in other words leave off the * (but not the /) and let rsync decide which files need sync'ing. -- Scott Ribe scott_ribe@elevated-dev.com http://www.elevated-dev.com/ (303) 722-0567 voice
2011/11/14 Scott Ribe <scott_ribe@elevated-dev.com>: > On Nov 14, 2011, at 10:59 AM, Jean-Armel Luce wrote: > >> just for the value : rsync --checksum is the option to use to prevent >> copying of identical files > > No, that's not what it's used for. It already avoids sending identical blocks by using checksums. --checksum forces a checksumon files that have identical sizes & mod times, thus catching files that have different contents despite having thesame mod times & sizes. no, you are wrong. -c, --checksum "This changes the way rsync checks if the files have been changed and are in need of a transfer. Without this option, rsync uses a "quick check" that (by default) checks if each file's size and time of last modification match between the sender and receiver. This option changes this to compare a 128-bit checksum for each file that has a matching size. Generating the checksums means that both sides will expend a lot of disk I/O reading all the data in the files in the transfer (and this is prior to any reading that will be done to transfer changed files), so this can slow things down significantly. " ... -- Cédric Villemain +33 (0)6 20 30 22 52 http://2ndQuadrant.fr/ PostgreSQL: Support 24x7 - Développement, Expertise et Formation
On Nov 15, 2011, at 3:02 AM, Cédric Villemain wrote: > no, you are wrong. > -c, --checksum > "This changes the way rsync checks if the files have been changed and > are in need of a transfer. Without this option, rsync uses a "quick > check" that (by default) checks if each file's size and time of last > modification match between the sender and receiver. This option > changes this to compare a 128-bit checksum for each file that has a > matching size. Generating the checksums means that both sides will > expend a lot of disk I/O reading all the data in the files in the > transfer (and this is prior to any reading that will be done to > transfer changed files), so this can slow things down significantly. " Seriously, read that and what I said. They are the same, except that the documentation provides more detail. -- Scott Ribe scott_ribe@elevated-dev.com http://www.elevated-dev.com/ (303) 722-0567 voice
2011/11/15 Scott Ribe <scott_ribe@elevated-dev.com>: > On Nov 15, 2011, at 3:02 AM, Cédric Villemain wrote: > >> no, you are wrong. >> -c, --checksum >> "This changes the way rsync checks if the files have been changed and >> are in need of a transfer. Without this option, rsync uses a "quick >> check" that (by default) checks if each file's size and time of last >> modification match between the sender and receiver. This option >> changes this to compare a 128-bit checksum for each file that has a >> matching size. Generating the checksums means that both sides will >> expend a lot of disk I/O reading all the data in the files in the >> transfer (and this is prior to any reading that will be done to >> transfer changed files), so this can slow things down significantly. " > > Seriously, read that and what I said. They are the same, except that the documentation provides more detail. Seriously, I did. Is my post "just for the value : rsync --checksum is the option to use to prevent copying of **identical files**" incorrect ? OP contains "It looks that all database files do not have the same modification date in the master node and in the slave nodes, so the rsync copies quite all the database from the new master to the slaves." One benefit is when files are in fact identical on both side, so that rsync does not have to process checksum for each blocks on source and destination. (when there are few changes, we expect rsync to copy only those few changes, with or without --checksum). -- Cédric Villemain +33 (0)6 20 30 22 52 http://2ndQuadrant.fr/ PostgreSQL: Support 24x7 - Développement, Expertise et Formation
On Nov 15, 2011, at 8:30 AM, Cédric Villemain wrote: > Seriously, I did. Is my post "just for the value : rsync --checksum is > the option to use to prevent copying of **identical files**" incorrect > ? It's at least incomplete and somewhat misleading. But I guess you could say the same about my post; we seem to be focusingon 2 different aspects of its behavior ;-) > OP contains "It looks that all database files do not have the same > modification date in the master node and in the slave nodes, so the > rsync copies quite all the database from the new master to the > slaves." Yes, and rsync should only be copying changed blocks in that case, of which you are aware, but which OP did not seem to realize. > One benefit is when files are in fact identical on both side, so that > rsync does not have to process checksum for each blocks on source and > destination. (when there are few changes, we expect rsync to copy only > those few changes, with or without --checksum). Well, but it does calculate checksums on the entire contents of both files (which takes as much I/O and about as much CPUas calculating checksums for each block), even when timestamps & sizes are identical. For the OP's case, identical files with differing timestamps, the only potential savings is from not exchanging checksumsover the network, which is not likely to offer any meaningful improvement in performance, which still leaves openthe question as to why rsync is so slow in that, when we know it is usually relatively fast to sync two servers withfew differences. Would be nice to actually hear from OP regarding file sizes/counts & network bandwidth & disks & and so on ;-) -- Scott Ribe scott_ribe@elevated-dev.com http://www.elevated-dev.com/ (303) 722-0567 voice
Sorry to be so long to answer :-(
2011/11/14 Scott Ribe <scott_ribe@elevated-dev.com>
- How large is the db? By which I mean how much disk space does the data directory occupy?
20 GB
- What's the bandwidth of the network connection to the distant slave?
during rsync, I see narly 125Mbits/sec
- What's the CPU & disk on each end?
disk : 6 * 146GB 15Krpm RAID10
CPU : 16 cores Xeon(R) CPU L5630 @ 2.13GHz cache size : 12288 KB
Only Postgres is running on my servers (no HTTP server, nothing else, ...). CPU usage is very low.
This afternoon, I have again sent some updates requests, which were replicated to the sslaves.
:
- I am looking modification of modification dates and checksums of 2 tables among my 6000 tables :
For each file, the checksum is the same on all the slaves, but different from the checksum of the master.
For each file, the modification time is different on each node. (see below)
So if I want to promote one slave as the master, it will not need to copy data from the new master to the previous slaves with rsync, but it will copy all the files from the new master to the old master (which is now a slave).
I shall try tomorrow topromote again a slave, and I shall rsync withh --checksum.
I don't think that it is -a is very useful withh --checksum (no need to preserve modification times). Do you agree ?
On the master :
postgres@master-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$ ls -l 107867807
-rw------- 1 postgres postgres 5914624 Nov 15 14:31 107867807
postgres@master-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$ md5sum 107867807
23c28c6432c073d370e7e9624fd04e3b 107867807
postgres@master-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$ ls -l 107867867
-rw------- 1 postgres postgres 5996544 Nov 15 14:31 107867867
postgres@master-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$ md5sum 107867867
1f3398e18e22bfeb31ca2db82d8517f2 107867867
postgres@master-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$
On slave 1 :
postgres@master-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$ ls -l 107867807
-rw------- 1 postgres postgres 5914624 Nov 15 14:16 107867807
postgres@master-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$ md5sum 107867807
f79d93ca00ac400f709a028c597ca4ab 107867807
postgres@master-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$
postgres@master-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$ ls -l 107867867
-rw------- 1 postgres postgres 5996544 Nov 15 14:18 107867867
postgres@master-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$ md5sum 107867867
3d5534f855168378f5cb2e30d622c0aa 107867867
On slave 2 :
postgres@slave-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$ ls -l 107867807
-rw------- 1 postgres postgres 5914624 Nov 15 14:24 107867807
postgres@slave-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$ md5sum 107867807
f79d93ca00ac400f709a028c597ca4ab 107867807
postgres@slave-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$
postgres@slave-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$ ls -l 107867867
-rw------- 1 postgres postgres 5996544 Nov 15 14:25 107867867
postgres@slave-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$ md5sum 107867867
3d5534f855168378f5cb2e30d622c0aa 107867867
On slave 3 :
postgres@slave-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$ ls -l 107867807
-rw------- 1 postgres postgres 5914624 Nov 15 14:16 107867807
postgres@slave-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$ md5sum 107867807
f79d93ca00ac400f709a028c597ca4ab 107867807
postgres@slave-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$
postgres@slave-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$ ls -l 107867867
-rw------- 1 postgres postgres 5996544 Nov 15 14:18 107867867
postgres@slave-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$ md5sum 107867867
3d5534f855168378f5cb2e30d622c0aa 107867867
postgres@slave-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$
postgres@slave-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$
Jal
2011/11/14 Scott Ribe <scott_ribe@elevated-dev.com>
- How large is the db? By which I mean how much disk space does the data directory occupy?
20 GB
- What's the bandwidth of the network connection to the distant slave?
during rsync, I see narly 125Mbits/sec
- What's the CPU & disk on each end?
disk : 6 * 146GB 15Krpm RAID10
CPU : 16 cores Xeon(R) CPU L5630 @ 2.13GHz cache size : 12288 KB
Only Postgres is running on my servers (no HTTP server, nothing else, ...). CPU usage is very low.
This afternoon, I have again sent some updates requests, which were replicated to the sslaves.
:
- I am looking modification of modification dates and checksums of 2 tables among my 6000 tables :
For each file, the checksum is the same on all the slaves, but different from the checksum of the master.
For each file, the modification time is different on each node. (see below)
So if I want to promote one slave as the master, it will not need to copy data from the new master to the previous slaves with rsync, but it will copy all the files from the new master to the old master (which is now a slave).
I shall try tomorrow topromote again a slave, and I shall rsync withh --checksum.
I don't think that it is -a is very useful withh --checksum (no need to preserve modification times). Do you agree ?
On the master :
postgres@master-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$ ls -l 107867807
-rw------- 1 postgres postgres 5914624 Nov 15 14:31 107867807
postgres@master-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$ md5sum 107867807
23c28c6432c073d370e7e9624fd04e3b 107867807
postgres@master-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$ ls -l 107867867
-rw------- 1 postgres postgres 5996544 Nov 15 14:31 107867867
postgres@master-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$ md5sum 107867867
1f3398e18e22bfeb31ca2db82d8517f2 107867867
postgres@master-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$
On slave 1 :
postgres@master-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$ ls -l 107867807
-rw------- 1 postgres postgres 5914624 Nov 15 14:16 107867807
postgres@master-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$ md5sum 107867807
f79d93ca00ac400f709a028c597ca4ab 107867807
postgres@master-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$
postgres@master-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$ ls -l 107867867
-rw------- 1 postgres postgres 5996544 Nov 15 14:18 107867867
postgres@master-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$ md5sum 107867867
3d5534f855168378f5cb2e30d622c0aa 107867867
On slave 2 :
postgres@slave-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$ ls -l 107867807
-rw------- 1 postgres postgres 5914624 Nov 15 14:24 107867807
postgres@slave-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$ md5sum 107867807
f79d93ca00ac400f709a028c597ca4ab 107867807
postgres@slave-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$
postgres@slave-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$ ls -l 107867867
-rw------- 1 postgres postgres 5996544 Nov 15 14:25 107867867
postgres@slave-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$ md5sum 107867867
3d5534f855168378f5cb2e30d622c0aa 107867867
On slave 3 :
postgres@slave-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$ ls -l 107867807
-rw------- 1 postgres postgres 5914624 Nov 15 14:16 107867807
postgres@slave-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$ md5sum 107867807
f79d93ca00ac400f709a028c597ca4ab 107867807
postgres@slave-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$
postgres@slave-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$ ls -l 107867867
-rw------- 1 postgres postgres 5996544 Nov 15 14:18 107867867
postgres@slave-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$ md5sum 107867867
3d5534f855168378f5cb2e30d622c0aa 107867867
postgres@slave-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$
postgres@slave-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$
Jal
--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin
Humm sorry, I did a mistake
2011/11/15 Jean-Armel Luce <jaluce06@gmail.com>
Sorry to be so long to answer :-(- How large is the db? By which I mean how much disk space does the data directory occupy?20 GB
100 GB
2011/11/14 Scott Ribe <scott_ribe@elevated-dev.com>
- How large is the db? By which I mean how much disk space does the data directory occupy?during rsync, I see narly 125Mbits/sec
- What's the bandwidth of the network connection to the distant slave?disk : 6 * 146GB 15Krpm RAID10
- What's the CPU & disk on each end?
CPU : 16 cores Xeon(R) CPU L5630 @ 2.13GHz cache size : 12288 KB
Only Postgres is running on my servers (no HTTP server, nothing else, ...). CPU usage is very low.
This afternoon, I have again sent some updates requests, which were replicated to the sslaves.
:
- I am looking modification of modification dates and checksums of 2 tables among my 6000 tables :
For each file, the checksum is the same on all the slaves, but different from the checksum of the master.
For each file, the modification time is different on each node. (see below)
So if I want to promote one slave as the master, it will not need to copy data from the new master to the previous slaves with rsync, but it will copy all the files from the new master to the old master (which is now a slave).
I shall try tomorrow topromote again a slave, and I shall rsync withh --checksum.
I don't think that it is -a is very useful withh --checksum (no need to preserve modification times). Do you agree ?
On the master :
postgres@master-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$ ls -l 107867807
-rw------- 1 postgres postgres 5914624 Nov 15 14:31 107867807
postgres@master-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$ md5sum 107867807
23c28c6432c073d370e7e9624fd04e3b 107867807
postgres@master-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$ ls -l 107867867
-rw------- 1 postgres postgres 5996544 Nov 15 14:31 107867867
postgres@master-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$ md5sum 107867867
1f3398e18e22bfeb31ca2db82d8517f2 107867867
postgres@master-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$
On slave 1 :
postgres@master-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$ ls -l 107867807
-rw------- 1 postgres postgres 5914624 Nov 15 14:16 107867807
postgres@master-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$ md5sum 107867807
f79d93ca00ac400f709a028c597ca4ab 107867807
postgres@master-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$
postgres@master-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$ ls -l 107867867
-rw------- 1 postgres postgres 5996544 Nov 15 14:18 107867867
postgres@master-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$ md5sum 107867867
3d5534f855168378f5cb2e30d622c0aa 107867867
On slave 2 :
postgres@slave-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$ ls -l 107867807
-rw------- 1 postgres postgres 5914624 Nov 15 14:24 107867807
postgres@slave-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$ md5sum 107867807
f79d93ca00ac400f709a028c597ca4ab 107867807
postgres@slave-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$
postgres@slave-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$ ls -l 107867867
-rw------- 1 postgres postgres 5996544 Nov 15 14:25 107867867
postgres@slave-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$ md5sum 107867867
3d5534f855168378f5cb2e30d622c0aa 107867867
On slave 3 :
postgres@slave-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$ ls -l 107867807
-rw------- 1 postgres postgres 5914624 Nov 15 14:16 107867807
postgres@slave-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$ md5sum 107867807
f79d93ca00ac400f709a028c597ca4ab 107867807
postgres@slave-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$
postgres@slave-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$ ls -l 107867867
-rw------- 1 postgres postgres 5996544 Nov 15 14:18 107867867
postgres@slave-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$ md5sum 107867867
3d5534f855168378f5cb2e30d622c0aa 107867867
postgres@slave-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$
postgres@slave-db01:/var/opt/hosting/db/profiles/bench/PG_9.0_201008051/29797$
Jal--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin
> This afternoon, I have again sent some updates requests, which were > replicated to the sslaves. > : > - I am looking modification of modification dates and checksums of 2 tables > among my 6000 tables : > For each file, the checksum is the same on all the slaves, but different > from the checksum of the master. > For each file, the modification time is different on each node. (see below) you are probably hit by "hint bits": they are not WAL-logged with 9.0 so the files can be different just because of "select" you issued on master and/or standby. > > So if I want to promote one slave as the master, it will not need to copy > data from the new master to the previous slaves with rsync, but it will copy > all the files from the new master to the old master (which is now a slave). > > I shall try tomorrow topromote again a slave, and I shall rsync withh > --checksum. > I don't think that it is -a is very useful withh --checksum (no need to > preserve modification times). Do you agree ? -a is a good shortcut, chaging the modtime is not a real cost. So despite you don't need to keep the mtime, there is no benefit in not keeping it :) Well, after re-reading rsync manual, and taking into account Scott answers: use --ignore-time will make all files rsynced (thus it will check each block and copy only the blocks which differ) use --checksum will make all files to be read and checksumed in both side before trying to rsynced them (and check each block and copy them if required). Obvisouly when the files do not have the same size, they are rsynced without a 'global' checksum. It is safer (someone can say paranoid, which is correct) to use one of those in the PostgreSQL case where we have a size limit and where files can be modified in both side without affecting their size. So there is a hight risk to have the same size on source and destination and a very low risk to have the same modification time when the content is changed. I admit the risk is very low and in practice it should not happen. As many things should not happen... If you want to reduce the re-rsync step, you may want to try to have similar files in both places by using vacuum freeze before initial rsync, or something like that (so hint bits are set before rsync). -- Cédric Villemain +33 (0)6 20 30 22 52 http://2ndQuadrant.fr/ PostgreSQL: Support 24x7 - Développement, Expertise et Formation
Hi,
Today I tried to promote a slave as master using rsync --checksum (without doing vacuum freeze) instead of rsync --all.
It takes only 30 minutes with rsync --checksum. Only a few tables are rsynced. Most of the time is consumed by checksum.
With rsync --all, it takes 1h40 min.
So, rsync --checksum looks better than rsync --all
Jal
Today I tried to promote a slave as master using rsync --checksum (without doing vacuum freeze) instead of rsync --all.
It takes only 30 minutes with rsync --checksum. Only a few tables are rsynced. Most of the time is consumed by checksum.
With rsync --all, it takes 1h40 min.
So, rsync --checksum looks better than rsync --all
Jal
Jean-Armel Luce <jaluce06@gmail.com> wrote: > So, rsync --checksum looks better than rsync --all I've never heard of an --all option for rsync. What does that do? -Kevin
On Nov 16, 2011, at 9:37 AM, Jean-Armel Luce wrote: > So, rsync --checksum looks better than rsync --all --all??? What the heck is that and why were you using it? -- Scott Ribe scott_ribe@elevated-dev.com http://www.elevated-dev.com/ (303) 722-0567 voice
2011/11/16 Jean-Armel Luce <jaluce06@gmail.com>: > Hi, > > Today I tried to promote a slave as master using rsync --checksum (without > doing vacuum freeze) instead of rsync --all. > > It takes only 30 minutes with rsync --checksum. Only a few tables are > rsynced. Most of the time is consumed by checksum. > With rsync --all, it takes 1h40 min. > > So, rsync --checksum looks better than rsync --all rsync --all does not exist, you mean --archive I bet (the longopt for -a, a shorthand for the options below plus -Dt for device/special and time). Also you probably want to keep --owner --group --perms --recursive (or you do file by file with a find or something like that ?) -- Cédric Villemain +33 (0)6 20 30 22 52 http://2ndQuadrant.fr/ PostgreSQL: Support 24x7 - Développement, Expertise et Formation
You are right. I used -a, and I was wanting to be more meaningful so I wrote --all in my post.
Please read --archive insterad of --all
I kept --recursive.
I didn't use --owner , --group, --perms (permissions, group and owner are the same on each side).
I rsynced all the directory (without /*) as Scott explained.
Sorry for the mistake.
Jal
Please read --archive insterad of --all
I kept --recursive.
I didn't use --owner , --group, --perms (permissions, group and owner are the same on each side).
I rsynced all the directory (without /*) as Scott explained.
Sorry for the mistake.
Jal
On Nov 16, 2011, at 10:11 AM, Jean-Armel Luce wrote: > You are right. I used -a, and I was wanting to be more meaningful so I wrote --all in my post. > Please read --archive insterad of --all Oh, OK. Still seems odd that it took so much longer. Granted, for the files with different timestamps but identical contents,it then syncs them. But it does so by checksumming blocks, comparing checksums, and sending only blocks that aredifferent over the network. Granted, it has to send some checksums over the network, but that's pretty minor traffic.I believe you said you'd seen 125Mb/s over the network? Is that actually accurate? Does the network connection havehigh latency? Also, I believe you said -z seemed to slow it down? That has not been my experience at all with rsync'ing pg databases. Betweenall the values that are stored as plain text, and the redundancies in indexes, I usually see a good speed increasefrom compressing the data in transit. I'm certainly glad that you've got a 3x speed increase--that's significant progress. But still, something seems odd aboutthe performance you've reported, so I'm left wondering about what could cause that. Disk performance glitch at one endor the other, network performance, CPU load??? One thing worth doing I think is to use --stats on every test, so you can see every time how many files and how much datais actually transferred. Also, if you're sitting there watching, sometimes --progress can be informative... -- Scott Ribe scott_ribe@elevated-dev.com http://www.elevated-dev.com/ (303) 722-0567 voice
2011/11/16 Scott Ribe <scott_ribe@elevated-dev.com>: > On Nov 16, 2011, at 10:11 AM, Jean-Armel Luce wrote: > >> You are right. I used -a, and I was wanting to be more meaningful so I wrote --all in my post. >> Please read --archive insterad of --all > > Oh, OK. Still seems odd that it took so much longer. Granted, for the files with different timestamps but identical contents,it then syncs them. But it does so by checksumming blocks, comparing checksums, and sending only blocks that aredifferent over the network. Granted, it has to send some checksums over the network, but that's pretty minor traffic.I believe you said you'd seen 125Mb/s over the network? Is that actually accurate? Does the network connection havehigh latency? > > Also, I believe you said -z seemed to slow it down? That has not been my experience at all with rsync'ing pg databases.Between all the values that are stored as plain text, and the redundancies in indexes, I usually see a good speedincrease from compressing the data in transit. > > I'm certainly glad that you've got a 3x speed increase--that's significant progress. But still, something seems odd aboutthe performance you've reported, so I'm left wondering about what could cause that. Disk performance glitch at one endor the other, network performance, CPU load??? checksum are calculated for 1024 bytes(maybe 2048) per default, for 100GB that makes a huge number of checksum calculated for nothing. It is possible to increase the block-size (to 8kb for example) for the checksum but it also increase the risk of false positive (I am not aware if it is possible to provide the checksum algorithm and size expected to rsync, or if there can be md5sum collision on 8Kb data). > > One thing worth doing I think is to use --stats on every test, so you can see every time how many files and how much datais actually transferred. Also, if you're sitting there watching, sometimes --progress can be informative... > > -- > Scott Ribe > scott_ribe@elevated-dev.com > http://www.elevated-dev.com/ > (303) 722-0567 voice > > > > > -- Cédric Villemain +33 (0)6 20 30 22 52 http://2ndQuadrant.fr/ PostgreSQL: Support 24x7 - Développement, Expertise et Formation
2011/11/16 Scott Ribe <scott_ribe@elevated-dev.com>:
> Does the network connection have high latency?
When I ping the server,time=15ms.
> Also, I believe you said -z seemed to slow it down?
I confirm
>
> I'm certainly glad that you've got a 3x speed increase--
Me too :-) . Thanks for your advices. Thanks to Cédric too
;
;
But still, something seems odd about the performance you've reported, so I'm left wondering about what could cause that. Disk performance glitch at one end or the other, network performance, CPU load???
CPU usage is very low
Today, I tried with command : rsync -rc --block-size=8192
Rsync duration with blksize=8192 was 31 minutes (yesterday without blocksize, duration was 32 minutes exactly)
Thanks for your help
2011/11/17 Jean-Armel Luce <jaluce06@gmail.com>: > > 2011/11/16 Scott Ribe <scott_ribe@elevated-dev.com>: >> >> > Does the network connection have high latency? > > When I ping the server,time=15ms. > >> > Also, I believe you said -z seemed to slow it down? > > I confirm > >> > >> > I'm certainly glad that you've got a 3x speed increase-- > > Me too :-) . Thanks for your advices. Thanks to Cédric too you're welcome. > ; >> >> But still, something seems odd about the performance you've reported, so >> I'm left wondering about what could cause that. Disk performance glitch at >> one end or the other, network performance, CPU load??? > > CPU usage is very low strange (added to the -z making things worse). there should be something behind as supposed Scott. Virtual host ? Firewall ? something not common ? > > Today, I tried with command : rsync -rc --block-size=8192 > Rsync duration with blksize=8192 was 31 minutes (yesterday without > blocksize, duration was 32 minutes exactly) idea behind the blocksize changes was also to reduce the number of checksum comparaison and alignement (by 8) and compare with the rsync with --ignore-time instead of --checksum That should give a good idea on the extra cost of the block checksum for files without changes (which are matched by --checksum). -- Cédric Villemain +33 (0)6 20 30 22 52 http://2ndQuadrant.fr/ PostgreSQL: Support 24x7 - Développement, Expertise et Formation
On Nov 17, 2011, at 10:11 AM, Jean-Armel Luce wrote: > > Also, I believe you said -z seemed to slow it down? > I confirm Is there a lot of your data that is already compressed? Either something like images which you are storing as bytea or largeobjects? Or long text values that pg will itself compress out of line? > Today, I tried with command : rsync -rc --block-size=8192 > Rsync duration with blksize=8192 was 31 minutes (yesterday without blocksize, duration was 32 minutes exactly) That seems not too bad, in that you're sustaining >3GB/min of reading & checksumming. (Some of my earlier comments were basedon the belief that your db was 20GB.) -- Scott Ribe scott_ribe@elevated-dev.com http://www.elevated-dev.com/ (303) 722-0567 voice
Hi,
I did some benchs as suggested by Cedric.
At first, I rsynced with options rsync -r --ignore-times --stats -h, and then I rsynced with options rsync -rc --stats -h
The total duration of 1st rsync (--ignore-times) for all my tablespaces is 24 min 24 sec.
Please find below details printed by rsync for 1 tablespace (size 107 Go found using du -h) ;
For this tablespace, duration was 17 minutes 20 seconds.
=====================================
Mon Nov 21 12:35:56 CET 2011
/var/opt/hosting/db/profiles/profiles/
Number of files: 22468
Number of files transferred: 22464
Total file size: 114.04G bytes
Total transferred file size: 114.04G bytes
Literal data: 243.47K bytes
Matched data: 114.04G bytes
File list size: 416.00K
File list generation time: 0.001 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 74.45M
Total bytes received: 116.52M
sent 74.45M bytes received 116.52M bytes 185.68K bytes/sec
total size is 114.04G speedup is 597.17
Mon Nov 21 12:53:04 CET 2011
=========================================
And duration for rsync -rc for all tablespaces is 32 minutes 2 seconds :
Duration for only this tablespace is 21 minutes 41 seconds.
========================================
Mon Nov 21 15:07:57 CET 2011
/var/opt/hosting/db/profiles/profiles/
Number of files: 22468
Number of files transferred: 11
Total file size: 114.04G bytes
Total transferred file size: 50.75M bytes
Literal data: 308.81K bytes
Matched data: 50.44M bytes
File list size: 775.13K
File list generation time: 0.001 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 1.12M
Total bytes received: 55.08K
sent 1.12M bytes received 55.08K bytes 899.32 bytes/sec
total size is 114.04G speedup is 97434.05
Mon Nov 21 15:29:38 CET 2011
========================================
So, it looks that option --ignore-times is helpful.
Thanks for your comments.
Jal
I did some benchs as suggested by Cedric.
At first, I rsynced with options rsync -r --ignore-times --stats -h, and then I rsynced with options rsync -rc --stats -h
The total duration of 1st rsync (--ignore-times) for all my tablespaces is 24 min 24 sec.
Please find below details printed by rsync for 1 tablespace (size 107 Go found using du -h) ;
For this tablespace, duration was 17 minutes 20 seconds.
=====================================
Mon Nov 21 12:35:56 CET 2011
/var/opt/hosting/db/profiles/profiles/
Number of files: 22468
Number of files transferred: 22464
Total file size: 114.04G bytes
Total transferred file size: 114.04G bytes
Literal data: 243.47K bytes
Matched data: 114.04G bytes
File list size: 416.00K
File list generation time: 0.001 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 74.45M
Total bytes received: 116.52M
sent 74.45M bytes received 116.52M bytes 185.68K bytes/sec
total size is 114.04G speedup is 597.17
Mon Nov 21 12:53:04 CET 2011
=========================================
And duration for rsync -rc for all tablespaces is 32 minutes 2 seconds :
Duration for only this tablespace is 21 minutes 41 seconds.
========================================
Mon Nov 21 15:07:57 CET 2011
/var/opt/hosting/db/profiles/profiles/
Number of files: 22468
Number of files transferred: 11
Total file size: 114.04G bytes
Total transferred file size: 50.75M bytes
Literal data: 308.81K bytes
Matched data: 50.44M bytes
File list size: 775.13K
File list generation time: 0.001 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 1.12M
Total bytes received: 55.08K
sent 1.12M bytes received 55.08K bytes 899.32 bytes/sec
total size is 114.04G speedup is 97434.05
Mon Nov 21 15:29:38 CET 2011
========================================
So, it looks that option --ignore-times is helpful.
Thanks for your comments.
Jal