Thread: [GENERAL] Rsync to a recovering streaming replica?
Hello,
I have a multi-terabyte streaming replica of a busy database. When I set it up, repetitive rsyncs take at least 6 hours each. So, when I start the replica, it begins streaming, but it is many hours behind right from the start. It works for hours and cannot reach a consistent state, so the database never opens for queries. I have plenty of WAL files available in the master's pg_xlog, so the replica never uses archived logs.

A question:
Should I be able to run one more rsync from the master to my replica while it is streaming? The idea is to overcome the throughput limit imposed by a single recovery process on the replica and catch up quicker. I remember doing this many years ago on Pg 8.4, and I have also heard from other people who did it. In all cases, it seemed to work. I'm just not sure there is no high risk of introducing some hidden data corruption, which I may not notice for a while on such a huge database.

Any educated opinions on the subject here?

Thank you
Igor Polishchuk
Sorry, here are the missing details, if it helps:
Postgres 9.6.5 on CentOS 7.2.1511

> On Sep 27, 2017, at 10:56, Igor Polishchuk <ora4dba@gmail.com> wrote:
> [original message quoted above, trimmed]
On Wed, Sep 27, 2017 at 1:59 PM, Igor Polishchuk <ora4dba@gmail.com> wrote:
> Sorry, here are the missing details, if it helps:
> Postgres 9.6.5 on CentOS 7.2.1511
> On Sep 27, 2017, at 10:56, Igor Polishchuk <ora4dba@gmail.com> wrote:
> [original message quoted above, trimmed]
It really comes down to the amount of I/O (network and disk) your system can handle while under load. I've used two methods to do this in the past:

- parsync (parallel rsync) is nice; it does all the hard work of parallelizing rsync for you. It's just a pain to get all the prereqs installed.
- rsync --itemize-changes: use this to get a list of files, manually split them out, and fire up a number of rsyncs. parsync does this for you, but if you can't get it going for any reason, this works.

The real trick: after you do your parallel rsync, make sure that you run one final rsync to sync up any missed items. A sketch of the whole approach follows.
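For illustration, here is a minimal, untested sketch of the manual split-and-parallelize approach. The hostname master.example.com, the data directory path, and the worker count are made-up examples, not values from this thread, and GNU split/awk are assumed:

    #!/usr/bin/env bash
    # Sketch only: host, paths, and worker count are illustrative assumptions.
    # Filenames with spaces are not handled; PGDATA files normally have none.
    # (In practice you'd wrap this in pg_start_backup/pg_stop_backup; see
    # later in the thread.)
    set -euo pipefail

    MASTER=master.example.com
    PGDATA=/var/lib/pgsql/9.6/data
    JOBS=8

    # 1. Dry run with --itemize-changes to find the files that differ; keep
    #    only the pathname column and skip deletion lines and summary output.
    rsync -a --itemize-changes --dry-run "$MASTER:$PGDATA/" "$PGDATA/" \
      | awk 'NF == 2 && $1 !~ /deleting/ {print $2}' > /tmp/changed.lst

    # 2. Split the list into one chunk per worker (GNU split, line-based).
    split -n "l/$JOBS" /tmp/changed.lst /tmp/chunk.

    # 3. Run one rsync per chunk in parallel; each worker gets a disjoint
    #    slice of the file list, so they don't step on each other.
    for chunk in /tmp/chunk.*; do
      rsync -a --files-from="$chunk" "$MASTER:$PGDATA/" "$PGDATA/" &
    done
    wait

    # 4. One final serial pass to sync up anything the workers missed or
    #    anything that changed while they ran.
    rsync -a "$MASTER:$PGDATA/" "$PGDATA/"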
Remember, it's all about I/O. The more parallel threads you use, the harder you'll beat up the disks / network on the master, which could impact production.
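If the parallel copy hits production too hard, rsync's own --bwlimit option can throttle each stream; the number below is an arbitrary example, not a recommendation:

    # Cap each worker at roughly 20 MB/s (--bwlimit takes KB/s).
    rsync -a --bwlimit=20000 --files-from="$chunk" "$MASTER:$PGDATA/" "$PGDATA/"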
Good luck
--Scott
Scott,
Thank you for your insight. I do have some extra disk and network throughput to spare. However, my question is: can I run rsync while streaming replication is running?
A streaming replica is a physical copy of the master, so in principle it should work. My concern is a possible silent introduction of block corruption that would not be fixed by a block copy in the WAL files. I think such corruption should not happen, and I have seen a few instances where running rsync seemed to work.
I'm curious whether somebody is aware of a situation where corruption is likely to happen.
Igor
> On Sep 27, 2017, at 12:48, Scott Mead <scottm@openscg.com> wrote:
> [quoted text trimmed]
On Wed, Sep 27, 2017 at 4:08 PM, Igor Polishchuk <ora4dba@gmail.com> wrote:
> Scott,
> Thank you for your insight. I do have some extra disk and network throughput to spare. However, my question is: can I run rsync while streaming replication is running?
Ahh, I see. Sorry.
You need to stop the slave, put the master into backup mode with pg_start_backup(), and run the parallel rsync over the existing slave's data directory (so the copy is differential).
Then run pg_stop_backup() on the master and start the slave. A sketch of the sequence is below.
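A minimal sketch of that sequence, assuming 9.6's exclusive backup mode, a systemd-managed slave, and made-up host, path, and service names; adjust the excludes and connection details to your setup:

    #!/usr/bin/env bash
    # Sketch only: service name, host, and paths are assumptions.
    set -euo pipefail

    MASTER=master.example.com
    PGDATA=/var/lib/pgsql/9.6/data

    # 1. Stop the slave so recovery isn't running while files change under it.
    sudo systemctl stop postgresql-9.6

    # 2. Put the master into (exclusive) backup mode; the second argument
    #    requests an immediate checkpoint. The label is arbitrary.
    psql -h "$MASTER" -U postgres -c "SELECT pg_start_backup('resync', true);"

    # 3. Differential copy over the existing data directory; parallelize as
    #    shown earlier if needed. Excluded paths are also protected from
    #    --delete on the receiving side.
    rsync -a --delete \
          --exclude=pg_xlog --exclude=postmaster.pid --exclude=recovery.conf \
          "$MASTER:$PGDATA/" "$PGDATA/"

    # 4. Take the master out of backup mode.
    psql -h "$MASTER" -U postgres -c "SELECT pg_stop_backup();"

    # 5. Start the slave; it recovers from the checkpoint recorded in
    #    backup_label and then reconnects for streaming.
    sudo systemctl start postgresql-9.6

The backup_label file created by pg_start_backup() is copied along with the data files, which is what tells the slave where consistent recovery must begin.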
--Scott