Thread: warm standby - apply wal archives
Hello all,

I would like your advice on the following matter. If I am not mistaken, in a warm standby setup (PostgreSQL 8.4) the WAL archives are shipped to the failover server. When the time comes, the failover server, which already has a copy of the primary's data directory and all the WAL archives, starts the recovery process by applying those WAL files. When it has finished, it comes up and is ready for connections.

My question is: what happens if there are too many WAL archives? How long could this procedure take? If someone has tested it and has some metrics, I would really appreciate seeing them. Beyond that, is there a way to apply the WAL files periodically, every hour for example, so that the procedure doesn't take too long when the time comes? If I wrote a script that does this, would it work?

Thanks in advance.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/warm-standby-apply-wal-archives-tp4770567p4770567.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.
My bad... I read in the manual that the recovery process is continuous and runs all the time. So the question now is: how much WAL can this procedure handle? For example, can it handle 100-200 GB every day? If it cannot, are there any other suggestions for HA? Thanks in advance.
MirrorX <mirrorx@gmail.com> wrote:
> my bad...
> i read in the manual that the recovery process is constant and runs all the
> time. so the question now is
> how many wals can this procedure handle? for example can it handle 100-200G

Sure. If the master can handle that, it's no problem for the standby (on the same hardware). In my experience, replay costs the standby only a fraction of the work the master does (streaming replication with 9.0).

> every day? if it cannot, any other suggestions for HA ?thx in advance

It depends on your requirements. For instance, Heartbeat with DRBD is another solution.

Andreas
Thanks a lot for your answer. Actually, DRBD is the solution I am trying to avoid, since I think it degrades performance a lot (I have used it in the past). I also have serious doubts about whether the data would be corrupted if the master fails before all blocks have been replicated to the secondary. Has anyone faced this situation? Any comments on that? Thanks in advance.
On September 5, 2011, MirrorX <mirrorx@gmail.com> wrote:
> thx a lot for your answer.
>
> actually DRBD is the solution i am trying to avoid, since i think the
> performance is degrading a lot (i've used it in the past). and also i
> have serious doubts if the data is corrupted in case of the master's
> failure, if not all blocks have been replicated to the secondary. has
> anyone faced this situation? any comments on that? thx in advance
>
DRBD protocol C (the synchronous mode) is very good. If you're running protocol C, when PostgreSQL issues an fsync, the call doesn't return until the secondary node has the data on disk. It's as safe as you're going to get.
The performance limit for DRBD is the write speed of a single network interface. If you're exceeding that, though, you also aren't going to be shipping out WAL segments in real time. I guess also if your nodes aren't close by, the latency could be a speed killer, but that's not really the normal use case.
The nodes communicate over 4 Gbps Ethernet, so I don't think there is an issue there. Probably some kind of DRBD misconfiguration has occurred. I will check on that tomorrow. Thanks a lot :)
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
The network bandwidth between the servers is definitely not an issue. What bothers me is the sheer size of the WAL archives, which reaches up to 200 GB per day, and whether the standby server will be able to replay all these files. The argument is that the master can generate them while also handling various other tasks, and the secondary is identical to the master, so it should be able to keep up. That seems valid, so I will give it a try and let you know the results. In the meantime, if there are any other ideas or suggestions, please let me know. Thanks to all.
The network transfer does not bother me for now. I will first try the whole procedure without compression, so as not to spend CPU time on compressing and decompressing. Over the 4 Gbps Ethernet link, the day's 200 GB can be transferred in a matter of minutes. I will try it and get back with the results. Thanks to all.
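As a rough sanity check of the "matter of minutes" estimate, here is a minimal sketch of the arithmetic. It assumes decimal units (1 GB = 10^9 bytes) and a link that is fully usable at 4 Gbps; the function name is made up for illustration, and real throughput will be lower once protocol overhead and disk speed come into play.

```python
# Back-of-the-envelope transfer time for one day's WAL over the link.
# Assumptions: decimal units and an ideal, fully usable link.

def transfer_seconds(size_gb: float, link_gbps: float) -> float:
    """Seconds needed to move size_gb gigabytes over a link_gbps link."""
    size_gigabits = size_gb * 8
    return size_gigabits / link_gbps


daily_wal_gb = 200  # upper figure mentioned in the thread
link_gbps = 4       # link speed mentioned in the thread

seconds = transfer_seconds(daily_wal_gb, link_gbps)
print(f"{daily_wal_gb} GB over {link_gbps} Gbps: {seconds:.0f} s (~{seconds / 60:.1f} min)")
```

Under these assumptions the whole day's WAL fits in well under ten minutes of link time, which supports skipping compression at first.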
Just an update from my tests: I restored from the backup. The database is about 2.5 TB and the WAL archives were about 300 GB. Recovery of the database completed after 3 hours. Thanks to all for your help.
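To put that result next to the original worry about 200 GB of WAL per day, here is a small sketch comparing the two rates. It uses only the figures reported in this thread and assumes the 3 hours were spent mostly on WAL replay; if other work is included in that time, the real replay rate is higher, so this is a lower bound. It also compares averages only, so bursts on the primary could still cause temporary lag.

```python
# Compare the replay rate seen in the restore test with the daily WAL rate.
# Figures from the thread: ~300 GB of WAL replayed in ~3 hours,
# and up to ~200 GB of WAL generated per day on the primary.

def gb_per_hour(total_gb: float, hours: float) -> float:
    """Average rate in GB per hour."""
    return total_gb / hours


replay_rate = gb_per_hour(300, 3)       # observed during recovery (lower bound)
generation_rate = gb_per_hour(200, 24)  # produced by the primary on a busy day

headroom = replay_rate / generation_rate
print(f"replay: {replay_rate:.1f} GB/h, generation: {generation_rate:.1f} GB/h")
print(f"headroom: about {headroom:.0f}x")
```

On these numbers the standby replays WAL roughly an order of magnitude faster than the primary produces it on average.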
Just another update, since the system is up and running, and one more question :p

The secondary server is able to restore the WAL archives practically immediately after they arrive. I have set up an rsync cron job to fetch the new WAL files every 5 minutes. Transferring the files and restoring them takes about 30 seconds (about 20-30 archives per run). I have also tried a 2-minute interval, and then the procedure takes about 20 seconds (both transfer and restoration), while I didn't notice any impact on the primary server (the procedure is initiated on the secondary server).

What is your opinion on the interval at which the cron job should run? I have read many articles online saying that rsync should not run every minute, but isn't my case different, since it only syncs two folders containing nothing but WAL files rather than whole disks? In addition, the folders on both servers are on partitions separate from the rest of the data.

Thanks in advance for your insight.
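One usual concern with short cron intervals is a new run starting while the previous one is still transferring. A minimal sketch of a wrapper that skips a run in that case is shown below. The paths, the host name `primary`, and the function names are hypothetical and chosen only for illustration; the lock uses `fcntl.flock`, so this assumes a Unix-like standby.

```python
import fcntl
import subprocess
import sys
from typing import List

# Hypothetical locations, for illustration only.
LOCK_FILE = "/tmp/wal_sync.lock"
SOURCE = "primary:/var/lib/pgsql/wal_archive/"
DEST = "/var/lib/pgsql/wal_archive/"


def build_rsync_command(source: str, dest: str) -> List[str]:
    """Archive mode; skip files that already exist on the standby."""
    return ["rsync", "-a", "--ignore-existing", source, dest]


def try_lock(path: str):
    """Return an open, exclusively locked file, or None if another run holds the lock."""
    handle = open(path, "w")
    try:
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None
    return handle


def main() -> int:
    lock = try_lock(LOCK_FILE)
    if lock is None:
        print("previous sync still running, skipping this run", file=sys.stderr)
        return 0
    try:
        # Pull from the primary, as in the setup described above.
        return subprocess.run(build_rsync_command(SOURCE, DEST)).returncode
    finally:
        lock.close()  # closing the file releases the lock
```

To use it from cron on the standby, the script would end with `sys.exit(main())`. With a guard like this, the interval can be shortened without two runs ever overlapping; a run that finds the lock held simply exits and lets the next scheduled run pick up the new files.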