Thread: WAL log shipping + Streaming replication PG 9.0 questions
Hi All, I have some questions regarding hot standby + streaming replication (PG 9.0) : what is master/primary server behavior if archive_command fails due to archive directory off line/unavailable? How about standby server behavior if restore command fails due to archive_master directory not available? Thank you Isabella -- -----------------------------------------------------------
Hi Everybody, Two questions: 1. Is there anything I could do to in postgresql.conf file to speed up the time it takes postgres to commit a transaction? 2. Is there anything better than pg_fouine to analyze logs? Thank you, Sincerely, Kasia
Kasia Tuszynska <ktuszynska@esri.com> writes: > 1. Is there anything I could do to in postgresql.conf file to speed up the time it takes postgres to commit a transaction? If you want the traditional guarantee that "commit" means "your data is safely down on persistent storage", there's not much you can do except buy better storage hardware. A disk controller card with battery-backed cache can be a relatively inexpensive solution. Try the pgsql-performance archives for more information than you need on that topic. If you're willing to live with weaker guarantees, consider disabling synchronous_commit. > 2. Is there anything better than pg_fouine to analyze logs? No idea on that one. regards, tom lane
On Tue, Oct 19, 2010 at 2:48 AM, Isabella Ghiurea <isabella.ghiurea@nrc-cnrc.gc.ca> wrote: > I have some questions regarding hot standby + streaming replication (PG > 9.0) : what is master/primary server behavior if archive_command fails > due to archive directory off line/unavailable? Unarchived WAL files continue to accumulate in pg_xlog directory while the archive directory is unavailable. Then, if pg_xlog directory is filled up, the PostgreSQL server causes the PANIC error. > How about standby server behavior if restore command fails due to > archive_master directory not available? After restore_command fails, the standby tries to read WAL file from its pg_xlog directory. And, if there is no WAL file to recover in pg_xlog, the standby tries to connect to the master and read the streamed WAL files from the master. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Fujii Masao wrote: > On Tue, Oct 19, 2010 at 2:48 AM, Isabella Ghiurea > <isabella.ghiurea@nrc-cnrc.gc.ca> wrote: > >> I have some questions regarding hot standby + streaming replication (PG >> 9.0) : what is master/primary server behavior if archive_command fails >> due to archive directory off line/unavailable? >> > > Unarchived WAL files continue to accumulate in pg_xlog directory while > the archive directory is unavailable. Then, if pg_xlog directory is > filled up, the PostgreSQL server causes the PANIC error. > > >> How about standby server behavior if restore command fails due to >> archive_master directory not available? >> > > After restore_command fails, the standby tries to read WAL file from > its pg_xlog directory. And, if there is no WAL file to recover in pg_xlog, > the standby tries to connect to the master and read the streamed WAL files > from the master. > > Regards, > > Thank you for all details. For the last question I ' m still a bit confused: if master_archive off line ( no archived WAL's available) what will be state of standby server ? Basically we are trying to design a solution where WAL archived directory is not a "single point of failure" for whole system ( we are planing to have WAL archived master directory on a separate machine and both server primary and standby will have read/write here) if this servers fails , we want to be able to continue normal user operation on master while standby server will have some latency. Isabella -- ----------------------------------------------------------- Isabella A. Ghiurea Isabella.Ghiurea@nrc-cnrc.gc.ca Canadian Astronomy Data Centre |http://www.nrc-cnrc.gc.ca/eng/services/hia/data-centre.html National Research Council of Canada, Herzberg Institute of Astrophysics 5071 West Saanich Road, Victoria BC V9E 2E7, Canada Phone: 250 363-3446 fax: 250 363-0045
On Tue, Oct 19, 2010 at 11:46 PM, Isabella Ghiurea <isabella.ghiurea@nrc-cnrc.gc.ca> wrote: > For the last question I ' m still a bit confused: if master_archive off line > ( no archived WAL's available) what will be state of standby server ? > Basically we are trying to design a solution where WAL archived directory is > not a "single point of failure" for whole system ( we are planing to have > WAL archived master directory on a separate machine and both server primary > and standby will have read/write here) if this servers fails , we want to > be able to continue normal user operation on master while standby server > will have some latency. You have three machines, the first is the master, the second is the standby, and the last is something like NAS which has the archive area shared by them. Right? Even if the archive is unavailable, we can continue normal user operation like INSERT/UPDATE/DELETE on the master until its pg_xlog has run out of disk space (As I said before, a failure of archive_command continues to accumulate unarchived WAL files in pg_xlog). So, you would need to check periodically whether the archive is available or not by using a monitoring tool, in order to prevent pg_xlog from running out of disk space. On the other hand, even if the archive is unavailable, if the replication connection is alive, the standby can receive the WAL and catch up with the master. But, if the replication connection is terminated and the standby cannot connect to the master because of the network outage, the standby can read no WAL from neither the master nor the archive. In this case, the standby would fall behind the master. To prevent only one network outage from terminating the connection between the master and the standby, and the connection between the archive and the standby at the same time, we might need to use different networks for each. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
On Tue, Oct 19, 2010 at 10:40 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
On Tue, Oct 19, 2010 at 11:46 PM, Isabella Ghiurea<isabella.ghiurea@nrc-cnrc.gc.ca> wrote:> For the last question I ' m still a bit confused: if master_archive off lineYou have three machines, the first is the master, the second is the standby,
> ( no archived WAL's available) what will be state of standby server ?
> Basically we are trying to design a solution where WAL archived directory is
> not a "single point of failure" for whole system ( we are planing to have
> WAL archived master directory on a separate machine and both server primary
> and standby will have read/write here) if this servers fails , we want to
> be able to continue normal user operation on master while standby server
> will have some latency.
and the last is something like NAS which has the archive area shared by them.
Right?
Even if the archive is unavailable, we can continue normal user operation
like INSERT/UPDATE/DELETE on the master until its pg_xlog has run out of
disk space (As I said before, a failure of archive_command continues to
accumulate unarchived WAL files in pg_xlog). So, you would need to check
periodically whether the archive is available or not by using a monitoring
tool, in order to prevent pg_xlog from running out of disk space.
On the other hand, even if the archive is unavailable, if the replication
connection is alive, the standby can receive the WAL and catch up with the
master. But, if the replication connection is terminated and the standby
cannot connect to the master because of the network outage, the standby can
read no WAL from neither the master nor the archive. In this case, the
standby would fall behind the master.
Just to chime in here, this doesn't necessarily have to "break" your standby; rather it can happily sit around waiting for the next xlog to show up. If you can fix connectivity to the NAS, it will happily start replaying log files and catching up again.
Robert Treat
play: http://www.xzilla.net