Thread: WAL log shipping + Streaming replication PG 9.0 questions

WAL log shipping + Streaming replication PG 9.0 questions

From
Isabella Ghiurea
Date:
Hi All,
I have some questions regarding hot standby  + streaming replication (PG
9.0) : what is   master/primary  server  behavior if  archive_command
fails  due to  archive  directory off line/unavailable?
How about standby server  behavior if restore command fails due to
archive_master directory not available?
Thank you
Isabella

--
-----------------------------------------------------------



transaction time and log analysis

From
Kasia Tuszynska
Date:
Hi Everybody,
Two questions:

1. Is there anything I could do to in postgresql.conf file to speed up the time it takes postgres to commit a
transaction?
2. Is there anything better than pg_fouine to analyze logs?

Thank you,
Sincerely,
Kasia

Re: transaction time and log analysis

From
Tom Lane
Date:
Kasia Tuszynska <ktuszynska@esri.com> writes:
> 1. Is there anything I could do to in postgresql.conf file to speed up the time it takes postgres to commit a
transaction?

If you want the traditional guarantee that "commit" means "your data is
safely down on persistent storage", there's not much you can do except
buy better storage hardware.  A disk controller card with battery-backed
cache can be a relatively inexpensive solution.  Try the pgsql-performance
archives for more information than you need on that topic.

If you're willing to live with weaker guarantees, consider disabling
synchronous_commit.

> 2. Is there anything better than pg_fouine to analyze logs?

No idea on that one.

            regards, tom lane

Re: WAL log shipping + Streaming replication PG 9.0 questions

From
Fujii Masao
Date:
On Tue, Oct 19, 2010 at 2:48 AM, Isabella Ghiurea
<isabella.ghiurea@nrc-cnrc.gc.ca> wrote:
> I have some questions regarding hot standby  + streaming replication (PG
> 9.0) : what is   master/primary  server  behavior if  archive_command fails
>  due to  archive  directory off line/unavailable?

Unarchived WAL files continue to accumulate in pg_xlog directory while
the archive directory is unavailable. Then, if pg_xlog directory is
filled up, the PostgreSQL server causes the PANIC error.

> How about standby server  behavior if restore command fails due to
> archive_master directory not available?

After restore_command fails, the standby tries to read WAL file from
its pg_xlog directory. And, if there is no WAL file to recover in pg_xlog,
the standby tries to connect to the master and read the streamed WAL files
from the master.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Re: WAL log shipping + Streaming replication PG 9.0 questions

From
Isabella Ghiurea
Date:
Fujii Masao wrote:
> On Tue, Oct 19, 2010 at 2:48 AM, Isabella Ghiurea
> <isabella.ghiurea@nrc-cnrc.gc.ca> wrote:
>
>> I have some questions regarding hot standby  + streaming replication (PG
>> 9.0) : what is   master/primary  server  behavior if  archive_command fails
>>  due to  archive  directory off line/unavailable?
>>
>
> Unarchived WAL files continue to accumulate in pg_xlog directory while
> the archive directory is unavailable. Then, if pg_xlog directory is
> filled up, the PostgreSQL server causes the PANIC error.
>
>
>> How about standby server  behavior if restore command fails due to
>> archive_master directory not available?
>>
>
> After restore_command fails, the standby tries to read WAL file from
> its pg_xlog directory. And, if there is no WAL file to recover in pg_xlog,
> the standby tries to connect to the master and read the streamed WAL files
> from the master.
>
> Regards,
>
>
Thank you for all details.
 For the last question I ' m still a bit confused: if master_archive off
line ( no archived WAL's available) what will be state of  standby server ?
Basically we are trying to design a solution where WAL archived
directory is not a "single point of failure"   for whole system ( we are
planing to have WAL archived master directory on a separate machine and
both server primary and standby will have read/write here) if this
servers fails , we  want  to  be able to continue normal user operation
on master while standby server  will have some latency.
Isabella


--
-----------------------------------------------------------
Isabella A. Ghiurea

Isabella.Ghiurea@nrc-cnrc.gc.ca
Canadian Astronomy Data Centre |http://www.nrc-cnrc.gc.ca/eng/services/hia/data-centre.html
National Research Council of Canada, Herzberg Institute of Astrophysics
5071 West Saanich Road, Victoria BC V9E 2E7, Canada
Phone: 250 363-3446 fax: 250 363-0045


Re: WAL log shipping + Streaming replication PG 9.0 questions

From
Fujii Masao
Date:
On Tue, Oct 19, 2010 at 11:46 PM, Isabella Ghiurea
<isabella.ghiurea@nrc-cnrc.gc.ca> wrote:
> For the last question I ' m still a bit confused: if master_archive off line
> ( no archived WAL's available) what will be state of  standby server ?
> Basically we are trying to design a solution where WAL archived directory is
> not a "single point of failure"   for whole system ( we are planing to have
> WAL archived master directory on a separate machine and both server primary
> and standby will have read/write here) if this servers fails , we  want  to
>  be able to continue normal user operation on master while standby server
>  will have some latency.

You have three machines, the first is the master, the second is the standby,
and the last is something like NAS which has the archive area shared by them.
Right?

Even if the archive is unavailable, we can continue normal user operation
like INSERT/UPDATE/DELETE on the master until its pg_xlog has run out of
disk space (As I said before, a failure of archive_command continues to
accumulate unarchived WAL files in pg_xlog). So, you would need to check
periodically whether the archive is available or not by using a monitoring
tool, in order to prevent pg_xlog from running out of disk space.

On the other hand, even if the archive is unavailable, if the replication
connection is alive, the standby can receive the WAL and catch up with the
master. But, if the replication connection is terminated and the standby
cannot connect to the master because of the network outage, the standby can
read no WAL from neither the master nor the archive. In this case, the
standby would fall behind the master.

To prevent only one network outage from terminating the connection between
the master and the standby, and the connection between the archive and the
standby at the same time, we might need to use different networks for each.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Re: WAL log shipping + Streaming replication PG 9.0 questions

From
Robert Treat
Date:
On Tue, Oct 19, 2010 at 10:40 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
On Tue, Oct 19, 2010 at 11:46 PM, Isabella Ghiurea
> For the last question I ' m still a bit confused: if master_archive off line
> ( no archived WAL's available) what will be state of  standby server ?
> Basically we are trying to design a solution where WAL archived directory is
> not a "single point of failure"   for whole system ( we are planing to have
> WAL archived master directory on a separate machine and both server primary
> and standby will have read/write here) if this servers fails , we  want  to
>  be able to continue normal user operation on master while standby server
>  will have some latency.

You have three machines, the first is the master, the second is the standby,
and the last is something like NAS which has the archive area shared by them.
Right?

Even if the archive is unavailable, we can continue normal user operation
like INSERT/UPDATE/DELETE on the master until its pg_xlog has run out of
disk space (As I said before, a failure of archive_command continues to
accumulate unarchived WAL files in pg_xlog). So, you would need to check
periodically whether the archive is available or not by using a monitoring
tool, in order to prevent pg_xlog from running out of disk space.

On the other hand, even if the archive is unavailable, if the replication
connection is alive, the standby can receive the WAL and catch up with the
master. But, if the replication connection is terminated and the standby
cannot connect to the master because of the network outage, the standby can
read no WAL from neither the master nor the archive. In this case, the
standby would fall behind the master.

Just to chime in here, this doesn't necessarily have to "break" your standby; rather it can happily sit around waiting for the next xlog to show up. If you can fix connectivity to the NAS, it will happily start replaying log files and catching up again. 

Robert Treat