Re: adding another node to our pitr config - Mailing list pgsql-general

From Yaroslav Tykhiy
Subject Re: adding another node to our pitr config
Date
Msg-id 3CE45549-C144-4DBB-BEF5-C89B40178521@barnet.com.au
In response to adding another node to our pitr config  (Geoffrey <lists@serioustechnology.com>)
List pgsql-general
On 06/10/2009, at 11:51 PM, Geoffrey wrote:

> We are currently using WAL shipping to have a hot spare of our
> databases.  We want to add another node to this configuration.  The
> question is, what is the best way to go about this?
>
> Currently, our script checks to see if the WAL file already exists
> on the target server, if not, then we scp the file over.  This is a
> local machine, so the scp overhead is not considered to be an issue.
>
> So, the current approach is:
>
> ssh $1 "test ! -f $2/$4" && scp $3 $1:$2/$4
>
> So, should I simply duplicate that line for the second server and
> place it below this one, or should they be dependent upon each
> other?  That is:
>
>
> archive_command = 'archive.sh node1 /esc/master/pitr %p %f node2'
>
>
> ssh $1 "test ! -f $2/$4" && scp $3 $1:$2/$4 &&
> ssh $5 "test ! -f $2/$4" && scp $3 $5:$2/$4
>
> The second node will not be at the same location, thus the network
> reliability is less.
>
> Thanks for any insights.

I've been interested in a similar setup, too, although I haven't
implemented it yet.  I think there are at least three obvious
approaches to consider.  They are all basically solutions or
workarounds for the following issue with multiple spare nodes: if the
WAL copy to spare node 1 succeeds but the copy to spare node 2 fails,
you have to handle the partial success somehow.  Your suggested
archive_command will keep returning failure because the WAL segment
already exists on node 1.

1. A shared WAL spool on a fault-tolerant SAN, mounted via NFS or
similar by all nodes.  Then you can use a trivial `test && cp && mv'
archive_command on the master node and have all the spare nodes fetch
WAL files from the spool.  (The mv is there to make the copy atomic;
a sketch follows the list.)  An additional advantage is that you can
use the same WAL spool for periodic DB backups: you run a file-level
backup once in a while and rely on the spool to accumulate the
matching WALs for you.  A problem with this approach is that a truly
fault-tolerant shared spool can be non-trivial to implement, and the
spool itself becomes a single point of failure.

2. Destructive copy.  Just let your archive_command overwrite existing
WAL segments.  Then a failure with node 2 will result in a retry from
scratch.

3. Delegation of failure handling to archive_command.  Instead of
relying on the pg archiver process to retry archive_command when it
returns failure, run the copy to each spare node in a loop until it
succeeds.  Then you can be sure all the nodes have received the WAL
by the end of the script.  (A sketch appears below.)
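
For approach (1), a minimal sketch of the spool-side archive_command,
assuming the shared spool is mounted on the master at /mnt/walspool
(the path is just a placeholder):

   # /mnt/walspool is a placeholder for wherever the spool is mounted.
   # Copy under a temporary name first, then rename: the mv is atomic
   # within the spool, so the spares never see a half-written segment.
   archive_command = 'test ! -f /mnt/walspool/%f && cp %p /mnt/walspool/%f.tmp && mv /mnt/walspool/%f.tmp /mnt/walspool/%f'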

From a probabilistic point of view, (3) will be notably better than
(2) only if the probability of failure for each node is high: when
failures are rare, a retry from scratch is rare too, so the extra
copies cost little.
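
And for approach (3), a rough sketch of an archive.sh in the spirit of
the one quoted above, with the spare nodes and target directory
hard-coded for brevity; it keeps retrying each node until the segment
is in place, so a flaky link to node2 only delays archiving instead of
failing it:

   #!/bin/sh
   # Called from postgresql.conf as:  archive_command = 'archive.sh %p %f'
   WALPATH=$1
   WALFILE=$2
   DEST=/esc/master/pitr

   for NODE in node1 node2; do
       # Skip the node if it already holds the segment, otherwise scp
       # it over, and retry until one of the two succeeds.  (Approach
       # (2) would instead scp unconditionally, with no loop, and let
       # the archiver retry the whole command on failure.)
       until ssh $NODE "test -f $DEST/$WALFILE" || \
             scp $WALPATH $NODE:$DEST/$WALFILE
       do
           sleep 5
       done
   done
   exit 0

Bear in mind that while this loop is stuck the archiver is stuck with
it, so WAL segments will pile up on the master until the copy finally
goes through.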


