Re: replication using WAL archives - Mailing list pgsql-admin

From Simon Riggs
Subject Re: replication using WAL archives
Msg-id 011a01c4b818$b7370a20$06e887d9@Nightingale
In response to replication using WAL archives  ("Iain" <iain@mst.co.jp>)
Responses Re: replication using WAL archives
List pgsql-admin
> Gaetano Mendola wrote
> Postgres can help this process, as suggested by Tom creating a
> pg_current_wal() or even better having two new GUC parameters:
> archive_current_wal_command and archive_current_wal_delay.

OK, we can modify the archiver to do this as well as the archive-when-full
functionality. I'd already agreed to do something similar for 8.1.

PROPOSAL:
By default, archive_max_delay would be 10 seconds.
By default, archive_current_wal_command is not set.
If archive_current_wal_command is not set, the archiver will archive a file
using archive_command only when the file is full.
If archive_current_wal_command is set, the archiver would archive a file when
whichever of these occurs first:
- the file is full
- the archive_max_delay timeout expires (10 seconds by default, as above)
...as you can see I've renamed archive_current_wal_delay to reflect the fact
that there is an interaction between the current mechanism (only when full)
and this additional mechanism (no longer than X secs between log files).
With that design, if the logs are being created quickly enough, then a
partial log file is never created, only full ones.
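
To make that concrete, here's roughly what I have in mind for
postgresql.conf - these are only the proposed parameter names, nothing that
exists today, and I'm assuming the new command string would get the same
%p/%f substitution that archive_command already has:

  # proposed settings - not existing GUCs
  archive_command             = 'cp %p /mnt/archive/%f'       # full segments only
  archive_current_wal_command = 'cp %p /mnt/standby/wal/%f'   # full and partial segments
  archive_max_delay           = 10      # at most this many seconds between archived files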

When an xlog file is archived because it is full, then it is sent to both
archive_current_wal_command and archive_command (in that order). When the
timeout occurs and we have a partial xlog file, it would only be sent to
archive_current_wal_command. It may also be desirable not to use
archive_command at all, only archive_current_wal_command. That's not
currently possible because archive_command is the switch by which all of the
archive functionality is enabled, so you can't actually turn it off.
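
Just to illustrate (none of this exists yet; the host and paths below are
placeholders), the archive_current_wal_command script could be as simple as:

  #!/bin/sh
  # push_current_wal.sh - hypothetical archive_current_wal_command
  # called as: archive_current_wal_command = 'push_current_wal.sh %p %f'
  # %p = path to the (possibly partial) segment, %f = its file name
  WALPATH="$1"
  WALFILE="$2"
  # overwrite any earlier partial copy of the same segment on the standby
  rsync -a "$WALPATH" standby:/var/lib/pgsql/partial_wal/"$WALFILE" || exit 1
  exit 0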

There is already a timeout feature designed into the archiver for safety...so
I can make that read the GUCs above and act accordingly.

There is an unresolved resilience issue: if the archiver goes down (or
whatever does the partial_wal copy functionality), then it is possible
that users will continue writing to the database and creating xlog records.
It would be up to the user to define how to handle records that had been
committed to the first database in the interim before cutover. It would also
be up to the user to shut down the first node from the second - Shoot the
Other Node in the Head, as it's known. All of that is up to the second node,
and as Tom says, is "the hard part"....I'm not proposing to do anything
about that at this stage, since it is implementation dependent.

I was thinking perhaps to move to having variable size xlog files, since
their contents are now variable - no padded records at EOF. If we did that,
then the archiver could simply issue a "switch logfile" and then the
archiver would cut in anyway to copy away the xlog. Having said that, it is
lots easier just to put a blind timeout in the archiver and copy the file -
though I'm fairly uneasy about the point that we'd be ignoring the fact that
many people are still writing to it. But I propose doing it the easy way....

Thoughts?

= - = - =

Gaetano - skim-reading your script, how do you handle the situation where a
new xlog file has been written within the 10 seconds? In that case the
current file number will have jumped by 2, so when your script looks for the
"Last wal" using head -1 it will find N+2 and the intermediate file will
never be copied. Looks like a problem to me...
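
To spell the race out (I don't have your script in front of me, so the file
names and paths below are invented): a copier that only ever looks at the
newest segment, something like

  # fragile: only ships the newest segment
  LAST=$(ls -t "$PGDATA/pg_xlog" | grep -E '^[0-9A-F]{24}$' | head -1)
  scp "$PGDATA/pg_xlog/$LAST" standby:/partial_wal/

will skip a segment whenever one fills up and the next one is started within
the same polling interval. Shipping everything newer than the last segment
already sent avoids that, since segment names sort in creation order:

  #!/bin/bash
  # sketch only: ship every new segment, not just the newest one
  # LAST_SHIPPED is remembered between runs, e.g. in a state file
  LAST_SHIPPED=$(cat /var/run/last_shipped 2>/dev/null)
  for f in $(ls "$PGDATA/pg_xlog" | grep -E '^[0-9A-F]{24}$' | sort); do
      if [[ "$f" > "$LAST_SHIPPED" ]]; then
          scp "$PGDATA/pg_xlog/$f" standby:/partial_wal/ && LAST_SHIPPED="$f"
      fi
  done
  echo "$LAST_SHIPPED" > /var/run/last_shipped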

> A problem I discovered during the tests is that if you shut down the spare
> node and the restore_command is still waiting for a file, then the
> postmaster will never exit :-(

Hmm....Are you reporting this as a bug for 8.0? It's not on the bug list...

Do we consider that to be desirable or not?
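
For what it's worth, one workaround sketch (paths invented, not tested) is to
give the restore_command an escape hatch, so the startup process isn't left
stuck inside it - a non-zero exit tells recovery the segment isn't available,
so recovery ends rather than waiting forever:

  #!/bin/sh
  # wait_for_wal.sh - hypothetical restore_command wrapper for the spare node
  # restore_command = 'wait_for_wal.sh /partial_wal %f %p'
  ARCHIVEDIR="$1"; WALFILE="$2"; DEST="$3"
  TRIGGER="$ARCHIVEDIR/stop_waiting"
  while true; do
      if [ -f "$ARCHIVEDIR/$WALFILE" ]; then
          cp "$ARCHIVEDIR/$WALFILE" "$DEST"
          exit $?
      fi
      # operator touches the trigger file to stop the wait
      if [ -f "$TRIGGER" ]; then
          exit 1
      fi
      sleep 5
  done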

Best Regards, Simon Riggs


